
January 18th, 2012 11:00

Write pending issue due to SQL backup jobs

Hi All,

One of our DMX arrays is experiencing a write pending (WP) issue; it regularly reaches 90+%. After analysing, we identified a few devices that are hitting their maximum device WP limit, which in turn results in a high overall system WP.

The device WP increases at a particular time for each device. These devices are presented to SQL servers, and SQL backups (from one drive to another within the same server) start at that time.

There are multiple SQL servers, and they all perform SQL backups that increase the system WP. I am looking for a solution to this issue.

One solution I identified is to reschedule the backup timing; currently the backups start at the same time on most of the servers.

Is there any other suggestion to reduce the WP issue?
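To make the rescheduling idea concrete, here is a rough sketch of how the start times could be staggered across the servers (the host names and the backup window below are made-up placeholders, not our real environment):

```python
# Rough sketch of staggering the SQL backup start times so the jobs do
# not all hit the array at once. Host names and window are hypothetical.
from datetime import datetime, timedelta

servers = ["sql01", "sql02", "sql03", "sql04"]  # placeholder host names
window_start = datetime(2012, 1, 18, 22, 0)     # assumed 22:00 window open
window = timedelta(hours=4)                     # assumed 4-hour window

step = window / len(servers)
for i, host in enumerate(servers):
    print(f"{host}: start backup at {window_start + i * step:%H:%M}")
```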

1.3K Posts

January 18th, 2012 11:00

WP issues are pretty simple.  Either slow the writes into the system, or increase the speed of the destage.  Adding cache generally does not help much.

You can limit the writes into the system by reducing the paths into the system, or the queue depth.

You can increase the destage rate by increasing the number of disks or DAs, and/or by changing the RAID protection.
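To put rough numbers on that: if the hosts write faster than the destage can drain, the surplus piles up in cache until the WP limit is reached. A back-of-the-envelope sketch (every figure is made up for illustration):

```python
# Simple model of the WP buffer: the surplus between host write rate
# and destage rate accumulates in cache until the WP ceiling is hit.
host_write_mb_s = 400.0   # illustrative aggregate host write rate
destage_mb_s = 250.0      # illustrative sustained destage rate
wp_limit_mb = 48_000.0    # illustrative WP ceiling

surplus = host_write_mb_s - destage_mb_s
if surplus <= 0:
    print("Destage keeps up; WP stays low.")
else:
    minutes = wp_limit_mb / surplus / 60
    print(f"WP limit reached after about {minutes:.1f} minutes.")
```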

20 Posts

January 23rd, 2012 12:00

That's the main reason I refuse to let my DBA team use the vMax as a backup device. Another solution, and in my opinion a better one, is to purchase a small disk array or a tray of disks for this purpose.

On a side note, one would think a storage company would add DBA and vMax to the site dictionary.

January 24th, 2012 09:00

With regard to your search, how did you perform it and what browser are you using? You should have been able to do a search on DBA and VMAX. However, there have been issues with Internet Explorer. Using Mozilla Firefox may solve the problem.

20 Posts

January 24th, 2012 13:00

They're not in the site spell checker.

148 Posts

January 25th, 2012 05:00

Can you explain a little about the point below?

"You can increase the destage rate by increasing the number of disks or DAs and/or the raid protection"

1.3K Posts

January 25th, 2012 08:00

Write cache is simply a buffer between the host and the disks.  If the disks can't keep up with the host write rate, the buffer will fill and the host will be slowed to the rate of the destage on the disks.

If the disks can destage faster than the host is writing, then the host will always be able to write to cache with no delay because the buffer never fills. Fewer drives destage slower than more drives. RAID1 can destage faster than RAID5, which can destage faster than RAID6. Put enough drives behind a DA CPU and the DA CPU can become the limiting factor for the destage rate, so in that case more DAs can increase the destage rate.
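Rough arithmetic to make that ordering concrete, using the standard write-penalty counts for random small writes (the per-disk IOPS figure is only an example):

```python
# Each host write turns into extra back-end I/Os depending on the
# protection (the write penalty), so the same spindles sustain fewer
# destaged random writes under heavier protection.
disk_iops = 150  # illustrative small-write IOPS per spindle

# Back-end I/Os per random small host write:
# RAID1 = 2 (two mirror writes); RAID5 = 4 (read data, read parity,
# write data, write parity); RAID6 = 6 (two parity members).
penalty = {"RAID1": 2, "RAID5": 4, "RAID6": 6}

for drives in (8, 16):
    for level, p in penalty.items():
        writes_per_s = drives * disk_iops / p
        print(f"{drives:>2} drives, {level}: ~{writes_per_s:.0f} destaged writes/s")
```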

Need more explanation?

148 Posts

January 27th, 2012 05:00

Currently we are using RAID 5 for our backup drive (the one to which we dump the SQL backups). So if we configure it as RAID 1, we can increase the destage rate, which in turn will reduce the WP, right?

1.3K Posts

January 27th, 2012 05:00

With thick devices you need to make sure your meta is the same width as the disks you are writing to. For example, if you had 16 total drives with 3+1 protection, your meta volume should have 4 members so that you have one member on each RAID group. If it wraps around, it will increase the seek time. If it doesn't touch all the drives, you won't get the advantage of all the disks.
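A quick sketch of that width rule, using the 16-drive 3+1 example from above (the helper function itself is purely illustrative):

```python
# The meta should have one member per RAID group so every group is
# touched exactly once.
def meta_members(total_drives: int, data_disks: int, parity_disks: int) -> int:
    group_size = data_disks + parity_disks
    assert total_drives % group_size == 0, "drives must form whole RAID groups"
    return total_drives // group_size

print(meta_members(16, 3, 1))  # 16 drives at 3+1 -> 4-way meta
```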

1.3K Posts

January 27th, 2012 05:00

You can also consider DCP (Dynamic Cache Partitioning) to fence these writes so they don't impact the whole system.

148 Posts

January 27th, 2012 05:00

This is a thick device and I am using a meta volume.

I am working with my SQL team to rearrange the backup timings to reduce the writes to storage at any given time.

1.3K Posts

January 27th, 2012 05:00

RAID1 would probably help, but if the workload is 100% sequential on the disks, RAID5 is more efficient. This is because of optimized writes that can be performed when the whole RAID stripe is in cache: we don't need to perform any reads to calculate parity.

The other option is to spread the workload across more drives. I don't think you said whether this is VP or traditional thick. If thick, meta volumes can help spread the load over multiple RAID groups.
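A small sketch of the sequential-versus-random difference, assuming a 3+1 group as in the earlier example:

```python
# Why 100% sequential favours RAID5: a full-stripe destage writes the
# data members plus parity with no reads, while a random small write
# pays the read-modify-write penalty.
data_disks = 3  # assumed 3+1 RAID group

# Full stripe: write 3 data members + 1 parity = 4 back-end I/Os for
# 3 host writes.
full_stripe = (data_disks + 1) / data_disks  # ~1.33 I/Os per host write

# Random small write: read data + read parity + write data + write parity.
random_small = 4

print(f"sequential full-stripe: ~{full_stripe:.2f} back-end I/Os per host write")
print(f"random small write:      {random_small} back-end I/Os per host write")
```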
