bodnarg
3 Zinc

WLA writes per sec vs write hits per sec

We are trying to troubleshoot a problem with a mainframe pool using SATA drives and are pretty convinced the issue is lack of cache and physical response time. What is odd is that we do not see the write pending limits being reached on the devices in question, but when comparing the writes per sec and write hits per sec statistics we can see a pretty big gap (say 10% of the total). The job in question is 100% write; there are no reads going to the volume during the problematic time frame.

We do see the write pending count on the device approach about 80% of the write pending threshold and level off before dropping. We've confirmed that the physical drives seem to be the issue by moving the problematic drives away from the SATA volumes to FC volumes: we still see the same volume of IO, with faster completion times.

This seems to show pretty conclusively that we are hitting the physical drives for these writes. I am trying to understand how that can be the case when the write pending count in WLA never seems to hit the threshold. Am I missing something?

This is a DMX-4 running 74  microcode.

12 Replies
Quincy561
4 Beryllium

Re: WLA writes per sec vs write hits per sec

Write "misses" can be logged for other reasons than WP limits being reached.

The best way to determine whether volume limits are being reached is to have someone dial into the box and look at the A1,BA page to see whether the task 8 counter is increasing, or whether a delay is being issued on the device just before the limit is reached (A7,DWMP).

Also I assume you mean 5773 code, not 74 code.

bodnarg
3 Zinc

Re: WLA writes per sec vs write hits per sec

Yes, you are correct. I meant to say we are on the recent version of 73 microcode. We do have EMC engaged, so I'll forward what you suggested.

So is it safe to assume that writes minus write hits = writes that are waiting on the physical drive instead of being serviced by cache? Is there something else, via WLA or otherwise, that we can check to confirm a delay?

Booyah2
3 Argentum

Re: WLA writes per sec vs write hits per sec

% writes: 100 * (writes per sec / total ios per sec). Percentage of total write I/O operations performed by all of the Symmetrix devices.

deferred writes per sec: A deferred write is a write hit. A deferred write occurs when the I/O write operations are staged in cache and will be written to disk at a later time.

delayed dfw per sec: A delayed deferred fast write (DFW) is a write-miss. A delayed DFW occurs when the I/O write operations are delayed because the system or device write-pending limit was reached and the cache had to destage slots to the disks before the writes could be written to cache.
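
For what it's worth, the % writes definition above is just a ratio, so it is easy to recompute from exported counters. This is only a minimal sketch: the function name and the sample values are made up for illustration, they are not WLA fields or real data.

```python
def percent_writes(writes_per_sec: float, total_ios_per_sec: float) -> float:
    """% writes = 100 * (writes per sec / total ios per sec)."""
    if total_ios_per_sec <= 0:
        # No I/O in the interval, so the ratio is undefined; report 0.
        return 0.0
    return 100.0 * writes_per_sec / total_ios_per_sec


if __name__ == "__main__":
    # Hypothetical interval for a write-only job: every I/O is a write.
    print(percent_writes(4000, 4000))
```

A write-only job should come out at 100 here, which is a quick sanity check that the counters being compared cover the same devices and the same interval.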

bodnarg
3 Zinc

Re: WLA writes per sec vs write hits per sec

The odd thing is that we don't see any deferred writes or delayed writes, but we are seeing write hits NOT aligned with writes.

I would expect if cache is handling all of your writes that write hits per sec = writes per sec which in our case it does not.

I would expect when they don't match that you would show deferred writes and/or pending tracks.

JasonBailey
3 Argentum

Re: WLA writes per sec vs write hits per sec

yeah I would expect your logic to be correct

however, experience has taught me these symm metrics aren't so straightforward, and the numbers are counted somewhere else

some questions:

1) are you seeing write misses against these devs?

2) on the FA/EF processor are you seeing "device write pending events" > 0?

3) on the FA/EF processor are you seeing "system write pending events" > 0?

can you please email me the SR number, my email is in my profile info, thanks

Quincy561
4 Beryllium

Re: WLA writes per sec vs write hits per sec

Sometimes counters are based on requests, which are accesses to cache slots.  So if an IO spans a slot, it can be two requests.

Write misses can be attributed to other reasons than WP limits being reached.  A slot being locked is one example.

If you are seeing WP counts very close to the limit, then there is a good chance the IOs could be delayed because of this.  We can start issuing delays on the front end before the WP limit is reached.

The counter on the front end for WP events is a good place to look as mentioned.
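
To illustrate the request-counting point, here is a rough sketch of how one IO can turn into two slot requests when it crosses a slot boundary. The 64 KB slot size and the function name are assumptions for illustration only; the real slot (track) size depends on the emulation and should be checked for your box.

```python
ASSUMED_SLOT_BYTES = 64 * 1024  # assumption, not a confirmed value for this array


def slot_requests(offset_bytes: int, length_bytes: int,
                  slot_bytes: int = ASSUMED_SLOT_BYTES) -> int:
    """Count how many cache slots a single IO touches."""
    if length_bytes <= 0:
        return 0
    first_slot = offset_bytes // slot_bytes
    last_slot = (offset_bytes + length_bytes - 1) // slot_bytes
    return last_slot - first_slot + 1


if __name__ == "__main__":
    # An 8 KB write at the start of a slot stays inside one slot.
    print(slot_requests(0, 8 * 1024))
    # The same 8 KB write starting 60 KB into a slot spills into the next one.
    print(slot_requests(60 * 1024, 8 * 1024))
```

So a counter that tallies requests and a counter that tallies IOs will not necessarily line up, even when nothing is being delayed.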

bodnarg
3 Zinc

Re: WLA writes per sec vs write hits per sec

No to questions 2 & 3 - there are no obvious signs of WP events at the adapter or system level. The only oddness we see relating to write pending is some of the busy devices getting close to the limit and plateauing, but never hitting the write limit threshold.

Yes on write misses. An example of the discrepancy - during the period in question (about 30 minutes from 9PM on) we have at the System metric level about 4000 writes per sec with less than 1000 write hits per second.

What is odd is that we don't necessarily see write misses, but we see write pendings get to about 95% of the write pending threshold.
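
Putting rough numbers on it, this is the arithmetic we are doing by hand. It is only a sketch: the function names are made up, the 4000 and 1000 figures are the approximate ones quoted above, and the write pending values are placeholders chosen to land at 95%.

```python
def write_gap(writes_per_sec: float, write_hits_per_sec: float):
    """Return (gap per sec, gap as a fraction of all writes)."""
    gap = writes_per_sec - write_hits_per_sec
    fraction = gap / writes_per_sec if writes_per_sec > 0 else 0.0
    return gap, fraction


def wp_utilization(pending_tracks: float, wp_limit_tracks: float) -> float:
    """Write pending count as a fraction of the write pending limit."""
    if wp_limit_tracks <= 0:
        return 0.0
    return pending_tracks / wp_limit_tracks


if __name__ == "__main__":
    # About 4000 writes/sec against (at most) 1000 write hits/sec.
    print(write_gap(4000, 1000))
    # Placeholder track counts, not measured values.
    print(wp_utilization(9500, 10000))
```

Since the hits are actually below 1000, the real gap is at least this large. As noted earlier in the thread, the gap by itself does not prove the writes are waiting on disk, because misses can be logged for other reasons.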

bodnarg
3 Zinc

Re: WLA writes per sec vs write hits per sec

We are waiting on a final answer from EMC engineering, but it looks like our DA paths were hitting their saturation point. For some reason WLA was unable to show that with the data it can parse, though with Symmerge the performance guru was able to show these ports being saturated.

Will update once we receive an answer as to why this is the case.

Quincy561
4 Beryllium

Re: WLA writes per sec vs write hits per sec

It can take a lot of SATA drives to saturate a DA.  Can you upload your BTP file to the FTP area so I could take a look at it?  And the bin file as well if you have it.
