evilensky · 1 Rookie · 74 Posts · November 17th, 2011 09:00
Hello, in general terms, a high queue count means that I/O requests are not being serviced as fast as they are being issued. Is the latency for this database higher than for the other one, and is it behaving slowly in any way from a service-level perspective?
At the app/OS/DB level, there are SQL and OS performance counters you can examine using the excellent Performance Analysis of Logs tool (http://pal.codeplex.com/) to see whether any application or database tuning might help.
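If you would rather not run the full PAL toolkit, the summary statistics it reports (min/avg/max/std dev) can be computed from a perfmon counter export with a few lines of Python. This is a minimal sketch, assuming a typeperf/relog-style CSV layout; `SAMPLE_CSV` and `summarize` are illustrative names, not part of PAL:

```python
import csv
import io
import statistics

# Sample rows in the shape typeperf/relog emit: a timestamp column
# followed by one quoted column per counter.
SAMPLE_CSV = '''"(PDH-CSV 4.0)","\\\\SERVER\\PhysicalDisk(_Total)\\Current Disk Queue Length"
"11/28/2011 06:00:00","0"
"11/28/2011 06:00:15","2"
"11/28/2011 06:00:30","241"
"11/28/2011 06:00:45","1"
'''

def summarize(csv_text, column=1):
    """Return (min, avg, max, stdev) for one counter column."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)                      # skip the counter-name header row
    values = [float(row[column]) for row in reader if row[column].strip()]
    return (min(values),
            statistics.mean(values),
            max(values),
            statistics.pstdev(values))

lo, avg, hi, sd = summarize(SAMPLE_CSV)
print(f"Min: {lo:.0f}  Avg: {avg:.1f}  Max: {hi:.0f}  Std Dev: {sd:.1f}")
```

Running it against a real 8-hour export would reproduce the Min/Avg/Max/Std Dev figures PAL reports for a counter.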
dajonx · 2 Intern · 294 Posts · November 28th, 2011 06:00
Hi,
I have used PAL and these are the results (8 hours of performance metrics gathered):
PAL Results For Current Disk Queue Length:
Condition (Experimental): 32 or greater current disk I/Os queued. If using an HBA, then consider adjusting the queue depth.
Min: 0
Avg: 1
Max: 241
Hourly Trend: 0
Std Dev: 11
SANHQ Current:
Avg I/O Rate: 3.1 MB/sec
Avg Latency: 12.2 ms
Avg IOPS: 316.4
Avg I/O Size: 10.0 KB
Avg Queue Depth: 30.0
Percent Reads: < 0.1%
Percent Writes: 99.9%
Any ideas from these results?
Thanks!
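As a quick sanity check on the SAN HQ figures quoted above, the average I/O rate should roughly equal IOPS × average I/O size. A small sketch using the posted numbers (the variable names are illustrative, not SAN HQ counters):

```python
# Figures quoted from SAN HQ above.
avg_iops = 316.4
avg_io_size_kb = 10.0

# Throughput implied by the other two counters, in MB/sec.
implied_rate_mb = avg_iops * avg_io_size_kb / 1024
print(f"Implied I/O rate: {implied_rate_mb:.1f} MB/sec")  # → 3.1 MB/sec
```

That matches the reported 3.1 MB/sec, so the counters are internally consistent; the question is why the queue depth is 30 at only ~316 IOPS.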
JOHNADCO · 2 Intern · 847 Posts · November 29th, 2011 08:00
In my opinion?
This would be somewhat typical for a busy SQL DB set to commit changes immediately. Unless your log drive(s) are a lot quicker, there would be no real reason to change it on the DB side. Generally this is exposing some weakness on your physical storage side. If the designer/implementer gave the log drives some extra headroom over the DB storage, he was assuming the DB would not be set for immediate commit. A standard SQL configuration would put the logs on fast, small, very responsive storage, while the DB storage can be larger capacity and slower and still maintain fast performance for the users.
Just my $.02 on it anyways.
dajonx · 2 Intern · 294 Posts · November 29th, 2011 08:00
Thank you for responding.
The DB and Log volumes are both on RAID 10 (PS6000VX). I have DB partitioning and have placed archived data on a RAID 50 volume (PS6000E). Both the Log and archived-data volumes show a Disk Queue Length of 1 or less.
So you think it's an EqualLogic issue?
Joe S586 · 7 Technologist · 729 Posts · November 29th, 2011 09:00
In SAN HQ, the average queue depth is the number of in-flight I/Os at the controller that still need to be processed.
You didn't mention the number of arrays in the group, how many members the volume is spread across, the firmware version, NIC/HBA details, multipath setup, or switch configuration. As you can see, there are several variables that could play into this.
This could also indicate an over-subscribed array.
As evilensky indicated, what is the latency, and is the volume behaving slowly in any way from a service perspective?
Without a full set of diags from the group, it is hard to see exactly what is going on with the volume. You may need to open a support case so that we can take a closer look at your setup and the volume's performance.
Joe
JOHNADCO · 2 Intern · 847 Posts · November 29th, 2011 14:00
I'd bet your array is simply busy, as indicated by Dell-Joe S above; at least the disks that are serving this DB.
I am looking at our most demanding DB server right now. The stated average is quite low, but I often see peaks of 60 and higher.
This is for average queue length. Is that the same as queue depth?
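They measure related but not identical things. By Little's Law, the average number of I/Os in flight should be IOPS × average latency. A sketch applying that to the SAN HQ numbers posted earlier (the interpretation is an assumption drawn from this thread, not a vendor formula):

```python
# SAN HQ figures quoted earlier in the thread.
avg_iops = 316.4
avg_latency_s = 12.2 / 1000          # 12.2 ms
reported_queue_depth = 30.0

# Little's Law: mean concurrency = arrival rate * mean time in system.
implied_outstanding = avg_iops * avg_latency_s
print(f"I/Os implied by latency alone: {implied_outstanding:.1f}")   # → 3.9
print(f"Queue depth reported by SAN HQ: {reported_queue_depth:.1f}")
```

If the reported latency covered the full time an I/O spends in the system, the two numbers should agree; the gap between ~3.9 and 30 suggests the counters are sampled over different windows or at different points in the stack, which is another reason to have support review the diags.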