8 Posts
0
Total Queue Length
Hi All,
I have a performance issue on one array, and every LUN I have checked shows a very high Total Queue Length.
Total Hard Errors: 0
Total Soft Errors: 0
Total Queue Length: 709977257
Name LUN 1
Minimum latency reads N/A
Bus 1 Enclosure 0 Disk 0 Queue Length: 2080534382
Bus 1 Enclosure 0 Disk 2 Queue Length: 2072672988
Bus 0 Enclosure 1 Disk 1 Queue Length: 2069807948
Bus 1 Enclosure 0 Disk 1 Queue Length: 976490026
Bus 0 Enclosure 1 Disk 0 Queue Length: 962291915
Bus 0 Enclosure 1 Disk 2 Queue Length: 96414034
Total Hard Errors: 0
Total Soft Errors: 0
Total Queue Length: 1557883738
Name LUN 2
Minimum latency reads N/A
Bus 1 Enclosure 0 Disk 0 Queue Length: 2080535408
Bus 1 Enclosure 0 Disk 2 Queue Length: 2072674181
Bus 0 Enclosure 1 Disk 1 Queue Length: 2069808981
Bus 1 Enclosure 0 Disk 1 Queue Length: 976490761
Bus 0 Enclosure 1 Disk 0 Queue Length: 962292820
Bus 0 Enclosure 1 Disk 2 Queue Length: 964141135
I have checked LUNs 1 through 15 and all of them show these high Queue Length numbers, and all of them are my production LUNs.
What can I do to fix this problem? All my servers are having performance problems, with lots of timeouts and slow response times.
Thanks.
AnkitMehta
1.4K Posts
1
May 2nd, 2012 11:00
Queue Length is nothing but the average number of requests within a polling interval that are waiting to be serviced by the SP, including the one currently in service. A queue length of zero indicates an idle system. If three requests arrive at an idle SP at the same time, only one of them can be served immediately; the other two must wait in the queue, resulting in a queue length of three. (For a better understanding, you may refer to this post.)
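If it helps, here is a tiny Python sketch of that definition (purely illustrative, nothing EMC-specific): the instantaneous queue length counts the in-service request, and the reported value is the average of those counts over the polling interval.

def average_queue_length(samples):
    """samples: outstanding-request counts observed within one polling
    interval, each count including the request currently in service."""
    return sum(samples) / len(samples) if samples else 0

# Three requests arrive at an idle SP at once: one is served immediately,
# two wait, so the instantaneous queue length is 3. As requests complete
# it drops to 2, then 1, then 0 (idle again).
print(average_queue_length([3, 2, 1, 0]))  # -> 1.5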
As you mentioned, you are facing timeouts and slow response on all of your servers. I would suggest opening a case with EMC Technical Support so they can investigate this for you with the help of the .nar files.
Please refer to Primus solution emc161922 for how to gather the necessary information for a CLARiiON or VNX performance analysis.
Marcelino_Torre
8 Posts
0
May 2nd, 2012 11:00
These are the details of one of the LUNs:
Prefetch size (blocks) = 0
Prefetch multiplier = 4
Segment size (blocks) = 0
Segment multiplier = 4
Maximum prefetch (blocks) = 4096
Prefetch Disable Size (blocks) = 4097
Prefetch idle count = 40
Variable length prefetching YES
Prefetched data retained YES
Read cache configured according to specified parameters.
Total Hard Errors: 0
Total Soft Errors: 0
Total Queue Length: 709977257
Name MAGIC-A LUN 1
Minimum latency reads N/A
Read Histogram[0] 0
Read Histogram[1] 0
Read Histogram[2] 0
Read Histogram[3] 0
Read Histogram[4] 1975046387
Read Histogram[5] 0
Read Histogram[6] 0
Read Histogram[7] 0
Read Histogram[8] 4165934
Read Histogram[9] 0
Read Histogram overflows 0
Write Histogram[0] 0
Write Histogram[1] 0
Write Histogram[2] 0
Write Histogram[3] 0
Write Histogram[4] 182282115
Write Histogram[5] 0
Write Histogram[6] 0
Write Histogram[7] 0
Write Histogram[8] 0
Write Histogram[9] 0
Write Histogram overflows 75174
Read Requests: 2736608440
Write Requests: 1078842564
Blocks read: 836063235
Blocks written: 81611896
Read cache hits: 74903985
Read cache misses: N/A
Prefetched blocks: 2983715440
Unused prefetched blocks: 1829509312
Write cache hits: 56215062
Forced flushes: 797809
Read Hit Ratio: N/A
Write Hit Ratio: N/A
RAID Type: RAID1/0
RAIDGroup ID: 1
State: Bound
Stripe Crossing: 16114110
Element Size: 128
Current owner: SP A
Offset: 0
Auto-trespass: DISABLED
Auto-assign: DISABLED
Write cache: ENABLED
Read cache: ENABLED
Idle Threshold: 0
Idle Delay Time: 20
Write Aside Size: 2048
Default Owner: SP A
Rebuild Priority: High
Verify Priority: Medium
Prct Reads Forced Flushed: 0
Prct Writes Forced Flushed: 0
Prct Rebuilt: 100
Prct Bound: 100
LUN Capacity(Megabytes): 79872
LUN Capacity(Blocks): 163577856
UID: 60:06:01:60:14:B0:2A:00:DE:D8:DB:41:2B:4B:E0:11
Bus 1 Enclosure 0 Disk 0 Queue Length: 2080534382
Bus 1 Enclosure 0 Disk 2 Queue Length: 2072672988
Bus 0 Enclosure 1 Disk 1 Queue Length: 2069807948
Bus 1 Enclosure 0 Disk 1 Queue Length: 976490026
Bus 0 Enclosure 1 Disk 0 Queue Length: 962291915
Bus 0 Enclosure 1 Disk 2 Queue Length: 964140340
Bus 1 Enclosure 0 Disk 0 Hard Read Errors: 0
Bus 1 Enclosure 0 Disk 2 Hard Read Errors: 0
Bus 0 Enclosure 1 Disk 1 Hard Read Errors: 0
Bus 1 Enclosure 0 Disk 1 Hard Read Errors: 0
Bus 0 Enclosure 1 Disk 0 Hard Read Errors: 0
Bus 0 Enclosure 1 Disk 2 Hard Read Errors: 0
Bus 1 Enclosure 0 Disk 0 Hard Write Errors: 0
Bus 1 Enclosure 0 Disk 2 Hard Write Errors: 0
Bus 0 Enclosure 1 Disk 1 Hard Write Errors: 0
Bus 1 Enclosure 0 Disk 1 Hard Write Errors: 0
Bus 0 Enclosure 1 Disk 0 Hard Write Errors: 0
Bus 0 Enclosure 1 Disk 2 Hard Write Errors: 0
Bus 1 Enclosure 0 Disk 0 Soft Read Errors: 0
Bus 1 Enclosure 0 Disk 2 Soft Read Errors: 0
Bus 0 Enclosure 1 Disk 1 Soft Read Errors: 0
Bus 1 Enclosure 0 Disk 1 Soft Read Errors: 0
Bus 0 Enclosure 1 Disk 0 Soft Read Errors: 0
Bus 0 Enclosure 1 Disk 2 Soft Read Errors: 0
Bus 1 Enclosure 0 Disk 0 Soft Write Errors: 0
Bus 1 Enclosure 0 Disk 2 Soft Write Errors: 0
Bus 0 Enclosure 1 Disk 1 Soft Write Errors: 0
Bus 1 Enclosure 0 Disk 1 Soft Write Errors: 0
Bus 0 Enclosure 1 Disk 0 Soft Write Errors: 0
Bus 0 Enclosure 1 Disk 2 Soft Write Errors: 0
Bus 1 Enclosure 0 Disk 0 Enabled
Reads: 1420165558
Writes: 91634085
Blocks Read: 3605433834
Blocks Written: 2094829952
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 87.15
Prct Busy: 12.84
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Bus 1 Enclosure 0 Disk 2 Enabled
Reads: 1421881217
Writes: 90688541
Blocks Read: 3629793924
Blocks Written: 2067393384
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 87.26
Prct Busy: 12.73
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Bus 0 Enclosure 1 Disk 1 Enabled
Reads: 1416365698
Writes: 91127379
Blocks Read: 3547318867
Blocks Written: 2074187645
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 87.11
Prct Busy: 12.88
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Bus 1 Enclosure 0 Disk 1 Enabled
Reads: 536839707
Writes: 91634085
Blocks Read: 1372607059
Blocks Written: 2094829952
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 96.45
Prct Busy: 3.54
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Bus 0 Enclosure 1 Disk 0 Enabled
Reads: 530199413
Writes: 90688541
Blocks Read: 1251515530
Blocks Written: 2067393384
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 96.44
Prct Busy: 3.55
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Bus 0 Enclosure 1 Disk 2 Enabled
Reads: 534006889
Writes: 91127379
Blocks Read: 1326605071
Blocks Written: 2074187645
Queue Max: N/A
Queue Avg: N/A
Avg Service Time: N/A
Prct Idle: 96.46
Prct Busy: 3.53
Remapped Sectors: N/A
Read Retries: N/A
Write Retries: N/A
Is Private: NO
Snapshots List: Not Available
MirrorView Name if any: Not Available
Anirudh_Banerje
59 Posts
0
May 3rd, 2012 02:00
Hi
I suggest you raise an SR with EMC Support and provide the NAR/NAZ files so that the performance team can be involved.
They are the best people to explain why the queue lengths are increasing.
You may open a chat with the respective service line from Powerlink for a speedier process.
Thanks
zhouzengchao
2 Intern
1.4K Posts
0
May 6th, 2012 04:00
According to the getlun info, all the I/Os are small 4 KB I/Os, maybe from a DB or mail system. The disk Queue Length may be an accumulated counter rather than the real instantaneous value, although I am not sure about this. The NAR/NAZ files need to be checked by the EMC performance team to determine the real cause.
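If the disk Queue Length really is a cumulative counter, the raw value is meaningless on its own; only the growth between two samples over a known interval tells you anything. A minimal Python sketch of that idea (the parsing names are my own, and the assumption that the counter is cumulative is exactly what the performance team would need to confirm):

import re

# Matches lines like "Bus 1 Enclosure 0 Disk 0 Queue Length: 2080534382"
DISK_QL = re.compile(r"Bus (\d+) Enclosure (\d+) Disk (\d+) Queue Length:\s*(\d+)")

def parse_disk_queue_lengths(getlun_output):
    """Map (bus, enclosure, disk) -> the reported Queue Length counter."""
    return {m.group(1, 2, 3): int(m.group(4))
            for m in DISK_QL.finditer(getlun_output)}

def per_second_delta(before, after, seconds):
    """Growth of each counter per second across the sampling interval."""
    return {k: (after[k] - before[k]) / seconds for k in before if k in after}

# Take one getlun snapshot, wait a known interval, take another:
# rates = per_second_delta(parse_disk_queue_lengths(out1),
#                          parse_disk_queue_lengths(out2), 60.0)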
tkjoffs
159 Posts
0
May 6th, 2012 20:00
Can you enable advanced logging, look at the SP forced flushing, and share the data? If the issue is hitting all of your systems, I doubt you are facing a single-LUN issue (unless one LUN is eating your entire write cache); more likely you are hitting a performance issue with the SPs. If you are seeing extreme forced flushing or are going over the watermarks, you need to determine where the highest I/O is coming from and then consider adding more spindles to absorb the I/O load for that LUN/RG.
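In the meantime, the getlun counters already posted allow a rough back-of-the-envelope check on forced flushing (a hedged Python calculation using the numbers from MAGIC-A LUN 1 above, not an EMC tool):

# Counters copied from the getlun output earlier in this thread.
write_requests = 1_078_842_564
forced_flushes = 797_809

# Fraction of writes that forced a cache flush. A persistently high ratio
# suggests the write cache is saturating and the spindles cannot keep up.
ratio = forced_flushes / write_requests
print(f"forced flushes per write: {ratio * 100:.4f}%")  # roughly 0.07%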
kelleg
4.5K Posts
0
May 8th, 2012 13:00
There are some known issues with Analyzer that can affect the queue length values for disks and other objects (LUNs, SPs, metaLUNs). Check the version of FLARE that you are running on the array against the latest release, and upgrade if you are behind.
You can also try stopping Analyzer (data logging), restarting the Management Server (on both SPA and SPB), and then starting Analyzer (data logging) again to see if this helps.
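If you drive the array from a management host, that stop/restart sequence can be scripted. A minimal sketch, assuming Navisphere Secure CLI is installed and that your CLI version supports the analyzer -stop/-start verbs (check the CLI reference for your release; the SP address below is a placeholder):

import subprocess

SP = "spa.example.local"  # placeholder; use your SP's address

def naviseccli(*args):
    """Run a Navisphere Secure CLI command against the SP."""
    cmd = ["naviseccli", "-h", SP, *args]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

naviseccli("analyzer", "-stop")   # stop Analyzer data logging
# Restart the Management Server on both SPs (typically done from the SP's
# setup page, http://<SP_IP>/setup), then resume logging:
naviseccli("analyzer", "-start")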
glen
P.S. When asking questions, it's helpful to provide the details about the array (FLARE version, array type, etc.).