December 23rd, 2011 11:00

RAID1 Glacially Slow on Aurora Desktop

I'm running an Aurora Desktop with an SSD boot volume for the Win7-x64 OS, and a RAID1 volume for data using two identical drives purchased from Dell/Alienware with the machine, and configured at the factory for RAID.  Recently, performance of the machine has positively tanked.  I've now tracked this issue back to VERY slow access times and high latency on my RAID1 array. 

When I run the Microsoft SQLIO utility, I get disk write throughput numbers that are abysmal:

C:\Program Files (x86)\SQLIO>sqlio -kW -s60 -frandom -o256 -b4 -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 3132832 counts per second
parameter file used: param.txt
        file d:\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads writing for 60 secs to file d:\testfile.dat
        using 4KB random IOs
        enabling multiple I/Os per thread with 256 outstanding
size of file d:\testfile.dat needs to be: 104857600 bytes
current file size:      0 bytes
need to expand by:      104857600 bytes
expanding d:\testfile.dat ... done.
using specified size: 100 MB for file: d:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec:   254.93
MBs/sec:     0.99
latency metrics:
Min_Latency(ms): 2
Avg_Latency(ms): 1974
Max_Latency(ms): 5846
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 99
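
As a quick sanity check on the figures above, the MBs/sec line should simply be IOs/sec multiplied by the 4KB block size. The sketch below assumes 1 MB = 1024 KB and that every I/O is exactly 4KB; the function name is made up for illustration and is not part of SQLIO.

```python
# Cross-check SQLIO's MBs/sec figure against its IOs/sec figure.
# Assumption: 1 MB = 1024 KB, and every I/O is exactly block_kb in size.

def throughput_mb_per_s(iops: float, block_kb: float) -> float:
    """Convert an I/O rate into MB/s for a fixed block size."""
    return iops * block_kb / 1024.0

# Values copied from the write run above: 254.93 IOs/sec at 4KB per I/O.
write_mb = throughput_mb_per_s(254.93, 4)
print(f"{write_mb:.3f} MB/s")
```

The result comes out just under 1 MB/s, in line with the 0.99 that SQLIO reports, so the two throughput lines are consistent with each other and the low MBs/sec figure follows directly from the low IOs/sec.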

Even the read test is horrible:

C:\Program Files (x86)\SQLIO>sqlio -kR -s60 -frandom -o256 -b4 -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 3132832 counts per second
parameter file used: param.txt
        file d:\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 60 secs from file d:\testfile.dat
        using 4KB random IOs
        enabling multiple I/Os per thread with 256 outstanding
using specified size: 100 MB for file: d:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec:  1049.05
MBs/sec:     4.09
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 485
Max_Latency(ms): 771
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 12  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 88

I am at my wit's end trying to troubleshoot this.  There is no way that a latency time on disk writes of over 5s can be normal.  I've run the Intel RST Verification test, and it comes back clean.  Currently the system is not failing per se, but I am worried that the slow performance is a harbinger of things to come.

Interestingly, even the SSD isn't stellar in the area of disk performance:

C:\Program Files (x86)\SQLIO>sqlio -kR -s60 -frandom -o256 -b4 -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 3132832 counts per second
parameter file used: param.txt
        file c:\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 60 secs from file c:\testfile.dat
        using 4KB random IOs
        enabling multiple I/Os per thread with 256 outstanding
using specified size: 100 MB for file: c:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec:  6921.06
MBs/sec:    27.03
latency metrics:
Min_Latency(ms): 1
Avg_Latency(ms): 73
Max_Latency(ms): 84
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 100

This makes me wonder if there is a problem with the entire SATA controller itself.  When I run the test to my RAMDisk, I get GREAT performance, as expected. This tells me that the OS itself is not likely part of the problem:

C:\Program Files (x86)\SQLIO>sqlio -kR -s60 -frandom -o256 -b4 -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 3132832 counts per second
parameter file used: param.txt
        file r:\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 60 secs from file r:\testfile.dat
        using 4KB random IOs
        enabling multiple I/Os per thread with 256 outstanding
using specified size: 100 MB for file: r:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 254852.81
MBs/sec:   995.51
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 1
Max_Latency(ms): 9
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  0 66 34  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
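
One rough way to read the four runs above is Little's Law: average latency is approximately the number of requests in flight divided by the completion rate. The sketch below assumes the queue stays full for the whole run at 2 threads x 256 outstanding I/Os (both taken from the SQLIO output); the function and variable names are made up for illustration.

```python
# Estimate average latency from queue depth and IOs/sec (Little's Law).
# Assumption: all 2 x 256 = 512 requests stay in flight for the whole run.

def expected_latency_ms(threads: int, outstanding_per_thread: int, iops: float) -> float:
    """Average time a request spends queued plus in service, in milliseconds."""
    in_flight = threads * outstanding_per_thread
    return in_flight / iops * 1000.0

# (IOs/sec, reported Avg_Latency in ms) copied from the runs above.
RUNS = {
    "RAID1 write (d:)": (254.93, 1974),
    "RAID1 read (d:)": (1049.05, 485),
    "SSD read (c:)": (6921.06, 73),
    "RAM disk read (r:)": (254852.81, 1),
}

estimates = {name: expected_latency_ms(2, 256, iops) for name, (iops, _) in RUNS.items()}

for name, (iops, reported) in RUNS.items():
    print(f"{name:20s} estimated {estimates[name]:8.1f} ms, reported {reported} ms")
```

For the three disk-backed runs the estimate lands within a few percent of the reported average (the RAM disk comparison is coarse because SQLIO reports whole milliseconds). That suggests much of the reported latency is time spent waiting in a 512-deep queue, so IOs/sec is probably the more useful number to compare between machines, and a rerun with a smaller -o value would show per-request latency more directly.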

I've tried calling Alienware tech support 4 times today.  The first time, I got a guy who put me on hold for over 35 minutes, then the line was disconnected.  When I called back, I got the SAME guy.  Immediately after I gave him my Service Tag, he hung up on me.  I called a third time, and got a girl this time.  She put me on hold and I was sitting for about 3 minutes before the line got disconnected again.  The 4th time I got a guy who finally spoke with me and directed me to the wrong support group, who then tried to direct me to the right one, but they were all out for the holiday and I was told to try back on Tuesday.  It could have saved me 2 hours of phone time if the FIRST guy had told me that.

Anyway, does anyone have any other suggestions for troubleshooting performance on an Aurora with a RAID array?  Can someone else with a similarly configured Aurora desktop run the SQLIO tests with the parameters supplied above and let me know what YOU get?  Clearly, 27MB/s is under-performing for the SSD, and 0.99MB/s on the RAID1 array is horrible.  However, I just need some appropriate "next steps" in troubleshooting this.

Thanks in advance.

December 23rd, 2011 12:00

Do you have your Tech Team or just standard?

December 23rd, 2011 21:00

You need to call Dell back and ask for a tier 2 tech, and they will have one call you back. They are the real techs. Your issue is beyond the expertise of the normal techs. I would run it on mine and post, but I have the Area 51 and the mobo is not the same.

My 2 cents..you need a new mobo if reinstalling doesn't work. I think the controller took a [Admin Note: TOU Removed -rp]. Unless you are using the Jmicron or something. You could try reinstalling but that will take a full OS reinstall I think if you want to be sure.

Did you try the Intel drivers yet or are you running off Windows?

December 28th, 2011 11:00

I would still LOVE it if you could run the test.  I'm just trying to find out if my IO numbers make sense, or if they are completely f-whacked out. (Don't TOU me, man!)  To me, these latencies are just WAY too high -- why should it take 5s for a disk write request?  If you can run the SQLIO tests for both your SSD RAID0 array, as well as for the HDD, that would be much appreciated.

I don't have another 230GB SSD to install fresh right now, but I do have a 120GB that I can probably use as a test.  

I am using the Intel driver v9.6.0.1014 dated 3/3/2010 for the Intel(R) ICH8R/ICH9R/ICH10R/DO/5 Series/3400 Series SATA RAID Controller in Win7-x64.

December 28th, 2011 11:00

Don't know.  I looked at the invoice, and there is no mention of support on there.  Maybe that explains why they are giving me the runaround?

I've spoken to 5 people already just today (in addition to the multiple people last week), and it seems that the only thing they can do is forward me to someone else, who forwards me to someone else, who either hangs up on me or forwards me back to the original tech support voicemail queue again.  Even requesting Tier2 support doesn't seem to help.  I've done this twice now today -- the first time they sent me to a voicemail prompt where the number keys on my phone didn't work, and I had no option but to hang up.  The second time, I'm still on hold, and it's been 15 minutes and counting.

December 28th, 2011 12:00

I would run the manufacturer's supplied, self bootable, deep diags/ long confidence test program on all HDDs. You might have to remove any RAID config so the physical drives can be seen as individuals.

Here are some notes about RST. Basically, it should work without the Intel RST installed ... but to find out, it has to be a fresh install without it ever having been installed. Maybe build up the system fresh and install RST toward the end. Run an Acronis image before and after the RST install.

Notes:

The Intel Storage (RST) software is optional in a single "spinning drive" environment.
If running a RAID HDD setup or SSD, some say it's recommended. Install Intel RST 9.6.0.1014 (or latest newer version).
Once installed, it replaces the Microsoft HDD driver and CANNOT be un-installed or removed (even if the app. interface is uninstalled).
The Intel ICH-10 SATA hardware RAID solution works from the BIOS. The Intel RST software is just a way to monitor and change RAID parameters from inside Windows (but not required for RAID operation).
If trying to use external eSata drives with port-multiplexing, you should probably skip it (due to incompatibilities). Other RAID controllers (including OCZ Revo) might also have a problem with it installed.

 

December 28th, 2011 12:00

Is the OS on your machine the original install from Dell, and did you build your data from scratch? Or did you download / transfer any data onto your new drives?

December 29th, 2011 06:00

Yeah, I would suspect either one or more of the HDDs going bad (usually doesn't happen), an issue with the mobo and SATA interface, or a software issue.
