August 26th, 2008 20:00

Very slow reads with SuSE Linux/OES2

A few months ago we purchased what I thought should be a very capable server to run our file, print, and other services on SuSE Linux Enterprise Server 10 SP1 with Novell Open Enterprise Server 2. Unfortunately, so far in my pre-production tests it has proven totally unfit performance-wise, and I just don't know what to do.

 

Let me start by describing one simple test that shows just how bad the problem is. I connected my new workstation with a Gb NIC directly to the server via a crossover cable. I copied 12.6 GB of data from the server to my workstation (with my antivirus off). It took 11 minutes, 30 seconds, an unbearable rate of 18.3 MB/s. I obviously can't put the server into production like this.
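For what it's worth, one way to separate the raw disk read speed from the network path would be to time a large sequential read on the server itself. A minimal sketch only; the file path is a hypothetical example on the NSS volume, and the file should be bigger than the 8 GB of RAM so the page cache doesn't flatter the number:

# Read a large file straight off the NSS volume and discard it;
# the path is only an example -- any file larger than RAM will do.
time dd if=/media/nss/VOL1/public/bigfile.iso of=/dev/null bs=1M

If that local read also lands in the 18-30 MB/s range seen elsewhere in these tests, the bottleneck is presumably on the disk/NSS side rather than the network.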

 

The server is a PowerEdge 2900 III - Quad core Xeon 2.5 GHz, 8 GB of RAM. The hard drives are all 15K RPM SAS connected to a PERC 6/i RAID controller. I've installed the 64-bit versions of SLES 10 SP1 and OES 2 onto two of the drives which are configured as RAID1 with ext3. The other 4 drives are arranged in RAID5 with a Novell NSS partition for our data. It also has an LTO4 tape drive, which I chose in part for its speed.
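One related thing worth confirming (assuming the Dell OpenManage srvadmin tools are installed; the controller and vdisk IDs below are placeholders) is the read/write cache policy on the PERC 6/i virtual disks, since a RAID5 virtual disk left at write-through or no-read-ahead can be far slower than the drives themselves:

# List virtual disks on the first controller, including read/write policy
omreport storage vdisk controller=0

# Example only: change a virtual disk to adaptive read-ahead / write-back
omconfig storage vdisk action=changepolicy controller=0 vdisk=1 readpolicy=ara writepolicy=wb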

 

The problem began when backup tests resulted in very slow speeds, around 12 MB/s. I was eventually referred to a couple of HP test utilities which seemed to indicate that the problem is with our hard disk system, not the tape drive. I've done all kinds of benchmark tests on the drives since then, but I'll just include the most recent ones here. I've used Bonnie (I couldn't get Bonnie++ to work correctly on the NSS/RAID5) and IOzone, which is apparently favored by Novell support.

 

After making sure all of the necessary software, firmware, and driver updates were in place, Dell gave me two suggestions: to experiment with blockdev, and to try changing the I/O elevator. Neither really seemed to make much difference, although I may not have known what I was doing.
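For reference, the kind of changes Dell was pointing me at look roughly like the following sketch. /dev/sdb is only a placeholder for whichever block device holds the RAID5 array, and the values are examples, not recommendations:

# Show the current read-ahead for the device (in 512-byte sectors)
blockdev --getra /dev/sdb

# Try a larger read-ahead, e.g. 8192 sectors (4 MB)
blockdev --setra 8192 /dev/sdb

# Show the available I/O elevators; the one in brackets is active
cat /sys/block/sdb/queue/scheduler

# Switch this device's elevator, e.g. to deadline
echo deadline > /sys/block/sdb/queue/scheduler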

 

Questions:
(1) Are these benchmarks valid? Were these good tools to use, and did I use them properly?
(2) Might this be a hardware problem, or is it more likely software (a driver, SLES, or OES)?
(3) Is there anything I can do to fix the situation and make the server usable?

 

I would appreciate any assistance. Things are getting so dire that I'm considering putting NetWare on the server.

 

 

Jim Wagner

 

 

Benchmarks used an 8 GB file size (equal to the server's RAM) to minimize the effects of caching.

 

Bonnie- System (ext3/RAID1)

 

oestest4:~ # bonnie -s 8192

Bonnie 1.4: File './Bonnie.26368', size: 8589934592, volumes: 1

Writing with putc()... done: 57898 kB/s 77.7 %CPU

Rewriting... done: 49505 kB/s 6.4 %CPU

Writing intelligently... done: 151523 kB/s 30.6 %CPU

Reading with getc()... done: 58176 kB/s 68.5 %CPU

Reading intelligently... done: 111187 kB/s 6.2 %CPU

Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...

              ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
oestes 1*8192 57898 77.7 151523 30.6 49505 6.4 58176 68.5 111187 6.2 1719.8  1.4

 

 

Bonnie- Data (NSS/RAID5)

 

oestest4:/media/nss/VOL1/public/junk # bonnie -d . -s 8192

Bonnie 1.4: File './Bonnie.14854', size: 8589934592, volumes: 1

Writing with putc()... done: 71004 kB/s 99.5 %CPU

Rewriting... done: 28264 kB/s 0.5 %CPU

Writing intelligently... done: 314260 kB/s 37.2 %CPU

Reading with getc()... done: 19187 kB/s 4.8 %CPU

Reading intelligently... done: 31983 kB/s 0.0 %CPU

Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...

              ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
oestes 1*8192 71004 99.5 314260 37.2 28264 0.5 19187  4.8  31983 0.0 1094.0  3.7

 

IOzone- System

 

Excel chart generation enabled

Auto Mode

File size set to 8388608 KB

Record Size 512 KB

Command line used: ./iozone -Ra -s 8G -r 512K

Output is in Kbytes/sec

Time Resolution = 0.000001 seconds.

Processor cache size set to 1024 Kbytes.

Processor cache line size set to 32 bytes.

File stride size set to 17 * record size.

                                                    random  random    bkwd  record  stride
       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
  8388608     512  155817  100328  109573  108636   89959   69436   94781 3623919  105521  151075   144792  110306  111444

 

IOzone- Data

 

Excel chart generation enabled

Auto Mode

File size set to 8388608 KB

Record Size 512 KB

Command line used: ./iozone -Ra -s 8G -r 512K

Output is in Kbytes/sec

Time Resolution = 0.000001 seconds.

Processor cache size set to 1024 Kbytes.

Processor cache line size set to 32 bytes.

File stride size set to 17 * record size.

                                                    random  random    bkwd  record  stride
       KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
  8388608     512  276143  225427   31951   31984   24585   63248   30555 5745377   28546  264559   255717   31962   31976

36 Posts

September 11th, 2008 19:00

One question: have you attempted the same test using multiple threads for your copy rather than just a single thread? It looks as though your IOzone numbers (if I'm reading them right) are good (100 MB/s+). Multiple simultaneous copies might give you a better feel for the true performance.
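If it helps, a rough way to get a multi-stream number out of IOzone is its throughput mode. A sketch only, with the thread count and sizes as examples (-s is the per-thread file size):

# 4 threads, 2 GB per thread, 512 KB records,
# write/rewrite (-i 0) and read/reread (-i 1) tests
./iozone -t 4 -s 2G -r 512K -i 0 -i 1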

5 Posts

September 12th, 2008 12:00

I don't know if it was the only issue, but I've been able to improve performance on the NSS/RAID5 volume to satisfactory levels with some NSS changes suggested by Novell. It may not be optimal, but it's good enough.
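For anyone finding this later: the exact parameters aren't listed here, but tuning of this general kind is entered at the nsscon console or placed in /etc/opt/novell/nss/nssstart.cfg. The lines below are illustrative examples only, not the actual changes that were made, so check the switch names and values against Novell's NSS documentation:

# Illustrative only -- verify names and values in Novell's documentation
nss /ReadAheadBlks=VOL1:64     # read-ahead blocks for the data volume (example value)
nss /IDCacheSize=131072        # larger ID cache (example value)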