
Poor performance PERC H710P

August 24th, 2016 19:00

I am experiencing a serious performance issue with a PERC H710P with 2 RAID 1 Mirrors. 

Sys config:

2 x Intel Xeon E5-2620 @ 2.0 GHz
32 GB RAM
Mirror 1: 2 x 2 TB SAS drives
Mirror 2: 2 x 2 TB SAS drives
Windows SBS 2011 Server

With no users connected to the server, I run Parkdale (an HDD read/write benchmarking program) against mirror 1 (which contains the OS and all Windows functions). On the first pass I get a sequential write of 350-400 MB/s, sequential read of 400-600 MB/s, random write of 35-45 MB/s, and random read of 40-55 MB/s. If I immediately run the test again, the sequential read/write figures are typically cut in half (sometimes the sequential write drops below 100 MB/s).

Running the same test on the 2nd mirror (which contains the files for a company-wide CRM program), the numbers are always lower (sequential write 100-200 MB/s, sequential read 250-400 MB/s), and immediately re-running the test produces the same performance drop.

The CRM includes a diagnostic program that tests processor and disk performance. It doesn't report hard numbers, only a bar graph. On the overall test the server scores mid-range: on a scale of 1 to 10 (1 being good), a typically fast server would be a 2, a typically slow one a 6, and this server is a 4. On the hard drive test, using the same scale, a typically fast server would be a 2, a typically slow one a 6, and this server would be about a 20 (the bar runs off the edge of the page).

I have updated the firmware on the RAID controller, and both virtual disks have cache enabled and are set to write-back. The server has all available updates installed. For testing I have also stopped all SQL services, disconnected the server from the network, and stopped every service I could, but nothing seems to make a difference.
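(Side note for anyone checking the same thing: the cache policy actually in effect, and the battery state, can be confirmed from the OS, assuming the LSI/Avago MegaCli utility is installed (MegaCli.exe on Windows, MegaCli64 on Linux); the PERC silently falls back to write-through while the BBU is failed or in a learn cycle.)

# Show the cache policy currently in effect for every virtual disk
MegaCli -LDGetProp -Cache -LAll -aAll
# Show battery/BBU status; a failed or learning BBU disables write-back
MegaCli -AdpBbuCmd -GetBbuStatus -aAll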

Any ideas? Any suggestions? Am I missing something?

Moderator • 8.8K Posts • August 25th, 2016 09:00

Johnanemone,

While we don't support resolving performance issues, mostly because of the large number of variables that can cause them (not all of which are the system itself), what I do suggest, and what you can start doing on your own, is to bring the entire server up to current versions. You stated you updated the RAID controller, but have you also updated the BIOS, iDRAC, NICs, and the hard drives themselves? I would verify the server is entirely up to date and see if that resolves the performance issues.

Before updating, verify how far back your BIOS and iDRAC are, as jumping too many revisions at once can cause communication issues between the server and the motherboard.

Let me know your current versions and I can give you an upgrade path to current. 
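If it helps, the current versions can usually be pulled remotely from the iDRAC, assuming racadm is available (locally or over SSH); the exact output varies by iDRAC generation:

# Summary of BIOS, iDRAC and system information
racadm getsysinfo
# Full firmware inventory (Lifecycle Controller, iDRAC7 and newer)
racadm swinventory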

1 Rookie • 48 Posts • August 29th, 2016 22:00

Grab some cheap E5-2670 processors off eBay. That should help.

1 Rookie • 91 Posts • May 17th, 2017 06:00

Similar problem on an R620 - 2 x Intel Xeon E5-2643 v2 @ 3.50 GHz (12 cores total) - 128 GB RAM.

A single 1 TB volume - RAID 1 - write-back - 7,200 RPM disks.

OS: Oracle Linux (Oracle VM Server).

All latest firmware - BIOS 2.5.4 - system profile set to PERFORMANCE.

PERC Firmware 21.3.4-0001n

When copying large files (cp) from the OS, things go well for a little while, but then the system becomes totally unresponsive. I believe it happens once the disk write cache is full.

I also tested disk performance with DiskMark64 from a Windows 2008 R2 VM running on top of the hypervisor, with similar behaviour: with a small amount of data (500 MB) performance is excellent, but it drops to ridiculous levels with a larger amount (2 GB).
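One way to see whether it really is cache exhaustion, and whether the OS page cache is involved as well, is to watch how much dirty data the kernel is holding while the copy runs (a rough check, assuming a standard Linux shell on the hypervisor):

# Dirty = data queued in RAM, Writeback = data currently being flushed to the controller
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'
# Current writeback thresholds; large ratios let many GB queue up before flushing starts
sysctl vm.dirty_ratio vm.dirty_background_ratio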

Any hint?

4 Operator • 1.8K Posts • May 17th, 2017 08:00

"When copying large files (cp) from the OS , things are going well for a little while but the system becomes totally unresponsive. I believe that it occurs once the disk write cache is full."

Unresponsiveness is not normal, nor is it the fault of the RAID controller, unless there is an issue with the firmware (which is unlikely), the OS, or the program used for testing. Your OS or system devices are the likely culprit if this happens in normal operation.

For performance tuning on Windows Server (not specifically for 2011, but most of it is relevant), I would look into SMB signing and the SMB protocol version:

msdn.microsoft.com/.../dn529133(v=vs.85).aspx

Windows Server is not optimized for transfers of large files by default; most file operations are optimized for smaller chunks of data unless a number of modifications are made to the OS. RAID adapters are also optimized for small data sizes by default.

"perfs are excellent but they drop to ridiculous for large amount of data (2GB)."

Unless you're testing a server tuned for 2 GB files (a database server, for example), 2 GB is a ridiculously large file to use for performance testing, especially for file transfer speed. Your cache fills up and stops helping before the test is even halfway done.

Lastly, RAID 1 is not super fast: for each write, the RAID adapter must commit the data to both disks in the mirror. The second write can be hidden by write-back caching, but once the cache fills there is definitely a write penalty. If you want far greater performance you need RAID 10.

1 Rookie • 91 Posts • May 17th, 2017 08:00

Note that the source server for the file transfer is an R610 with a PERC H700 card.

When I copy the other way round (i.e. from the server with the PERC H710 to the server with the PERC H700), there is no problem; the receiving R610 server stays responsive during the copy.

1 Rookie • 91 Posts • May 17th, 2017 08:00

Copying from a server with a PERC H700 to another server with a PERC H730P Mini is OK...

Only the server with the PERC H710P Mini has performance problems.

1 Rookie • 48 Posts • May 17th, 2017 10:00

I have noticed similar issues with ESXi on an R620 with an H710P. It's apparently an issue with the write cache becoming full, because the controller can't flush the cached data to the drives fast enough. When this happened there was an I/O bottleneck in the OS and the VMs, but they were never unresponsive as in your case. I was able to overcome it by upgrading to faster drives and putting SSDs on FastPath to bypass the controller cache. No more I/O bottleneck after that.

Maybe you will need to review your OS's disk cache policy. Also try disabling Adaptive Read Ahead to see if it helps.
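If MegaCli (or the equivalent perccli/storcli) is available in the hypervisor, the read-ahead policy can be checked and changed from the command line; the same settings are also exposed in the PERC BIOS and OpenManage. A sketch, assuming MegaCli:

# Show current read/write cache policy for all virtual disks
MegaCli -LDInfo -LAll -aAll
# Disable read-ahead (-NORA); -ADRA restores adaptive read-ahead, -RA forces it on
MegaCli -LDSetProp -NORA -LAll -aAll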

1 Rookie • 91 Posts • May 18th, 2017 01:00

Hi,

Thank you all for your replies.

Yes, I understand writes becoming slower, even much slower, when the cache is full, but the system should not become unresponsive as it does in our case. While a large file copy is running, a single ls in another directory takes 30 s.
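(In case it is useful to anyone hitting the same symptom: one common mitigation on Linux is to cap how much dirty page cache a single copy can accumulate, so writeback starts early and other processes are not starved. The values below are only illustrative, not a recommendation for this exact box.)

# Start background flushing at ~64 MB of dirty data instead of a percentage of 128 GB RAM
sysctl -w vm.dirty_background_bytes=67108864
# Block writers once ~256 MB is dirty (setting the *_bytes values automatically clears the *_ratio values)
sysctl -w vm.dirty_bytes=268435456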

This is happening only on that R620 / H710P.

On R610 / H700 and R730 / H730P, no such problem.

In all three cases the OS is exactly the same: Oracle VM Server 3.4.2 (default "next, next" install) without any additional drivers.

# uname -a
Linux mis-ovs100-int 4.1.12-61.1.9.el6uek.x86_64 #2 SMP Tue Sep 13 22:53:36 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux

In particular, the card driver is the one below:

# modinfo megaraid_sas
filename:       /lib/modules/4.1.12-61.1.9.el6uek.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description:    Avago MegaRAID SAS Driver
author:         megaraidlinux.pdl@avagotech.com
version:        06.810.09.00-rc1
license:        GPL
srcversion:     4A07642CA75263BE57CCA2A

1 Rookie • 91 Posts • May 19th, 2017 01:00

Tested again with write caching disabled on the RAID 1 volume (Write Through policy).
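(For reference, the write policy can also be toggled per virtual disk from the OS, assuming MegaCli is installed; L0/a0 below mean the first virtual disk on the first adapter. The same switch is available in the PERC BIOS and OpenManage.)

# Switch virtual disk 0 on adapter 0 to Write Through
MegaCli -LDSetProp WT -L0 -a0
# Switch it back to Write Back
MegaCli -LDSetProp WB -L0 -a0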

Same behavior: after a while the operating system becomes unresponsive. While a copy is in progress (100 GB file), it takes 30 s to open a new prompt on the server.

I suspect a flaw in the firmware, because the other servers (R610 / PERC H700 and R730 / PERC H730P Mini) behave correctly.

1 Message • February 24th, 2020 12:00

I am experiencing the same problem on an R720 with an H710P. I can run the following command

dd if=/dev/zero of=tempfile bs=1M count=100000 status=progress

which writes 100 GB to a file. I watch the speed degrade from over 400 MB/s to under 30 MB/s. I have 3 drive groups with 3 different VDs. After the speed degrades I can't get any drive to write faster than 7 MB/s. I see speeds as low as 747 KB/s until I reboot.

Is there a way to clear the write cache? Is there a better driver to use for Linux than the megaraid_sas driver?
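(Two things that may be worth trying, assuming standard Linux tooling: dropping the OS page cache, which is separate from the controller's own cache, and re-running the dd test with direct I/O so the page cache is taken out of the measurement entirely.)

# Flush dirty data and drop the OS page cache (does not touch the PERC's on-card cache)
sync && echo 3 > /proc/sys/vm/drop_caches
# Same 100 GB write test, but bypassing the page cache with O_DIRECT
dd if=/dev/zero of=tempfile bs=1M count=100000 oflag=direct status=progress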

Specs
Product Name = PERC H710P Mini
Serial Number = 2AH00SY
SAS Address =  5848f690e6113800
PCI Address = 00:03:00:00
System Time = 02/24/2020 11:59:54
Mfg. Date = 10/21/12
Controller Time = 02/24/2020 19:59:54
FW Package Build = 21.3.5-0002
BIOS Version = 5.42.00.1_4.12.05.00_0x05290003
FW Version = 3.131.05-8148
Driver Name = megaraid_sas
Driver Version = 07.706.03.00-rc1
Vendor Id = 0x1000
Device Id = 0x5B
SubVendor Id = 0x1028
SubDevice Id = 0x1F34
Host Interface = PCI-E
Device Interface = SAS-6G
Bus Number = 3
Device Number = 0
Function Number = 0
Drive Groups = 3

Moderator • 8.7K Posts • February 25th, 2020 09:00

Hi,

What RAID level are the virtual disks? Which distro are you using? RHEL driver: https://dell.to/2VsKqRI

 

1 Message • March 28th, 2020 14:00

I was dealing with pretty much the same issue on both a PowerEdge T320 and a T620. I had a spare H710 adapter that was still on FW 21.3.4-0001, and with it installed the slow performance went away. So maybe try reverting the FW to 21.3.4-0001 and see if that helps. I am going to do the same on the other H710 I have, to see if it resolves its slow performance.

1 Message • August 14th, 2020 10:00

Hi to everybody. 

We had the same problem with ESXi 6.5 on an R520 with an H700.

We followed a lot of suggestions - firmware update, downgrade to an older version, changing write-back to pass-through, configuration changes in ESXi, etc. - with no results.

Finally, the problem was an incompatible SSD. The H700 is not compatible with Samsung EVO 850 Pro SSDs (it appears to be an incompatibility between the SSD controller and the H700 controller). Everything was configured correctly, but when you copy more than 4 GB and the H700 cache fills up, the write speed drops to 10-20 MB/s. Reads were no problem.

If we disabled the H700 cache, writes ran at more or less 10-15 MB/s.

Using another SSD, such as an 860 Pro or a Crucial MX500, solved the problem, and it writes at 500 MB/s.

Take a look at the drives first. You can drive yourself crazy looking for the problem in software when it could be something as simple as a hardware incompatibility.

Thanks a lot to everybody and good luck. 

12 Posts • August 29th, 2020 10:00

Hello,

 

We have an issue on an R620 with an H710 Mini: the server freezes when we try to copy a backup to the backup server via FTP.

The current firmware version is 21.3.5-0002. We moved the disks to 2 different servers with the same H710 and we have the same issue on both.

We also have an external 4 TB USB 3.0 disk, but the issue is the same whether we copy from the RAID 10 VD (Samsung SSDs) to the USB disk, from the USB disk to the backup server over the network, or from the RAID to the backup server over the network.

Does anyone else have this issue, and would a firmware downgrade resolve it?

12 Posts • August 31st, 2020 03:00

Hello,

 

We also have another server with an H310, with a RAID 10 created on the same HDDs, and everything works fine with the H310.

 

Regards.
