Unsolved

April 16th, 2015 15:00

SSD performance on R430 server just dropped through the floor

We've had a new R430 server in operation for about 4 months now, but in the last two days SSD performance has dropped off a cliff. No config changes were made, so I'm not sure what happened. Until two days ago the SSDs had been exceptionally fast.

I don't think you can TRIM from the command line (or do I even need to?). Is there perhaps a BIOS setting I need to enable for it?
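On Linux, TRIM is normally issued from the command line with fstrim (part of util-linux), but only if the block device advertises discard support to the kernel. The sketch below checks that by reading the device's discard_max_bytes value in sysfs. It assumes the RAID1 virtual disk shows up as sda. The discard_supported helper name and its optional second argument are made up for illustration, and a virtual disk behind a hardware RAID controller may report no discard support at all.

```shell
#!/bin/sh
# Report whether a block device advertises discard (TRIM) support.
# Usage: discard_supported DEVICE [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys; the second argument is a hypothetical
# convenience so the function can be exercised against a fake tree.
discard_supported() {
    dev=$1
    sysfs=${2:-/sys}
    f="$sysfs/block/$dev/queue/discard_max_bytes"

    # No readable attribute: we cannot tell either way.
    [ -r "$f" ] || { echo "unknown"; return 2; }

    # A value of 0 means the kernel will not send discards to this device.
    if [ "$(cat "$f")" -gt 0 ]; then
        echo "yes"
    else
        echo "no"
        return 1
    fi
}
```

If it prints yes, running fstrim -v / as root trims the free space on the root filesystem and reports how many bytes were trimmed. If it prints no, fstrim will fail on that device and TRIM is not something the OS can trigger through this controller path.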

The SSDs are 2x LITEON IT ECE-400NAS 400GB in a RAID1 setup behind a PERC H330 card, in an R430 server running CentOS 7.1.

Plenty of disk space is available:

# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  354G  101G  254G  29% /

Read speed benchmark:

# hdparm -Tt /dev/sda
Timing cached reads:   14570 MB in  2.00 seconds = 7291.07 MB/sec
Timing buffered disk reads:  32 MB in  3.02 seconds =  10.60 MB/sec

And write speed benchmark:

# dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc
1073741824 bytes (1.1 GB) copied, 148.132 s, 7.2 MB/s

So you can see 10MB/s reads and 7MB/s writes aren't really going to cut it. Where should I start looking for solutions?
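As a sanity check on the write figure, the rate can be recomputed from the byte count and elapsed time that dd prints. This is a small sketch, and the dd_rate helper name is made up for illustration. It reads dd's summary line on standard input and prints decimal MB/s (10^6 bytes per second), which is the unit dd uses.

```shell
#!/bin/sh
# Recompute throughput from a dd summary line such as:
#   1073741824 bytes (1.1 GB) copied, 148.132 s, 7.2 MB/s
# Field 1 is the byte count; the elapsed seconds sit three fields
# before the end of the line ("<secs> s, <rate> MB/s").
dd_rate() {
    awk '/bytes .*copied/ { printf "%.1f\n", $1 / $(NF - 3) / 1000000 }'
}
```

Feeding it the summary line from the run above gives 7.2, matching what dd reported. Because dd writes its summary to stderr, a live run would need something like dd ... 2>&1 | dd_rate.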

14 Posts

April 16th, 2015 18:00

Yes, I rebooted the server but saw no change. Also, I have the DSU repository installed, but it is telling me I've got the latest available firmware installed:

[-]10 PERC H330 Mini Controller 0 Firmware
 Current Version : 25.2.1.0037 same as : 25.2.1.0037

Why would the updated firmware you posted not be reflected by DSU?

Moderator

8.5K Posts

April 16th, 2015 18:00

Hi,

Are you able to reboot the server? Do you have Server Administrator installed? Does either of the drives show as failed? Can you update the controller firmware? The update fixed some issues with SSDs falling offline: http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=5K40N

Moderator

8.5K Posts

April 16th, 2015 19:00

It came out a few days ago and has not been added to the repository yet. It is version 25.2.2.0004.

14 Posts

April 16th, 2015 22:00

I updated the PERC firmware, but there is no change in performance.

I installed OMSA 8.0.2, but the Storage tab shows the error "The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible."

However, a short SSD health check run with smartctl on each drive completed without error:

smartctl --test=short /dev/sda -d megaraid,1
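To repeat the same check on both drives and then read back the results, the command above can be wrapped in a small loop. This is only a sketch. The smart_each helper name is made up for illustration, and the device IDs 0 and 1 are assumed to be the two SSDs behind the PERC, as in this thread. The SMARTCTL variable is an assumed convenience that allows a dry run with echo.

```shell
#!/bin/sh
# Run one smartctl action against each physical disk behind a
# MegaRAID-based controller.
# Usage: smart_each ACTION DEVICE ID [ID...]
# Set SMARTCTL=echo to print the arguments instead of running smartctl.
smart_each() {
    action=$1
    dev=$2
    shift 2
    for id in "$@"; do
        "${SMARTCTL:-smartctl}" "$action" -d "megaraid,$id" "$dev"
    done
}
```

For example, smart_each --test=short /dev/sda 0 1 starts a short self-test on both drives. A few minutes later, smart_each --log=selftest /dev/sda 0 1 shows the self-test log for each drive, if the drive supports that log.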

Moderator

8.5K Posts

April 17th, 2015 15:00

What version is it showing for the SAS backplane? Can you do a full power cycle (power off, then power on) rather than a normal reboot, so that the PERC can be re-initialized? Also try checking the BIOS settings for low latency: http://i.dell.com/sites/content/shared-content/data-sheets/en/Documents/configuring-low-latency-environments-on-dell-poweredge-12g-servers.pdf

It is the same on 13G servers.

Moderator

8.5K Posts

April 17th, 2015 17:00

I have not seen anything about the drive model that you have.

14 Posts

April 17th, 2015 17:00

I was speaking with Dell support this morning, and apparently there may be an issue with the LITE-ON SSD firmware, but there is no ETA for when a firmware update might be released. Have you heard anything about that?

SMART output from both SSD drives #0 and #1 behind H330 PERC

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-229.1.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/sda [megaraid_disk_00] [SAT]: Device open changed type from 'megaraid,0' to 'sat+megaraid,0'
=== START OF INFORMATION SECTION ===
Device Model:     LITEON IT ECE-400NAS
Serial Number:    TW0949GX550854BC0083
LU WWN Device Id: 5 002303 10032ef95
Add. Product Id:  DELL(tm)
Firmware Version: LPCF11XC
User Capacity:    400,088,457,216 bytes [400 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Apr 17 12:36:09 2015 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  10) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0002   100   100   070    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       1967
 12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always       -       22
 13 Read_Soft_Error_Rate    0x0002   100   100   000    Old_age   Always       -       0
175 Program_Fail_Count_Chip 0x0003   100   100   000    Pre-fail  Always       -       0
176 Erase_Fail_Count_Chip   0x0003   100   100   000    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0003   100   100   000    Pre-fail  Always       -       5092
178 Used_Rsvd_Blk_Cnt_Chip  0x0003   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0003   100   100   000    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0002   100   100   005    Old_age   Always       -       15872
181 Program_Fail_Cnt_Total  0x0002   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0002   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   100   100   000    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x0002   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0002   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0002   100   100   000    Old_age   Always       -       0
201 Unknown_SSD_Attribute   0x0003   100   100   000    Pre-fail  Always       -       0
202 Unknown_SSD_Attribute   0x0003   100   100   000    Pre-fail  Always       -       0
232 Available_Reservd_Space 0x0003   100   000   010    Pre-fail  Always   In_the_past 100
233 Media_Wearout_Indicator 0x0002   100   100   000    Old_age   Always       -       45802
241 Total_LBAs_Written      0x0003   100   100   000    Pre-fail  Always       -       45802
242 Total_LBAs_Read         0x0003   100   100   000    Pre-fail  Always       -       17266
245 Unknown_Attribute       0x0002   099   000   010    Old_age   Always   In_the_past 99
SMART Error Log not supported
No Events found!
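In a table this long it is easy to miss the rows where WHEN_FAILED is something other than "-". The sketch below filters the attribute table down to those rows. The smart_failed_attrs helper name is made up for illustration, and the filter assumes the ten-column layout shown above.

```shell
#!/bin/sh
# Print SMART attributes whose WHEN_FAILED column is not "-".
# Reads "smartctl -A"-style output on stdin. Attribute rows start with
# a numeric ID and have at least ten whitespace-separated fields:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
smart_failed_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $9 != "-" {
        print $1, $2, "value=" $4, "worst=" $5, "thresh=" $6, $9
    }'
}
```

Run against the output above, this would list attributes 232 (Available_Reservd_Space) and 245 (Unknown_Attribute). Both show a worst value of 000 against a threshold of 010 and are marked In_the_past.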