Unsolved
2 Intern
•
202 Posts
0
1791
April 20th, 2010 02:00
CLARiiON RAID 6 read performance under simultaneous writes
Hi!
Our customer has connected one AIX server to a CX3.
He runs TSM on this server (TSM version 6.1.3.2).
He observed the following performance degradation:
When he writes to the LUN, the write performance is 50 Mb/s,
but when he reads from the same LUN at the same time, he gets only 3-5 Mb/s
read performance. It is an HSM system, and this is not acceptable for the
customer. When he only reads or only writes to this LUN, the performance is acceptable.
He wants to solve this performance problem because he needs to write to and read from
the LUN at the same time. It is an HSM system, and he wants to prioritize read operations over
write operations.
We ran some performance tests:
Test1: (AIX host LUN read/write test with R6 LUN)
We copied data to this LUN (LUN 10) via NFS and simultaneously read the data with the
TSM HSM client, called dsmmigrate.
The write performance is 50-60 Mb/s; however, the read performance is quite
poor in this case (5-10 Mb/s).
The customer wants to improve the read result in this test.
Test2: (AIX host LUN read test with R6 LUN)
If we are not writing to this LUN (LUN 10), the read performance is 100 Mb/s.
Test3: (Windows host read/write test with R6 LUN)
The Windows host gives the same result as the AIX host in Test1.
Write: 50-60 Mb/s
Read: 5-10 Mb/s, which isn't acceptable
We also get the same result on the Windows side when we test with
OS tools (write: copy, read: copy).
Test4: (Windows host read/write test with R5 LUN)
The customer writes to an R5 LUN from the local disk and reads the data with the
TSM HSM client. The performance is:
Write: 50 Mb/s
Read: 50 Mb/s
Read performance is at an acceptable level in this case.
During the tests we used RAID groups that are
not used by other hosts or applications. These RAID groups are used only for this
purpose.
The customer MUST use R6 LUNs, and the normal workload is concurrent read/write to the
LUNs, but in this case the read performance is not at an acceptable level.
The main workload for LUN 10 is reads, and they want to improve read performance
or prioritize reads over writes on this LUN to reach better read performance.
LUN 10 is the only LUN in its RAID group, and the RAID group consists of 14 x 500 GB FC disks.
Any tips and ideas are appreciated. We tried modifying the cache settings and the LUN 10 prefetch settings, and upgrading FLARE to the latest release 26,
but nothing helped.
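The mixed workload in Test1 and Test3 can be reproduced without TSM or NFS using a small script. The following is only a minimal sketch: the file names, the 1 MiB I/O size, and the `mixed_test` helper are assumptions for illustration, not anything used in the tests above. It reads one file while a second thread writes another file in the same directory. The host file system cache can inflate the read figure, so the test file should be much larger than host memory for the numbers to reflect the array.

```python
import os
import tempfile
import threading
import time

CHUNK = 1024 * 1024  # 1 MiB per I/O (assumed size, not taken from the thread)


def write_file(path, total_bytes):
    """Write total_bytes of zeros to path and return throughput in bytes/s."""
    buf = bytes(CHUNK)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            f.write(buf)
            written += len(buf)
        f.flush()
        os.fsync(f.fileno())  # push the data out of the host cache
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed


def read_file(path):
    """Read path sequentially and return (bytes read, throughput in bytes/s)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(CHUNK)
            if not data:
                break
            total += len(data)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed


def mixed_test(directory, size_bytes):
    """Read one file while another file is written in the same directory."""
    src = os.path.join(directory, "read_src.bin")   # hypothetical file names
    dst = os.path.join(directory, "write_dst.bin")
    write_file(src, size_bytes)  # prepare the data that will be read back
    result = {}
    writer = threading.Thread(
        target=lambda: result.update(write_bps=write_file(dst, size_bytes)))
    writer.start()
    result["read_bytes"], result["read_bps"] = read_file(src)
    writer.join()
    return result


if __name__ == "__main__":
    # Point the directory at a file system on the LUN under test.
    with tempfile.TemporaryDirectory() as d:
        r = mixed_test(d, 64 * CHUNK)
        print("read  MiB/s:", r["read_bps"] / CHUNK)
        print("write MiB/s:", r["write_bps"] / CHUNK)
```

Running `read_file` alone on the same file gives the read-only baseline to compare against, as in Test2.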


kelleg
6 Operator
•
4.5K Posts
0
April 20th, 2010 14:00
Is the AIX host connected to the CX3 using fibre channel or iSCSI? How many connections (paths)?
What is the configuration of the raid group - what type of disks - FC or SATA - what is the speed of the disks?
Have you looked at the Navisphere Analyzer archives for this LUN?
What version of Windows are you using? Is the Windows host connected using iSCSI or FC?
Is this a NAS configuration?
glen
paulo3
2 Intern
•
202 Posts
0
April 21st, 2010 06:00
Hi Glen!
>Is the AIX host connected to the CX3 using fibre channel or iSCSI? How many connections - paths
2 FC cards, four paths.
>What is the configuration of the raid group - what type of disks - FC or SATA - what is the speed of the disks?
14 FC disks. Speed is 10k rpm.
>Have you looked at the Navisphere Analyzer archives for this LUN?
Yes. When we write to and read from this LUN at the same time, the LUN utilization reaches 100%.
>What version of Windows are you using? Is the Windows host connected using iSCSI or FC?
2003. Windows is connected to the array via FC.
>Is this a NAS configuration?
No
We tested again, and both hosts produce the same result.
Read only: 8-100 Mb/s
paulo3
2 Intern
•
202 Posts
0
April 21st, 2010 07:00
>When you do the read test, are you reading from the LUN and writing to the tape? If so, have you tried just a simple read?
We tried writing to disk too. We got the same results. The tape can write approx. 160 Mb/s, so it isn't a bottleneck.
>If you look in the Analyzer archive at the disks for the LUN, what is the IOPS on the disks during the read test? What is the IOPS at the LUN during the read test? What is the Read Cache Hit/s during the read test, what is the Read Cache Hit Ratio during the read test?
Sorry, I don't know the Analyzer figures.
kelleg
6 Operator
•
4.5K Posts
0
April 21st, 2010 07:00
When you do the read test, are you reading from the LUN and writing to the tape? If so, have you tried just a simple read?
If you look in the Analyzer archive at the disks for the LUN, what is the IOPS on the disks during the read test? What is the IOPS at the LUN during the read test? What is the Read Cache Hit/s during the read test, what is the Read Cache Hit Ratio during the read test?
glen
naughty-natty
1 Message
0
April 21st, 2010 21:00
Hi,
Can you gather the SPCollects and NAR files taken when you faced this performance problem and forward them to nannumal.thekkelal@gmail.com?
kelleg
6 Operator
•
4.5K Posts
0
April 22nd, 2010 10:00
At this point I would recommend opening a case with EMC for performance issues.
glen
paulo3
2 Intern
•
202 Posts
0
April 22nd, 2010 14:00
I opened it already.
kelleg
6 Operator
•
4.5K Posts
0
April 22nd, 2010 14:00
What's the case number?
glen
paulo3
2 Intern
•
202 Posts
0
April 23rd, 2010 02:00
34306932
kelleg
6 Operator
•
4.5K Posts
1
April 23rd, 2010 12:00
In the Documents section, see the following document - see page 40 for a description of the different types of disks, speeds and performance.
EMC CLARiiON Best Practices for Fibre Channel Storage - CLARiiON Release 24.pdf
Glen
SKT2
2 Intern
•
1.3K Posts
0
April 24th, 2010 04:00
Have you considered "powerpath write throttling" to prioritise the RD/WR?
driskollt1
131 Posts
1
April 27th, 2010 08:00
That's odd. It should be writing to cache and reading from disk.
When you're writing data, are you dumping a bunch of data to the LUN at once? Are you causing your LUNs to forced-flush (i.e. write cache at 99% full)?
Is there a reason why you're using RAID6?
I'm not too familiar with TSM or AIX...
Typically, for a UNIX system I would:
Create 2 6+1 RAID5 RAID groups.
Create 1 LUN in each RAID group, each owned by a different SP.
Use a volume manager (VxVM, ASM, ZFS) to stripe the volumes on the host side.
You'd get the same capacity and less write overhead, plus the benefit of 2 SPs using write cache.
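The "same capacity and less write overhead" point can be checked with some rough arithmetic. This is a minimal sketch that assumes the commonly quoted small random write penalties of 4 back-end disk I/Os per host write for RAID 5 and 6 for RAID 6; the function names and the example IOPS figures are made up for illustration. Full-stripe writes from write cache cost less than this, so treat the result as a worst-case estimate.

```python
# Back-end disk I/Os per small random host write (commonly quoted figures).
WRITE_PENALTY = {"raid5": 4, "raid6": 6}
# Disks' worth of capacity spent on parity in each RAID group.
PARITY_DISKS = {"raid5": 1, "raid6": 2}


def data_disks(raid_type, disks_per_group, groups=1):
    """Number of disks' worth of usable capacity across all groups."""
    return (disks_per_group - PARITY_DISKS[raid_type]) * groups


def backend_iops(raid_type, host_read_iops, host_write_iops):
    """Disk I/Os generated by a host workload that misses cache entirely."""
    return host_read_iops + WRITE_PENALTY[raid_type] * host_write_iops


if __name__ == "__main__":
    # One 14-disk RAID 6 group versus two 6+1 RAID 5 groups.
    print(data_disks("raid6", 14))            # 12 data disks
    print(data_disks("raid5", 7, groups=2))   # 12 data disks
    # Hypothetical workload: 100 read IOPS and 100 write IOPS from the host.
    print(backend_iops("raid6", 100, 100))    # 700
    print(backend_iops("raid5", 100, 100))    # 500
```

Both layouts leave 12 disks' worth of data capacity out of 14 disks, but under these assumptions the RAID 6 group does more disk work for the same host writes, which leaves less headroom for concurrent reads.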
SKT2
2 Intern
•
1.3K Posts
0
April 28th, 2010 03:00
"Create 1 LUN in each RG, each owned by a different SP, then stripe at host level." What if I make a metaLUN (striped, to avoid host-side striping) instead? And should I keep both LUNs on the same SP or on different SPs to get better performance?
driskollt1
131 Posts
0
April 28th, 2010 06:00
UNIX volume managers are like a million times better than the host-side striping you get with Windows.
Plus a lot of UNIX file systems don't handle things like LUN migrations to larger LUNs very well. Typically I give space to UNIX servers in chunks (i.e. 532 GB R10 LUNs that UNIX admins stripe together; if they need 3 TB, I give them 6 532 GB LUNs).
I've had good luck with host-side striping using UNIX volume managers (Oracle ASM, VxVM, ZFS).
It's easier to manage for me and the UNIX admins as well. We migrated all of our data from some CX3-80s to some CX4-960s with 0 downtime.
Things I can do with host-side striping that you can't do with metaLUNs:
Stripe across both SPs - allows you to stripe and use both SPs (more write cache to use - also the potential to fill up both SPs' write cache, so be sure to allocate an appropriate spindle count)
Drop LUNs to reclaim space
Migrate to new storage without downtime - (I haven't looked at powerpath migration enabler, it might be able to do this now)
Some volume managers still don't handle larger than 2TB volumes - Host side striping is a necessity on some systems when you need large volumes.
Our UNIX servers are pretty beefy though, so the load that the volume manager puts on the servers is pretty minimal.
I would never do host-side striping on a Windows box either. I would use metaLUNs for that.
I would alternate the SPs owning the LUNs, i.e. vol1 - SPA, vol2 - SPB, vol3 - SPA, vol4 - SPB.
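The chunked allocation and alternating SP ownership described above can be sketched as a small planning helper. This is only an illustration: `plan_luns` and the `volN` naming are hypothetical, and it does not call any Navisphere or array API. It works out how many fixed-size LUNs cover a requested capacity and assigns owners round-robin.

```python
import math


def plan_luns(required_gb, lun_size_gb=532, sps=("SPA", "SPB")):
    """Return (name, size in GB, owning SP) for enough LUNs to cover required_gb."""
    count = math.ceil(required_gb / lun_size_gb)
    return [(f"vol{i + 1}", lun_size_gb, sps[i % len(sps)]) for i in range(count)]


if __name__ == "__main__":
    # 3 TB requested, taken here as 3072 GB.
    for name, size_gb, owner in plan_luns(3072):
        print(name, size_gb, owner)
```

For a 3 TB request this yields six 532 GB LUNs, three owned by each SP, which the host volume manager would then stripe together.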
kelleg
6 Operator
•
4.5K Posts
0
April 29th, 2010 07:00
Pal,
I noticed that this case was closed without a resolution - do you still want this to be investigated? Did you find the problem? Is it fixed?
glen