borninpa

40 Posts

17247

January 26th, 2011 05:00

Celerra Performance problems - slow write fast read

I work with 2 other offices in my company and we all have new Celerras (purchased within the last 6-9 months). We have all upgraded to Unisphere. We are noticing that our Celerras are not writing data very fast. I configured my own Celerra using AVM. I have Fibre Channel and SATA drives. The problem is across the board on all CIFS shares that we have. I thought my Celerra was faster last fall, before Unisphere, but I am not certain.

Here is the issue. Read performance is excellent but writes are slow. We use a lot of large files, so we took a couple of them to test. When we copy a 1GB SAS or Access file from the Celerra to our PC (we have 1Gb to the desktop and use fast drives), we can copy it at over 100MB/s. It takes about 10 seconds to copy down. However, when we try to copy files up to the Celerra, it takes minutes and goes at about 8-10MB/s (BTW, the size of the files makes no difference; we just used the larger ones for testing).
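To put those numbers side by side, here is a minimal Python sketch of the transfer-time arithmetic. The function name `transfer_seconds` and the choice of 1 GB = 1024 MB are assumptions made for illustration; the rates are the ones quoted above.

```python
def transfer_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to move size_mb megabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_s


FILE_MB = 1024  # assumption: the 1GB test file is 1024 MB

if __name__ == "__main__":
    # Rates taken from the post: >100 MB/s reads, 8-10 MB/s writes.
    for label, rate in [("read at 100 MB/s", 100),
                        ("write at 10 MB/s", 10),
                        ("write at 8 MB/s", 8)]:
        print(f"{label}: {transfer_seconds(FILE_MB, rate):.1f} s")
```

At the observed write rates the same file needs roughly 100 to 130 seconds, which is about ten times or more the read time.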

I am using a 10Gb connection from the Celerra to ProCurve switches. One office uses Extreme switches (I think) and they have the exact same issue. However, I have also tested using a single 1Gb connection to the Celerra and have the same problem. Write cache on the backend is enabled on my NS120 with the value 566. I would assume it is also enabled on my colleagues' NS480 Celerras.

When we copy the same files to our Windows servers with local SAS storage or with older Clariion storage, the performance (both read and write) is fine (over 100MB/s). All of these boxes use the same switches. None of our offices are very big. My office only has about 80 users.

I have run a lot of performance tests on the backend (using knowledge from last year's EMC World) and nothing stood out. The EMC technician ran the following and noted the Ierrors and retransmitted packets. However, those numbers have not changed over the past couple of weeks of testing; the 11 Ierrors were already there. Also, there are no errors on the cge1 connection, which is where I have been conducting additional testing (so I could isolate the connection and test things like duplex and so forth).

Any suggestions on other ways I can test to see what would cause such slow write speeds? Is anyone else seeing slow performance like this with a new Celerra running Unisphere? We have now tested on three Celerras and they all exhibit the same issue!

Name  Mtu   Ibytes      Ierror  Obytes      Oerror  PhysAddr
****************************************************************************
fxg0  9000  3016360536  11      2709827918  0       0:60:16:32:56:46
fxg1  9000  1237764640  0       0           0       0:60:16:32:56:47
mge0  9000  2762780894  0       4205729750  0       0:60:16:40:ee:1
mge1  9000  331775952   0       52356319    0       0:60:16:40:ed:ed
cge0  9000  1964674952  0       1408110665  0       0:60:16:2b:5c:96
cge1  9000  610079448   0       2747930285  0       0:60:16:2b:5c:97

tcp:

****

2022297234 packets sent

66962 data packets retransmitted

0 resets

3932824347 packets received

210284 connection requests

124 connections lingered
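
As a rough sanity check on the tcp counters above, the cumulative retransmission ratio can be worked out directly. This is only a sketch: the helper name `retransmit_percent` is made up for illustration, and the result is an average over the whole life of the counters, so it cannot show short bursts of retransmits.

```python
def retransmit_percent(packets_sent: int, packets_retransmitted: int) -> float:
    """Retransmitted packets as a percentage of all packets sent."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return 100.0 * packets_retransmitted / packets_sent


if __name__ == "__main__":
    # Counter values copied from the tcp statistics in the post.
    pct = retransmit_percent(2022297234, 66962)
    print(f"cumulative retransmit rate: {pct:.4f}%")
```

With these counters the cumulative figure comes out at a few thousandths of a percent.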

Responses(25)
Solutions(0)

borninpa

40 Posts

0

July 12th, 2012 09:00

If you have to have checkpoints, there is nothing you can do. Otherwise, delete all of your checkpoints. Also, if you are running Celerra Replicator, it creates some hidden checkpoints, which will slow down performance somewhat as well. We finally gave up trying to do anything about this and are looking forward to leaving the Celerra platform in a couple of years.

Paul Shane | Systems Administrator | paul.shane@milliman.com

Milliman | 1550 Liberty Ridge Drive, Suite 200 | Wayne, PA 19087-5572 | USA

Tel +1 610 975 8012 | Fax +1 610 687 4236 | Mobile +1 610 389 5088 | milliman.com

Rainer_EMC

6 Operator

•

8.6K Posts

0

July 12th, 2012 09:00

Are you saying that, with the latency of a CIFS connection from the UK to the US, you notice an impact from checkpoints?

Sathish Dodda

1 Rookie

•

121 Posts

0

July 12th, 2012 09:00

OK, thanks a lot for your information.

I will work with my teammates and will update you. Thanks again for the information.

Rainer_EMC

6 Operator

•

8.6K Posts

0

July 12th, 2012 09:00

If you are copying from the UK to a CIFS share in the US, then most likely it's not the checkpoints that are limiting your performance.

I suggest you try the same copy to a Windows server over the same distance, or copy to a VNX file system without checkpoints.

borninpa

40 Posts

0

July 12th, 2012 09:00

Because of the way Celerras currently use copy-on-write checkpoints, write performance will drop significantly when you have checkpoints. It was an unadvertised “feature” of these units. They have announced changes in their latest VNX code to change the way they do checkpoints.

Sathish Dodda

1 Rookie

•

121 Posts

0

July 12th, 2012 09:00

Thanks for your response.

Yes, we do have checkpoints for file systems.

We have two NAS boxes in our environment: UK and US.

1) When I copy data from the UK Citrix server to the US NAS share, it takes a long time.

Sathish Dodda

1 Rookie

•

121 Posts

0

July 12th, 2012 09:00

Thanks,

So what do I have to do now to get better performance? Kindly help me with this.

We are using one Celerra NS80g with Clariion CX3-80 backend storage and another Celerra NS80g with Symmetrix VMAX backend storage.

E-rickV

3 Posts

0

July 30th, 2012 06:00

For anyone who's interested, here is a similar story. We have been struggling with EMC support for 2 years. The main solution that EMC gave us was to delete all 60 checkpoints and create new schedules for the filesystems. Since then we have followed a lot of steps to improve filesystem/system performance, because even at this point it still isn't what we expected.

Before: filesystems 5 - 7 MB/s write performance

- Filesystems with 60 checkpoints, 1 schedule, every day.

- Filesystems without checkpoints, 20 - 50 MB/s write performance

The steps taken:

- Divided a 6 TB filesystem into 6 smaller 1 TB filesystems; EMC recommends a max. size of 2 TB.

- Went from 60 checkpoints to 14 checkpoints. An EMC NAS expert said that write speed issues can occur when using more than 16 checkpoints for one filesystem, especially for filesystems that contain a lot of constantly changing files, for instance profile data.

- Updated the DART code twice. The current code is 6.0.41-3 (we are planning to update to 6.0.41-4).

- We have installed a retransmission tool to calculate retransmits. As the rates appear to be questionable, we are investigating our network. But we still suspect the Celerra is underperforming because of the number of checkpoints.

- We have 2 sites: site 1 is our primary CIFS/NFS NAS, site 2 is used for backup purposes. Site 2 has 1 checkpoint per filesystem and has write speeds between 25 - 50 MB/s.

- There is no difference in write speeds using NFS or CIFS.

- Bought 2 extra datamovers to divide the workload (also because of the maximum TB supported per datamover).

- Our network specialists and EMC have combined their teams to see what causes the retransmits. EMC says the retransmits might be caused on the network; the network specialists say the retransmits are caused within the datamovers. In the end, nobody knew where to look for a solution, and now we have also accepted the slow writes.

retransmission_stats]$ ./retransmission_stats.sh server_5

#########################################################
# Consider the following table for recommended values : #
# Above .1%                ->    Too high               #
# Between .01% and .1%     ->    Questionable           #
# Below .01%               ->    OK                     #
#########################################################

07/30/12 11:39:40 packet_sent=1100533848 packets_retransmitted=164825596 rate=0.049772%
07/30/12 11:40:15 packet_sent=1100707990 packets_retransmitted=164825632 rate=0.020673%
07/30/12 11:40:51 packet_sent=1100813751 packets_retransmitted=164825709 rate=0.072806%
07/30/12 11:41:23 packet_sent=1100919356 packets_retransmitted=164825755 rate=0.043559%
07/30/12 11:42:10 packet_sent=1100994037 packets_retransmitted=164825805 rate=0.066951%

- We are planning to replace all network cables connected to our datamovers shortly, to see if that makes a difference.
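
For anyone reading the retransmission_stats.sh output above: the printed rate appears to be computed per interval, that is, from the change in both counters between two consecutive samples, not from the cumulative totals. The Python sketch below reproduces the last four printed rates from the counter values shown and applies the thresholds from the table. It is a reconstruction from the output only, not the script itself; the names `interval_rates` and `classify` are made up for illustration, and the first printed rate cannot be reproduced because the sample before it is not shown.

```python
# (timestamp, packet_sent, packets_retransmitted) copied from the output above.
SAMPLES = [
    ("11:39:40", 1100533848, 164825596),
    ("11:40:15", 1100707990, 164825632),
    ("11:40:51", 1100813751, 164825709),
    ("11:41:23", 1100919356, 164825755),
    ("11:42:10", 1100994037, 164825805),
]


def interval_rates(samples):
    """Retransmit percentage for each pair of consecutive samples."""
    rates = []
    for (_, sent0, retx0), (stamp, sent1, retx1) in zip(samples, samples[1:]):
        delta_sent = sent1 - sent0
        if delta_sent <= 0:
            continue  # no traffic (or a counter reset) in this interval
        rates.append((stamp, 100.0 * (retx1 - retx0) / delta_sent))
    return rates


def classify(rate_percent):
    """Thresholds taken from the table printed by the tool."""
    if rate_percent > 0.1:
        return "Too high"
    if rate_percent >= 0.01:
        return "Questionable"
    return "OK"


if __name__ == "__main__":
    for stamp, rate in interval_rates(SAMPLES):
        print(f"{stamp} rate={rate:.6f}% -> {classify(rate)}")
```

All four reproducible intervals land in the "Questionable" band, which matches what we have been seeing.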

All write tests were done with the same 1GB ISO file.

After: filesystems 8 - 15 MB/s write performance

- Filesystems with 14 checkpoints: 2 schedules, 1 daily and 1 weekly

- Filesystems without checkpoints, 20 - 50 MB/s write performance

Does anyone know if it is possible to reset the "server_netstat server_x -i" error counters?

Moonigan

14 Posts

0

October 11th, 2012 00:00

If you are experiencing poor CIFS write performance, please have a look at this thread. We suffered with this issue for almost 3 years, and the fix suggested in this thread transformed the write performance.

https://community.emc.com/message/657620#657620

Regards

Paul

Rainer_EMC

6 Operator

•

8.6K Posts

0

October 11th, 2012 02:00

I think the big speed increase you got with fastRTO was specific to your network setup.