Celerra checkpoint performance impact

Question

Have a new high density NS960 running 6.0.42-3. When I take a snapshot of a volume the write performance drops by ~40% on writes to a completely new file (not overwriting which would have to copy tracks to savvol) going from 93MB/sec to 56MB/sec.  The Celerra has hardly anything on it, the backend CX4 dirty cache is 4% (not 40%, 4%), and the datamover I'm on has no other filesystems on it, so I can pretty much guarantee I'm not running into backend spindle contention or CPU constraints on the datamover.   No dedupe, no virtual provisioning, and no auto extend on the filesystem.  This 40% impact is way more than any documentation I've been able to find, especially on a system that is basically idle. No checkpoint #dd if=/dev/zero of=/mnt/testthis.nosnap bs=1M count=10001048576000 bytes (1.0 GB) copied, 11.2573 seconds, 93.1 MB/s With checkpoint # dd if=/dev/zero of=/mnt/testthis.snap bs=1M count=10001048576000 bytes (1.0 GB) copied, 18.5615 seconds, 56.5 MB/s Checkpoint creation command $ nas_ckpt_schedule -create tcpdump_3 -filesystem tcpdump_3 -recurrence once -start_on 2011-08-17 -runtimes 11:09

avs · Answer

Hello

After checkpoint creation any write to new location lead to copy on first write activity.

Take a look at SnapSure document:

https://powerlink.emc.com/nsepn/webapps/btg548664833igtcuup4826/km/live1/en_US/Offering_Technical/Technical_Documentation/300-011-857.pdf?mtcs=ZXZlbnRUeXBlPUttQ2xpY2tDb250ZW50RXZlbnQsZG9jdW1lbnRJZD0wOTAxNDA2NjgwNTg4MTlmLG5hdmVOb2RlPVNvZndhcmVEb3dubG9hZHMtMw__

To reduce that negative performance impact you should create SaveVol on other pool/metavol (other CLARiiON HDDs) than file system .

In your CLI command you didn't explicitly create SaveVol and SaveVol was created automaticly in same pool as FS.

Best regards,

Alex

InsaneGeek · Answer

Thanks for the response, much apprecieated; but I still wonder if something else is going on. I've got a case open with EMC but trying to find out any info while I'm waiting.

I created it in a different pool but even had no noticeable improvement. I didn't mention it before but that avm pool from my view should really have sufficient I/O capacity as it has over 700 spindles in it, and from analyzer the total number of iops going to the SP's is less than 350 (read + write) so it's a fairly large system with hardly anything going on. I might be wrong but it sure feels like I should not be limited by backend capacity of the CX4, as analyzer says lun & sp: %utilized, queue length, response time, service time are all single digits numbers.

Also not sure on whether or not it actually takes space in the savvol. My read of the snapsure document is that when a checkpoint it taken, it tracks all the blocks of current files and makes a bit map of them. Since I'm creating a different file (not overwriting) that bit map shouldn't contain any blocks in the checkpoint (possibly a few directory inodes timestamps). I believe page 18 of the VNX guide confirms this (also verified in the Celerra guide) as it states:

"The first time a change instruction from a PFS application is detected, SnapSure copies the original PFS block into the SavVol (unless the block is unused by a checkpoint), and then allows the instruction to execute on the PFS block"

Those new blocks should be unused by the checkpoint, and looking at the size of the checkpoint in the savvol I believe confirms this as the space consumed is only 128MB not 1GB (when I overwrite the original file it does shove those blocks into the savvol, which has a horrible 8x performance drop to 10MB from 80MB).

Details from another run using a different pool for the savvol

Before checkpoint

# dd if=/dev/zero of=/mnt/testthis.nosnap0 bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 11.9785 seconds, 87.5 MB/s

After checkpoint write to new file

# dd if=/dev/zero of=/mnt/testthis.snap2 bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 30.3396 seconds, 34.6 MB/s

After checkpoint overwrite original file

# dd if=/dev/zero of=/mnt/testthis.nosnap0 bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 98.769 seconds, 10.6 MB/s

After terminating the snapshot

[root@iteccspa201 /]# dd if=/dev/zero of=/mnt/testthis.nosnap0 bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 11.9686 seconds, 87.6 MB/s

$ fs_ckpt tcpdump_4 -Create -readonly y pool=clar_r5_performance

After taking the snapshot and creating a new file savvol is only 128MB

$ nas_fs -info tcpdump_4_ckpt1 -size

size = volume: total = 10240 avail = 10111 used = 129 ( 1% ) (sizes in MB) ckptfs: total = 10084 avail = 8082 used = 2002 ( 20% ) (sizes in MB) ( blockcount = 20971520 ) ckpt_usage_on_savvol: 128MB ( 1% )

After overwriting the file savvol is 1GB

$ nas_fs -info tcpdump_4_ckpt1 -size

size = volume: total = 10240 avail = 9215 used = 1025 ( 10% ) (sizes in MB) ckptfs: total = 10084 avail = 8082 used = 2002 ( 20% ) (sizes in MB) ( blockcount = 20971520 ) ckpt_usage_on_savvol: 1024MB ( 10% )

jwasilko · Answer

We're facing a similar performance problem. We migrated from a NS80 (running 5.6.50) to a VG8 backed by a vmax (running 7.0.14), and write performance with the same checkpoint schedule was terrible. At some point, the top talkers was showing 300-800ms write times.

We turned off all checkpoints (and deleted them) and performance got close to normal.

Apparently this is a known issue that was first reported on 8/4. Engineering is working on a fix, but none is available now.

Please reference SR 42573590 with support and see if your symtoms are similar.

jwasilko · Answer

Also, just wanted to confirm our perf numbers are similar to yours. We could do about 40-50MB/sec single stream writes before a checkpoint, and after it dropped to 8-9MB/sec.

96Chevyz71 · Answer

I have been fighting with this issue for 1.5 years now. I cannot seem to exceed 25 MB/Sec with checkpoints off and deleted. I can copy from the same server to other devices on my network at 100 MB/Sec so I would think the network is running fine. I have gone through complete SAN software upgrades and have well over 80 hours of my time spent trying to get an $80K SAN to work as well as a $12K Windows file server.

After disabling the checkpoints and removing all of the existing checkpoints, I was able to double my transfer rate from 1,423 Megabytes/Min. to 2,823 Megabytes/minute. What is really sad is I can achive 6,700 Megabytes/minute tranferring to a standard Windows Server. All of the above testing was done from a windows file server using Robocopy. I'd be interested to see what others get when doing similar testing.

Needless to say, this is ridiculous. I can't belive this was discovered on 8/14/2011 when I opened my first trouble ticket 1.5 years before then. I have since opened at least 5 trouble tickets and have yet to get this resolved.

Mosesn · Answer

See release notes for version 6.0.51.6 P/N 300-009-958

File Systems

Severity 1

Write performance degraded substantially on a file system with checkpoints compared to a file system without checkpoints. Performance degradation was caused by inefficiencies in checkpoint handling code path.

Fix summary

The code has been optimized to improve write performance on file systems with checkpoints.

downhill2 · Answer

I have been load testing a vnx5300 running 7.0.40-1 with equally poor results.

Setup:

1 16TB FS consisting of 1 stripe accross 6 4+1 R5 groups of 10k SAS drives.

Max write (even with multiple streams) is a pathetic 60MB/sec.no checkpoints, no replication.

I then ran the same test against our ns960 with a stripe of roughly 24 disks in R10, same results. Reading from this FS results in instant and continuous saturation of a Gb pipe and yields predictable 100+MB/sec. The 960 code is

Go figure. The 960 FS has no checkpoints. 6.0.41-4. If this is a bug, I wonder why those of us running the affected code have not been informed yet.

96Chevyz71 · Answer

I'm pretty sure they haven't reported this as a bug because it is not fixed yet. I have installed the update mentioned in the previous post, restarted the SAN, and noticed next to no performance increase. My results are consistent with yours. I get around 60 MB/Sec write speeds without any checkpoints. After the patch and creation of a single checkpoint, this drops to 23 MB/Sec. My Windows server are getting 120 MB/Sec with a single stream and 240 MB/Sec using dual streams between servers.

I did notice there was another patch released but I cannot find the release notes to see what was fixed.

I am seriously considering replacing it with an Equalogic unit when my maintenance is up next year. I have a long relationship with EMC but we are going on two years of abysmal performance and this is getting ridiculous.

96Chevyz71 · Answer

What is sad is I also have a SnapServer that is about 8 years old with SATA disks that does 50 MB/Sec without an issue. I am at a remote site here with a Dell PowerEdge T710 with teamed NICS that does 120 MB/Sec concurrently from two workstations. Yet this NS120 is a piece of you know what.

When you open the case, feel free to reference a few mine: 44548000, 42188908, & 42608448. Maybe they won't make you do all of the network testing and waste hours of your time when you know everything is fine going to anything other than the NS box.

I haven't tried MPFS. We are running standard CIFS for this implementation but I also have a couple of servers running MPIO to this box and they kick ass performance-wise.

downhill2 · Answer

Interesting. I’ll tell you, we have been using an NS40 with all SATA drives for 5 years as a “Backup” target. That thing will eat data up to 80MB/sec coming from exadata nodes over nfs. But, the 960 and our vnx fall short (so far). I’ll let you know if I break the current barrier but I am opening a case on it.

Say – have you tried MPFS? I have that on my plate as the next round of tests since we have the licenses, and those are not that extreme. Just a thought.

96Chevyz71 · Answer

Are there any updates regarding this issue?  My CX4 still sucks and EMC Support has been all but useless.

manisha_n · Answer

hi I want to create snapshot schedule with new Pool. Please share the exact command

dynamox · Answer

download Snapsure admin guide

Rainer_EMC · Answer

Either use the GUI Or lookup the nas_ckpt_schedule command in the 'Using VNX SnapSure' manual that you can get from support.emc.com

Celerra

Was this post helpful?