I received a reply from Dell support, and I am very dissatisfied with their response:
[quote user="Dell Support Technician"]
Hi Erik,
Thanks for sending the information.
Single controller configuration is for performance, dual for fault tolerance.
The solution is to lower fault tolerance of the dual configuration.
1. Power off the chassis.
2. Remove the 2nd PERC card.
3. Optionally remove the 2nd expander and re-cable (MB SAS1A <--> UP EXP A, MB SAS1B <--> UP EXP B).
4. Run the following command on the CMC console: racadm raid revertononfaulttolerant:RAID.ChassisIntegrated.1-1
5. Power on the chassis.
Let me know if this solved your problem. You have to evaluate what is more important performance or fault tolerance.
Kind regards,
[removed personal info]
Pro Support Engineer, Dell | Enterprise Support Services, Dutch PowerEdge Servers
[/quote]
Do you happen to know if this downgrade procedure that Dell Support outlines will keep data intact, or will I have to move all my data off and re-create the virtual drives?
I have done several upgrades and downgrades already, and so far all the data has stayed intact. It never hurts to make a backup, just to be sure.
So I did not need to recreate the virtual disks (VDs). The only thing you might need to do is change the write caching policy on the VDs from "write through" to "write back".
Erik Nettekoven
21 Posts
0
August 25th, 2014 16:00
Finally Dell Support budged. This is the latest update from support:
--
Write Back Cache Enablement
We are actively driving a plan to deliver a Write-Back option (via firmware) for Dual PERC configurations. We are working this plan with great urgency and will have it implemented as soon as we are able complete our development and Enterprise validation of the firmware.
--
Erik Nettekoven
21 Posts
0
November 7th, 2014 02:00
As it turns out, it was a matter of a firmware update after all! The release notes for the latest firmware, found here (or here), state:
Fixes & Enhancements
Fixes:
- Fixed an issue in which Virtual Disks may become inaccessible when migrated to another Power Edge VRTX.
- Fixed an issue in which Virtual Disk to blade server mapping may be deleted after a failover.
- Fixed an issue in which a cable failure in large configurations could cause the Shared PERC8 to fault.
- Fixed an issue in which a redundant VD may be unable to use the "Copy Back" feature after disabling the second Shared PERC8
- Fixed an issue in which I/O traffic wasn't always sent using the "Fast Path" feature.
- Improved Battery Back Up (BBU) messaging in the storage controller logs.
- Fixed an issue in which the rebuild of a redundant virtual disk starts over after a controller failover.
Enhancements:
- VDs can now be configured with write back caching in dual Shared PERC8 configurations
- Increased hard drive predictive failure polling interval to 5 minutes.
- Firmware TTY logs persist across controller and chassis resets.
I will implement the new firmware and test it right away.
Daniel My
10 Elder
•
6.2K Posts
1
June 10th, 2014 08:00
Hello Erik
Thanks for the great write-up!
Here are a few things to consider with this:
#1 Write cache has a HUGE performance impact on RAID arrays that use parity, HUGE difference. You should not run a parity array without write caching. Your speeds of 70MB/s on a RAID 6 without write caching are extremely good.
#2 Write caching is disabled when running dual PERCs because the functionality is not present to match the write cache. Our storage appliances that run dual PERC failover configurations have the capability to match the cache. The passive PERC runs a mirror image of the active PERC's cache so that it can take over without data loss in the event of controller failure. The VRTX is not a full-featured storage appliance.
The use of dual PERCs on a VRTX will be dependent on the needs of the individual customer. I think it is great that you made this write-up to point out that write caching is not available in this configuration. Someone running a RAID 10 or other non-parity array will not suffer such a HUGE performance impact due to write caching being disabled, so some configurations run very well with dual PERCs.
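As a rough illustration of why parity arrays are hit harder when writes cannot be cached, the commonly cited back-end I/O penalties per small random write (2 for RAID 10, 4 for RAID 5, 6 for RAID 6) can be plugged into a back-of-the-envelope estimate. This is only a sketch: the disk count and the per-disk IOPS figure are assumptions, not measurements from this thread, and the function name is hypothetical.

```python
# Back-of-the-envelope estimate of random-write IOPS without write-back cache.
# Assumptions (not from this thread): 12 disks, ~80 random IOPS per disk.

# Commonly cited number of back-end I/Os needed per small random host write.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}


def random_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Raw back-end IOPS of all disks divided by the RAID write penalty."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    for level in WRITE_PENALTY:
        estimate = random_write_iops(level, disks=12, iops_per_disk=80)
        print(f"{level}: ~{estimate:.0f} random-write IOPS")
```

The model ignores sequential workloads, stripe alignment, and controller overhead, so it only shows the relative ordering of the RAID levels; the benchmarks posted later in this thread suggest that RAID 10 in write-through mode can still be slow in practice.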
Thanks
Erik Nettekoven
21 Posts
0
June 10th, 2014 08:00
Hi Daniel.
Thank you for the clarification! It leads me to the following question(s):
Q: In point #2 you say "because the functionality is not present to match the write cache". Will this functionality ever become available, or is it impossible given the way caching works?
I will test and benchmark a RAID10 setup sometime this week. The big downside of RAID10 is that we lose a lot of storage capacity: we have 21TB in RAID10 versus 36TB in RAID6 (a 15TB loss). Note: our VRTX is completely filled with 3.64TB disks.
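The capacity figures above can be reproduced with a short calculation. This is a minimal sketch that assumes 12 disks; the disk count is not stated in the post and is inferred here from the 36TB and 21TB figures. The helper name is hypothetical.

```python
# Usable capacity for RAID 6 vs RAID 10.
# Assumption: 12 disks (inferred from the quoted capacities, not stated in the post).

DISK_TB = 3.64  # formatted size per disk, as given in the post


def usable_tb(level: str, disks: int, disk_tb: float = DISK_TB) -> float:
    """Return the usable capacity in TB for the given RAID level."""
    if level == "RAID6":
        return (disks - 2) * disk_tb  # two disks' worth of capacity go to parity
    if level == "RAID10":
        return disks // 2 * disk_tb   # every disk has a mirror partner
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    raid6 = usable_tb("RAID6", 12)
    raid10 = usable_tb("RAID10", 12)
    print(f"RAID6:      {raid6:.2f} TB")
    print(f"RAID10:     {raid10:.2f} TB")
    print(f"Difference: {raid6 - raid10:.2f} TB")
```

Under that assumption the result is about 36.4 TB for RAID6 and about 21.8 TB for RAID10, which is in line with the rounded figures quoted above.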
If RAID6 is not the best option (or is actually the worst) to use with the dual SPERC8 controller, it would be very nice if Dell published a recommendation or best practice for the dual SPERC8 setup. I have been struggling with this issue for months, time that could have been saved had this been written down somewhere. (I realize that the dual SPERC8 setup is still pretty new, so giving new customers/users a heads-up would be very helpful.)
So for anyone who comes across this post:
* If you have a VRTX with a dual SPERC8 and want to keep the fault tolerance, do not use RAID6 for the storage; go for RAID10 instead (take the storage capacity loss into account!).
* If fault tolerance with the SPERC8 is not that important to you and you want to use RAID6, downgrade the VRTX to a single-controller SPERC8 config and use the write-back caching policy.
Q: I have currently found a way to downgrade to a single SPERC8 controller. Is this the best way, or do you know of an easier way to downgrade?
Regards,
Erik
Daniel My
10 Elder
•
6.2K Posts
0
June 10th, 2014 09:00
[quote user="Erik Nettekoven"]Q: In point #2 you say "because the functionality is not present to match the write cache". Will this functionality ever become available, or is it impossible given the way caching works?[/quote]
In short, I don't know. I doubt this type of functionality could be added with a firmware update, but I'm not a developer so I'm not sure of the engineering barriers to make it work. I'm not aware of any plans, and I'm not sure if it is possible.
[quote user="Erik Nettekoven"]Q: I have currently found a way to downgrade to a single SPERC8 controller. Is this the best way, or do you know of an easier way to downgrade?[/quote]
You should be able to run the latest firmware with a single Shared PERC8 and have write caching enabled. Your method works, but I'm curious whether back-flashing the firmware on the PERC is required. Some changes made with firmware updates are to the default configuration file. If you don't reset the PERC to defaults, then the changes will not be implemented. I would suggest trying to reset the PERC to defaults prior to back-flashing to see if that will allow the write-cache option to become available. You can do this in the CMC: Storage>Controllers>Troubleshooting>Actions drop-down.
Thanks
ThiPham
2 Posts
0
June 10th, 2014 11:00
I am facing exactly the same issue.
New Dell VRTX + 3 M620 running vSphere ESXi 5.5
VD as RAID-5 on dual SPERC8 controllers. Write Policy is Write Through (only)
I made 2 images of the same VM, one on the local HDD (of an M620) and one on VD storage.
Write performance of the VM on VD storage is very low.
Backing up or cloning VMs is now very slow.
Do you have the test results with RAID-10 yet?
Should I contact local Dell support? I am not sure they have the answer. I feel so bad now.
Thanks
Erik Nettekoven
21 Posts
0
June 11th, 2014 01:00
Hi Daniel (and ThiPham),
In your first reply you said the following: "Someone running a RAID 10 or other non-parity array will not suffer such a HUGE performance impact due to write caching being disabled, so some configurations run very well with dual PERCs."
Note: by my understanding "write through" is not equal to write caching being disabled, but it is just another (much slower) way of write caching.
Of course I wanted to test your claim, and I hate to say it, but the results are just as bad (in my opinion, and measured against the expectations set by the Dell sales department). So I am really wondering what kind of configuration actually is suitable for a dual PERC setup. Currently it is a very bad choice for any virtual environment, as that is a very disk-intensive workload.
The results:

Taking all the results into account, I am rephrasing my advice:
If you are planning to run any virtual environment on a VRTX, do not go for the dual SPERC8 setup; stick with the single SPERC8 instead. The performance hit caused by the currently mandatory caching policy (Write Through) is far too big to run a virtual environment smoothly, no matter which RAID config you choose.
@ThiPham
To get a real performance gain, I would advise you to downgrade your VRTX to a single PERC.
I am still figuring out the best way to downgrade from dual to single. The procedure described in my first post does work, but I am wondering if Daniel is right, meaning that I only have to remove the controller + expander, re-cable, and then reset the single controller, and that I do not need to downgrade the firmware.
And yes, please do contact Dell support! I have done this as well. I think the more people "complain" about this, the more seriously they will take it.
Daniel My
10 Elder
•
6.2K Posts
0
June 11th, 2014 11:00
Erik
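[quote user="Erik Nettekoven"]Note: by my understanding "write through" is not equal to write caching being disabled, but it is just another (much slower) way of write caching.[/quote]
That is not correct. Write through and write caching being disabled are the same. The controller will always run data through the cache, but with caching disabled or write through, which are the same thing, it will not queue anything in cache. It basically has a queue depth of 0 with no cache.
[quote user="Erik Nettekoven"]And yes, please do contact Dell support! I have done this as well. I think the more people "complain" about this, the more seriously they will take it.[/quote]
There is not an issue with the hardware. This is just an issue of understanding how pivotal of a role cache plays in RAID. Your write speeds with cache enabled are almost 2500 MB/sec. Your HDDs are not capable of writing that fast. Cache allows you to achieve speeds that the drives are not capable of. All you are showing in the above tests is how much effect cache has on RAID performance.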
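The difference between the two policies can be sketched with a toy model of when the host receives its write acknowledgement. This is not how the PERC firmware is implemented; the latency constants and function names are illustrative assumptions, and the model ignores cache exhaustion, queue depth, and parity calculations.

```python
# Toy model of write-through vs write-back acknowledgement latency.
# All latencies are illustrative assumptions, not measurements.

CACHE_WRITE_MS = 0.05  # assumed time to land a write in controller cache
DISK_WRITE_MS = 8.0    # assumed time for a write to be committed to spinning disks


def ack_latency_ms(policy: str) -> float:
    """Time until the host receives the acknowledgement for one write."""
    if policy == "write-back":
        # Acknowledged as soon as the data is in cache; flushed to disk later.
        return CACHE_WRITE_MS
    if policy == "write-through":
        # Data passes through the cache, but the acknowledgement waits for the disks.
        return CACHE_WRITE_MS + DISK_WRITE_MS
    raise ValueError(f"unknown policy: {policy}")


def total_ack_time_ms(policy: str, writes: int) -> float:
    """Host-visible time for a burst of serialized writes that fits in cache."""
    return writes * ack_latency_ms(policy)


if __name__ == "__main__":
    for policy in ("write-back", "write-through"):
        total = total_ack_time_ms(policy, writes=1000)
        print(f"{policy}: {total:.0f} ms for 1000 writes")
```

With write-back, the host only waits for the cache, which is why a short benchmark can report throughput well above what the disks can sustain; with write-through, every write is bounded by the disks themselves.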
I have run your benchmark on an R720XD with an H710P controller to show you that this is not a hardware or firmware issue. This is simply how RAID works. My test results are not as good as yours, so I suspect you are running better HDDs than me. I have 7200 RPM SATA drives that I am testing with. This is my result with all caching disabled on a different controller in a different system on a 4 drive RAID 10:
ThiPham
2 Posts
0
June 11th, 2014 23:00
Hi Erik,
Many thanks for your advice.
I have just contacted Dell support and am waiting for their reply.
Some test results from our VRTX system, for reference:
Erik Nettekoven
21 Posts
0
June 12th, 2014 01:00
[quote user="Daniel My"]That is not correct. Write through and write caching being disabled are the same. The controller will always run data through the cache, but with caching disabled or write through, which are the same thing, it will not queue anything in cache. It basically has a queue depth of 0 with no cache.[/quote]
Thanks for clearing that up :)
[quote user="Daniel My"]There is not an issue with the hardware. This is just an issue of understanding how pivotal of a role cache plays in RAID. Your write speeds with cache enabled are almost 2500 MB/sec. Your HDDs are not capable of writing that fast. Cache allows you to achieve speeds that the drives are not capable of. All you are showing in the above tests is how much effect cache has on RAID performance.
I have run your benchmark on an R720XD with an H710P controller to show you that this is not a hardware or firmware issue. This is simply how RAID works.[/quote]
Regarding "there is not an issue with the hardware", yes and no.
Yes: in the sense of "that is how RAID works", I agree that it is indeed not an issue with the hardware.
No: in the dual SPERC8 setup in the VRTX, which is also a hardware setup, you are bound to "write through" or "caching disabled". In my opinion this is a very poor choice, because of the huge impact on write performance. It is in no way a suitable setup for a virtual environment, although it was sold to us by Dell as "being a very suitable solution" for a virtual environment. The problems mentioned in my starting post and the benchmark results prove my "poor choice" statement.
If this "write through" caching policy is the only technically possible option with dual SPERC controllers, making it unfeasible to run a virtual environment, then I feel like I have been more or less scammed by Dell. Or at least Dell didn't live up to the expectations that they gave me.
So for now I stick with my advice/statement:
If you plan to run any virtual environment on a VRTX and its internal storage, stick with a single SPERC setup and stay far away from the dual SPERC setup.
Furthermore I am still wondering: what kind of environment is actually suitable for this Dual SPERC setup?
Erik Nettekoven
21 Posts
0
June 12th, 2014 03:00
I have contacted Dell Support and they recognize the performance problem with the dual PERC controller setup. According to Dell support, their specialists are currently looking into this performance issue.
Furthermore, support will send me an email with troubleshooting steps to improve the performance of the dual PERC setup and/or to find out if "write through" is the actual cause of this problem.
So everybody experiencing this problem, please DO contact support!
Erik Nettekoven
21 Posts
0
June 12th, 2014 08:00
I received a reply from Dell support, and I am very dissatisfied with their response:
I can't believe they actually said this: "You have to evaluate what is more important performance or fault tolerance." If I choose fault tolerance, the virtual environment does not perform at all and I am not able to create any backups. I think I prefer a backup over fault tolerance. And thank you, Dell, for selling something that is useless at the moment...
I am still talking to Dell about this ridiculous solution. I really hope that they come up with something better than this!
NorbyTheGeek
2 Posts
0
June 12th, 2014 10:00
[quote user="Erik Nettekoven"]I am still talking to Dell about this ridiculous solution. I really hope that they come up with something better than this![/quote]
Erik Nettekoven
21 Posts
0
June 13th, 2014 01:00
With all the downgrades (and upgrades) I have done, my experience is that all the data and VDs stay intact. But it certainly doesn't hurt to make a backup first, just to be sure. I made a copy of the VMs and other data to different storage (an MD3200).