January 1st, 2021 14:00

VMware ScaleIO Gateway to a shared datastore

Please tell me: is it possible to migrate the VMware ScaleIO Gateway to a shared datastore?

I am asking about the case of an ESXi host failure when vSphere HA is enabled. As I understand it, this SVM is the only one not tied to RDM disks. It is currently stored on a local datastore, and I think that if the host fails, it will not be able to restart. Am I understanding the situation correctly, or am I wrong?

31 Posts

January 4th, 2021 16:00

At a high level, the process would be something like the following:

1. Reinstall ESXi on the failed host with a new SSD.
2. Add the ESXi host back into vCenter.
3. Re-run Add Node / Deploy SVM from the PowerFlex vCenter plugin. (This assumes an SVM template exists on one of the other hosts; you can keep multiple copies of the SVM template on different nodes.)
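
Once the new SVM is deployed and its SDS has rejoined the cluster, you can sanity-check the result from the MDM with scli; a rough sketch (the login details are placeholders for your environment):

# Run from the primary MDM (or any SVM with the scli client installed)
scli --login --username admin     # prompts for the MDM password
scli --query_cluster              # confirm the MDM cluster is healthy
scli --query_all_sds              # the rebuilt node's SDS should show as connected
scli --query_all                  # overall capacity plus any rebuild/rebalance in progress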

31 Posts

January 2nd, 2021 05:00

You can move it to shared storage if you'd like. Typically it doesn't require HA though, unless you are in an OpenStack environment, where you can also use HAProxy to make the gateway service redundant.

26 Posts

January 2nd, 2021 12:00

Thank you!!! Could you answer one question that I can't figure out? I have three hosts with 4 devices each, 13.1 TB in total. The system shows 4.2 TB available for creating volumes. If I create two thick volumes, I have to reduce them by 20 percent to stay within the system policy thresholds (80 percent and 100 percent), yet it allows me to create two thin volumes of 2.1 TB or more. What happens if I create two thin volumes of 2.1 TB each, make two datastores of the same size in vSphere, and fill them completely? What happens if one host fails? Will the system still have enough spare resources? I am completely confused here. Please tell me!

31 Posts

January 2nd, 2021 21:00

With 13.1TB raw capacity over 4 nodes, this should give you 4.9TB usable after deducting 25% as spare capacity for a single node failure. As long as your thick and thin Volumes do not exceed 4.9TB in allocated space, you will be guaranteed to be safe even if a node failure occurs. 
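
To make the arithmetic explicit: ScaleIO keeps two copies of every block, so usable capacity is roughly raw capacity minus spare, divided by two. A quick back-of-the-envelope check you can run in any shell with bc:

raw=13.1      # TB of raw capacity across all nodes
spare=0.25    # 25% reserved as spare to absorb a node failure
echo "$raw * (1 - $spare) / 2" | bc -l    # ~4.9 TB safely allocatable to volumes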

Also in regards to Thin Volumes, it is only on the first write that they are slower - after that they are equal in performance to Thick Volumes, so you may wish to consider using Thin Volumes for everything that you have.

26 Posts

January 4th, 2021 09:00

Thank you for your professional answers and advice! Let me ask a question that is important to me. I currently have three hosts, and on each host the SVM is located on an SSD disk. The SSDs are not in RAID; of course it would have been good to put them in RAID 1 from the start, but the budget did not allow it. My question is: if the SSD disk holding the SVM fails, what should I do? Obviously I would replace the disk, but how do I create the SVM on it again? I do not know how to proceed in this case. Please tell me how to act in the event of such a failure.

26 Posts

January 4th, 2021 19:00

Thank you for your answer! Let me ask you one more question regarding volume allocation. Let me describe the system: there are three Dell R440 servers, each with four 1 TB SAS disks and one 480 GB SSD. My task is to create two thick volumes of the maximum safe size and two corresponding datastores; this is required for high availability in VMware vSphere (I am using version 6.5 U3). Here is the current physical capacity:

vx1.png

The system shows 1.6 TB available for allocation.

Also, in the Volumes section:

vx3.png

But there is a policy:

vx2.png

It will not allow me, for example, to create another volume of 1.6 TB; I would have to reduce it by 20 percent so as not to exceed the policy limit. Let's say I create a third volume reduced by 25 percent. Then there is still unused space left over. Does the system need it? Is that what it is reserved for? Why can't all of the free space be used? What is the reason?

But the task is as I wrote at the beginning: two thick volumes are needed. Please tell me what size they should be, given my situation. It is also still not clear to me why unused space has to remain, and why it is limited to 80 percent. After all, the system already has spare space in case of a failure. For thin volumes the logic is clear, since they can grow and exceed the limit, but for thick volumes, why can't all of the available space be used? Thank you!

31 Posts

January 4th, 2021 21:00

It is safe to exceed the pre-defined policy limits as long as you have already allocated enough spare capacity to handle a node failure (in your case, 25% is the minimum spare capacity you should set; it appears, though, that you currently have it set to 35%).

You can therefore ignore the capacity warning alerts, or simply set them much higher (98% and 99%). 

With the 13.1TB raw capacity you have (900GB / 838.19GiB SAS * 4 drives * 4 nodes), after deducting the 25% spare (3.275 TiB) you have 9.825 TiB raw remaining, which means you can allocate up to 4.9125 TiB of Volumes safely. This means you can have 3x 1.6375 TiB of Thick Volumes if you wish. 

The only thing I would also mention is that if a node fails you will be safe; however, you will want to get it repaired ASAP, because if another drive on another node then fails, you will be in trouble. To protect against this, you may wish to slightly increase your spare capacity to factor in one more drive's worth of spare capacity (which would make your new spare setting 31-32%).

With 32% spare capacity, you would therefore have roughly 9 TiB of raw capacity remaining, i.e. 4.5 TiB usable, which gives you 3 volumes of 1.5 TiB each.
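
As a rough check of those numbers (same two-copy assumption as before, figures rounded):

raw=13.1      # TiB of raw capacity
spare=0.32    # 32% spare: one node plus roughly one extra drive
echo "$raw * (1 - $spare)" | bc -l        # ~8.9 TiB raw remaining after spare
echo "$raw * (1 - $spare) / 2" | bc -l    # ~4.45 TiB usable, i.e. ~1.5 TiB x 3 volumes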

Hope this helps. 

 

26 Posts

January 4th, 2021 23:00

Thank you very much! Sorry, I did not give you the exact data in the picture. Before that I had added a fourth disk to each host (there were three before), and a rebalance is running, so the picture is already a little different. The question you answered remains relevant; it is just that, because of the rebalancing, the "available capacity" parameter misled you. Here is the full picture now. Please excuse me!

vx4.png

Please comment. The spare capacity percentage in the policy was originally 35; I did not change it.

vx55.png

31 Posts

January 5th, 2021 01:00

From here you can reduce your spare capacity from 35% down to 32%, which is quite safe and will allow you to survive one node failure, and then one disk failure, and still have your data fully protected.

If you wish to increase the warning and critical levels higher, that is also up to you and of no real concern from a PowerFlex / ScaleIO perspective. The only thing that can occur when there is no free space left is that rebalancing may not run if you exceed the critical capacity threshold that you have set - https://www.dell.com/support/kbdoc/en-us/000167034/scaleio-doesn-t-rebalance-at-critical-capacity-threshold
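
If you prefer the CLI to the GUI for this, the spare percentage can also be changed with scli from the MDM; a sketch from memory, with the protection domain and storage pool names as placeholders for your own:

scli --login --username admin
scli --modify_spare_policy --protection_domain_name pd1 --storage_pool_name sp1 --spare_percentage 32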

 

26 Posts

January 5th, 2021 15:00

Thank you very much! Following your recommendations, I made the following settings and volumes; please see the images. Tell me, is my system now safe if one node fails? I do not quite understand what excess you are talking about specifically. If you mean the critical capacity threshold, I set the thresholds to 89 and 99 and did not exceed them. Please see whether I did this safely or not. I'm very worried!

flex1.png flex2.png flex3.png flex4.png

26 Posts

January 5th, 2021 17:00

Thank you, matt_hobbs, for your help! You helped me a lot and explained a lot. However, since this is a production environment, I will revert to the default system settings, i.e. policies of 80 and 90, and re-create the volumes with a smaller size. Thanks for the information on how to test! I will try all of this in a test environment later, and if everything works out, then I will do it in production. I don't want to take risks now! Thank you! Merry Christmas! I wish you well-being!

31 Posts

January 5th, 2021 17:00

Looks like you are good to go now. You should do a quick test of taking a node down gracefully (enter maintenance mode at the ScaleIO level on an SDS), and also in a not-so-graceful way (SSH into an SVM and run 'systemctl stop sds.service', which will trigger a rebuild, then 'systemctl start sds.service' to bring it back).

As long as you can confirm that any test VMs are unaffected during these events, you will be good to go. If this is for a production system, please also ensure that you have a valid license and, ideally, ESRS connectivity back to Dell EMC for dial-home support.
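
For reference, a rough sketch of both tests (the SDS name below is a placeholder; run the scli commands from the MDM and the systemctl commands inside the SVM being tested):

# Graceful: put one SDS into maintenance mode, then bring it back
scli --enter_maintenance_mode --sds_name sds1
scli --exit_maintenance_mode --sds_name sds1

# Not so graceful: stop the SDS service inside an SVM to trigger a rebuild, then restart it
systemctl stop sds.service
scli --query_all      # from the MDM: watch rebuild progress and capacity state
systemctl start sds.service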

31 Posts

January 5th, 2021 18:00

My pleasure. Don't hesitate to reach out again, or contact your local Dell Technologies presales team as well who can also help to find you a specialist to work with locally if need be. 

31 Posts

January 5th, 2021 19:00

From a PowerFlex perspective, NTP itself is not critical although it is nice to have to aid troubleshooting if need be.

The current versions of PowerFlex use a CentOS-based SVM, so it has been a while since I touched the SUSE version, but this should be of some help to you: https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-ntp.html
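
In case it helps, a minimal sketch assuming an ntpd-based SLES SVM (the server address is a placeholder; if your SVM image ships chrony instead, the file is /etc/chrony.conf and the service is chronyd):

echo "server ntp.example.com iburst" >> /etc/ntp.conf    # point the SVM at your NTP server
systemctl enable ntpd                                    # start at boot
systemctl start ntpd
ntpq -p                                                  # verify the peer is reachable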

 
