5 Practitioner • 274.2K Posts

June 21st, 2017 09:00

SATA DOMs for OS Install

Hey everyone. I've been able to successfully test ScaleIO in a virtual environment, but now I'm interested in running it bare metal. Some of the Supermicro boards can run dual SATA DOMs with AHCI mirroring. I want to protect myself from a DOM death, so I'm thinking of doing a CentOS install using dmraid.

Has anyone had any issues with a setup like this? It shouldn't affect the disks that will be used for data, but I'm wondering if software RAID for the OS disks will do more harm than good.
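
If I end up going the plain mdadm route instead of the BIOS fake RAID, I'd at least want an alert when the mirror degrades. Here's a rough sketch of the kind of check I have in mind - it just parses /proc/mdstat, nothing ScaleIO-specific, and the md device names are whatever the installer ends up creating:

#!/usr/bin/env python3
# Rough sketch: warn if an mdadm OS mirror is running degraded.
# Assumes the OS sits on a Linux software RAID (md) device; the device
# names are whatever the installer created.
import re
import sys

def degraded_arrays(mdstat_path="/proc/mdstat"):
    """Return md device names whose status line shows a missing member."""
    degraded = []
    current = None
    with open(mdstat_path) as f:
        for line in f:
            m = re.match(r"^(md\d+)\s*:", line)
            if m:
                current = m.group(1)
            # Healthy status lines end in something like "[2/2] [UU]";
            # an underscore means a member is missing or failed, e.g. "[2/1] [U_]".
            elif current and re.search(r"\[\d+/\d+\]\s+\[[U_]*_[U_]*\]", line):
                degraded.append(current)
    return degraded

if __name__ == "__main__":
    bad = degraded_arrays()
    if bad:
        print("Degraded arrays: " + ", ".join(bad))
        sys.exit(1)
    print("All md arrays look healthy")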

Thoughts?

306 Posts

June 23rd, 2017 07:00

Hi [acl],

We never tested it, but I would be cautious with software RAID - it might introduce delays that the MDM/SDS might not like. Even if one of the SATA DOMs fails, you still have the rest of the cluster and can always replace it and rebuild - that's the approach we take for the ScaleIO Ready Nodes and it's working fine.

If you decide to test it, please share your test results though!

Cheers,

Pawel

5 Practitioner • 274.2K Posts

June 23rd, 2017 08:00

I've already spun up a test node, and the software RAID isn't pretty. Our motherboard has one of those fake RAID controllers, so both the /dev/sdX devices and the new device-mapper devices show up. I'm afraid we may accidentally overwrite one of the RAID members, so it's not worth it.
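
For now I'm just double-checking which raw disks are already claimed by a device-mapper map before handing anything to ScaleIO. Quick sketch of what I mean - it only reads sysfs, so it should behave the same whether the map came from dmraid, LVM, or anything else:

#!/usr/bin/env python3
# Quick sketch: list which raw disks are already claimed as members of a
# device-mapper map (fake RAID, LVM, crypt, ...), so they don't get reused
# as data disks by mistake. Reads sysfs only; no third-party libraries.
import glob
import os

def dm_members():
    """Map each dm-N device to the underlying block devices it sits on."""
    members = {}
    for slaves_dir in glob.glob("/sys/block/dm-*/slaves"):
        dm_dev = slaves_dir.split("/")[3]          # e.g. "dm-0"
        name_file = os.path.join("/sys/block", dm_dev, "dm", "name")
        try:
            with open(name_file) as f:
                dm_name = f.read().strip()          # e.g. the fake-RAID set name
        except OSError:
            dm_name = dm_dev
        members[dm_name] = sorted(os.listdir(slaves_dir))
    return members

if __name__ == "__main__":
    claimed = dm_members()
    if not claimed:
        print("No device-mapper maps found")
    for name, disks in claimed.items():
        print(f"{name}: backed by {', '.join('/dev/' + d for d in disks)}")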

I've heard the Ready Nodes are built to last, so if that's how they do it, we might as well follow that design. I wonder if they follow a specific setup on their DOMs so they don't kill them with loads of logging writes. Do you know if there are any best practices for DOM setup? You know, to extend the life of the boot SSD or DOM?

12 Posts

June 23rd, 2017 08:00

I agree. You could create an image of your SATA DOM and just restore it if there was a failure. I don't have a SATA DOM in my test setup; I just use an SSD as my OS drive. I don't have it RAIDed, as that would use an additional bay. Creating an image of it is probably how I'd move forward, and I'd just restore if I needed to. ScaleIO already has fault tolerance built in if a node goes down, as long as you make sure you have your spare percentage set correctly.
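
Something along these lines is really all the imaging part takes - this is just a sketch, the device and share path are placeholders, and you'd want to run it from rescue media (or with the OS quiesced) so the image is consistent:

#!/usr/bin/env python3
# Sketch of the "image the OS drive, restore on failure" idea: stream the
# whole boot device into a compressed image on a network share.
# /dev/sda and the destination path are placeholders.
import gzip
import shutil

CHUNK = 4 * 1024 * 1024  # copy in 4 MiB chunks

def image_disk(device="/dev/sda", dest="/mnt/backup/os-drive.img.gz"):
    """Copy a block device byte-for-byte into a gzip-compressed image file."""
    with open(device, "rb") as src, gzip.open(dest, "wb") as out:
        shutil.copyfileobj(src, out, length=CHUNK)

def restore_disk(image="/mnt/backup/os-drive.img.gz", device="/dev/sda"):
    """Write the image back onto a replacement drive (destructive!)."""
    with gzip.open(image, "rb") as src, open(device, "wb") as out:
        shutil.copyfileobj(src, out, length=CHUNK)

if __name__ == "__main__":
    image_disk()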

306 Posts

June 26th, 2017 03:00

I don't think there are any special recommendations for the SATA DOM itself; if you check the ScaleIO Ready Node HW and OS Installation Guide, you'll see we recommend upgrading the firmware to a certain version, but that's pretty much it.

I would say if you're concerned about wear of the SATA DOM device, you could set up remote syslog logging, as logging is what would write to the disk the most, but other than that it should be OK - at least I haven't heard of any case where a DOM wore out... In the worst-case scenario, reinstalling ESXi and restoring an SVM shouldn't take long anyway.
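
The forwarding rule itself is a one-liner in /etc/rsyslog.d/; here's a small sketch of how you could drop it in (the log host and port are placeholders, of course):

#!/usr/bin/env python3
# Minimal sketch: point rsyslog at a remote log host so the bulk of the
# logging traffic leaves the SATA DOM. The host and port below are placeholders.
import subprocess

REMOTE = "loghost.example.com"      # placeholder central syslog server
PORT = 514

RULE = f"*.* @@{REMOTE}:{PORT}\n"   # "@@" forwards over TCP; a single "@" would use UDP

def enable_remote_syslog(conf="/etc/rsyslog.d/90-remote.conf"):
    """Write a forward-everything rule and restart rsyslog to pick it up."""
    with open(conf, "w") as f:
        f.write(RULE)
    subprocess.run(["systemctl", "restart", "rsyslog"], check=True)

if __name__ == "__main__":
    enable_remote_syslog()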

5 Practitioner • 274.2K Posts

June 26th, 2017 15:00

I'm a bit surprised that the OS installation guide doesn't have optimizations for SSDs or DOMs. I wonder if the real Ready Nodes just log to memory instead of disk.

I'm going to give it a try and see how long it takes to kill a DOM. Thanks for the input.
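
To actually measure the wear I'll probably just poll the SMART attributes on the DOM every so often and graph them. Rough sketch below - the attribute names differ per vendor and /dev/sda is just a placeholder:

#!/usr/bin/env python3
# Sketch for tracking DOM wear over time: pull the SMART attributes that
# usually reflect flash wear and print them. Attribute names vary by vendor
# (Media_Wearout_Indicator, Wear_Leveling_Count, Total_LBAs_Written, ...),
# so treat the list below as a starting point, not gospel.
import subprocess

WEAR_ATTRS = ("Media_Wearout_Indicator", "Wear_Leveling_Count",
              "Percent_Lifetime_Remain", "Total_LBAs_Written")

def wear_report(device="/dev/sda"):
    """Return the raw smartctl lines for wear-related attributes."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=True)
    return [line for line in out.stdout.splitlines()
            if any(attr in line for attr in WEAR_ATTRS)]

if __name__ == "__main__":
    for line in wear_report():
        print(line)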

July 20th, 2017 05:00

I was in a similar situation, and we came up with a bit of an out-of-the-box solution. It was also completely free.

We were using a Supermicro X10 board with the dual powered SATA DOM ports. First, we set up RAID 1 with the X10's integrated Intel RAID controller, which covered us against a single SATA DOM bringing down a node and gave us breathing room to plan a maintenance window for the drive replacement. But we didn't want to take any chances of losing both drives and the configuration for that node.

Second, we used a Veeam agent (Windows bare-metal nodes) to run nightly backups of C: to a network share. We also created a Veeam recovery boot ISO.

So all we would have to do is replace the drive, boot from the ISO over IPMI, and point it at the most recent restore point taken the night before. We found the whole process took about 30 minutes and protected us from config errors. Since the procedure was so straightforward, we turned it into an SOP doc so just about anybody in our company could do it. It really eliminated human error, not to mention it was 100% free.

Veeam released a free backup agent for Linux last year. Not sure if it's out of beta yet, but it's worth a look.

5 Practitioner • 274.2K Posts

July 27th, 2017 10:00

That's basically the same thing we did. We have a CentOS ISO with a kickstart file containing all the stuff we need. It's a work in progress, since each test makes us fine-tune it a bit more.

So far so good with the lab setup. I still need to fine tune the networking but the whole product looks very promising.
