Unsolved
This post is more than 5 years old
Disk failure simulation for ScaleIO cluster
The correct way of testing a disk failure would generally not be to remove a disk from its slot.
Pulling a working drive and reinserting it is a basic hotplug event, and ScaleIO treats it as one. It does not simulate a drive failure to the system, beyond triggering a rebuild while the disk is "gone."
When the drive is plugged back in, the RAID controller first sees its own metadata on the disk, which identifies the disk as the only member of a RAID 0 set. As a single-disk RAID 0 device, it is the only source of consistency for its own data, so the RAID controller trusts the disk to hold the data it says it has.
That logical disk is then passed up to ScaleIO, which recognizes both the disk and the data on it, and puts it back as the same device it was previously.
To achieve the goal of documenting what the system does when it sees a new disk after a failure, the disk needs to be wiped once it is removed.
One method to achieve this would be to connect the drive to another system where it is visible as a physical disk, and zero the entire device. A RAID controller in HBA/passthrough/IT mode can be used for this, as can any SATA controller that is compatible with the disk.
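As a rough sketch of the zeroing step, assuming the second system runs Linux and the drive shows up there as an ordinary block device: the function name zero_target is hypothetical, and blockdev, head and dd are the standard utilities. This is destructive, so confirm the target path (for example with lsblk) before pointing it at a real disk.

```shell
#!/bin/sh
# Hypothetical helper: overwrite an entire target with zeros.
# DESTRUCTIVE: double-check the target before running it on a real disk.
zero_target() {
    target=$1
    if [ -b "$target" ]; then
        # Block device: ask the kernel for its size in bytes (needs root).
        size=$(blockdev --getsize64 "$target") || return 1
    elif [ -f "$target" ]; then
        # Regular file, handy for a dry run against a disk image.
        size=$(($(wc -c < "$target"))) || return 1
    else
        echo "zero_target: $target is not a block device or regular file" >&2
        return 1
    fi
    # Write exactly $size bytes of zeros over the existing contents,
    # without truncating the target.
    head -c "$size" /dev/zero | dd of="$target" bs=1M conv=notrunc 2>/dev/null
}
```

It would be called as zero_target /dev/sdX, where /dev/sdX is a placeholder for whatever name the drive gets on the second system. Trying it on a scratch file first is a cheap way to check the behaviour.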
Zeroing the disk is a much better simulation of disk replacement. Without it, a newly created RAID device may still present the same metadata as far as ScaleIO is concerned, so what you document during the replacement process will differ from what happens when a genuinely blank new disk is installed.
patrbng
August 18th, 2017 08:00
"The correct way of testing a disk failure would generally not be to remove a disk from its slot."
How do you test the failure without removing the drive? Is there a command to mark a drive offline?
David312_b5736f
August 23rd, 2017 08:00
If you want to mark a drive offline, you can run the command below as root, replacing sdx with the name of the drive that you would like to mark offline:
echo offline > /sys/block/sdx/device/state
To bring it back online you can run this command:
echo running > /sys/block/sdx/device/state
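The two commands above could be wrapped in a small helper that validates the requested state and reads the result back. This is only a sketch: set_disk_state and the SYSFS_BLOCK variable are hypothetical names, and SYSFS_BLOCK exists purely so the function can be dry-run against a scratch directory instead of the real /sys/block.

```shell
#!/bin/sh
# Hypothetical wrapper around the sysfs state file used above.
# SYSFS_BLOCK is an assumption added for dry runs; it defaults to /sys/block.
SYSFS_BLOCK=${SYSFS_BLOCK:-/sys/block}

set_disk_state() {
    dev=$1      # kernel device name without /dev/, e.g. sdx
    state=$2    # "offline" or "running"
    state_file="$SYSFS_BLOCK/$dev/device/state"

    # Only accept the two values used in this thread.
    case $state in
        offline|running) ;;
        *)
            echo "set_disk_state: state must be offline or running" >&2
            return 2
            ;;
    esac

    if [ ! -w "$state_file" ]; then
        echo "set_disk_state: cannot write $state_file (wrong device name, or not root?)" >&2
        return 1
    fi

    echo "$state" > "$state_file" || return 1
    # Read the state back so the caller can confirm what is reported.
    cat "$state_file"
}
```

Usage would look like set_disk_state sdx offline to take the drive offline and set_disk_state sdx running to bring it back, again with sdx standing in for the real device name.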