DiegoLopez
February 24th, 2021 03:00
Hello again @videosml,
With all due respect, I think you really need to reconsider this: "It is some years now that FW 07.84.47.60 is working and not giving any problems". From my point of view, the errors "Alternate RAID controller module removed" and "Temperature sensor removed" could possibly have been avoided by keeping the firmware up to date.
If you ask me, as a Dell technician, I would recommend that you update the firmware. At the moment, you want to keep a FW version from the year 2013. Please have a look at this; below are all the fixes you are missing with your current FW version:
As you can see, there are a lot of fixes. I'm not sure if I've explained myself well enough for you to see my point, but I really think we need to get rid of this way of thinking: "If it works, don't touch it" or "Better a known-good firmware from 9 years ago than the new one". If you were in my position and had seen, as I have, so many issues that could have been avoided by having an updated firmware version, I am sure you would think differently.
When you replace a controller in an MD3200 that has dual controllers, the configuration from the failed controller will be copied over to the new controller. Also, if the firmware on the replacement controller is older than the firmware on the controller in the MD3200, it will be upgraded.
Here is how we recommend that you replace a controller on an MD3xxx:
1. Log in to MDSM, go to the Support tab, then Manage RAID Controller Modules, and place the affected controller in an offline status.
2. Go to the Summary tab and confirm that the controller is offline. Once the controller is offline, you can swap it with your replacement controller.
3. Once the controller has been replaced and all cables reconnected, go back to the Support tab and bring the controller online.
4. Check in the Summary tab that both controllers are online. If the new controller is at a different firmware version, the MD3xxx should copy the firmware that is running on the other controller to the replacement controller so that they match. Once the firmware matches on both controllers, both controllers should be active.
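If it helps to reason about step 4 before the swap, here is a minimal Python sketch that compares two dotted firmware version strings, such as the ones mentioned in this thread. It is only an illustration: `parse_fw` and `compare_fw` are hypothetical helper names, not part of MDSM or any Dell tool, and it assumes the versions are plain dot-separated numbers.

```python
def parse_fw(version):
    """Turn a dotted firmware string like '07.84.47.60' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def compare_fw(running, replacement):
    """Describe how the replacement controller's firmware relates to the running one.

    'running' is the version on the controller that stays in the array;
    'replacement' is the version on the new controller.
    """
    running_v = parse_fw(running)
    replacement_v = parse_fw(replacement)
    if replacement_v == running_v:
        return "match"
    if replacement_v > running_v:
        return "replacement is newer"
    return "replacement is older"


# Versions quoted in this thread
print(compare_fw("07.84.47.60", "08.20.16.60"))  # prints "replacement is newer"
```

Comparing tuples of integers rather than raw strings avoids surprises if a field ever has a different number of digits. The script only tells you whether the two versions differ; what the array actually does with a mismatch is what the procedure above describes.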
Hope this helps.
Regards.
DiegoLopez
January 25th, 2021 05:00
Hello @videosml,
Welcome back to the Dell Community!
The normal troubleshooting steps for a Failed/Missing RAID Controller on an MD3200 when using the MDSM application are:
1. Gather MDSM Support Logs.
2. In MDSM or in the storageArrayProfile log file, inspect the RAID Controller Firmware and NVSRAM versions and ensure they are up to date.
3. Ensure both management ports are connected.
4. Open MDSM and inspect the Physical tab of the array to identify which controller is failed or missing <- you already know the failing one is the bottom one.
5. Swap test the controllers (note: mgmt IP 0 can duplicate after a swap test) <- I would start with this test to rule out backplane problems.
6. If the firmware is out of date, put the good controller in slot 0, set it to simplex and update (ensure MDSM is up to date), then set it back to duplex after the update and hot plug the failed/missing RC -> see if it comes online; if not, you will need a backplane and a RAID Controller.
7. Check for controller lockdown or controller resets by connecting to the missing controller and observing the serial output <- you don't have a serial cable, so that will limit the troubleshooting.
Hope this helps.
Regards.
videosml
January 29th, 2021 00:00
Good morning,
Perfect, thanks for the troubleshooting steps you proposed!
Next week I will be able to take the system offline and do the testing. I will keep this discussion updated as soon as I have the results.
Thank you very much
Matteo
videosml
February 24th, 2021 02:00
Good morning,
After some delay in giving any news, we have received a replacement controller. We can do the tests on the failing controller and backplane but, as there is important data on the MD3200, even if it is backed up, I would feel better trying the replacement controller directly, if that is viable and safer. Regarding this I have a question: the old controller has firmware version 07.84.47.60 and the replacement has version 08.20.16.60. FW 07.84.47.60 has been working for some years without giving any problems, so I would prefer to keep it if possible. If I put the replacement in the SAN (after configuration cleanup with battery removal), will it downgrade automatically to the version of the old controller, and is that all I need to do?
Thank you
Matteo
videosml
February 24th, 2021 04:00
Hello Diego,
I apologize.
The list you provided definitely confirms that my way of thinking was wrong.
I will follow the procedure you provided next week and update the post as soon as done.
Thank you for the support.
Best regards,
Matteo
DiegoLopez
February 24th, 2021 05:00
No, it's okay Matteo! You don't really need to apologize. After all, you are the system admin and you know better what the best option is for your hardware. The only thing I can do is give you a recommendation. But, after all, you will be the one who will carry out the task, do the job and face any unexpected situation. So in the end it's up to you. Our job in these forums is to help you and give you our expertise after years of working with this hardware. Sorry if I sounded a bit harsh in my previous message. No hard feelings.
Whatever you decide, make sure you have as much documentation as possible. There are plenty of articles and videos from users who have done this process. For example, you can read this: Dell PowerVault: Update the PowerVault MD32xx – MD36xx disk array - https://dell.to/3pLjZ5z This article contains a good amount of information (like when to update, how long it takes and general recommendations such as the bridge firmware version if the current version is too old), so it is a good starting point.
So take your time and consider all your possible options. And if you need any support, well, you know where to find us.
Best regards.