RecoverPoint: RPA down for unknown reason

Article Number: 520895
Version: 4
Article Type: Break Fix
Last Published: 24 Mar 2020

RecoverPoint GUI reported "RPA down"
RecoverPoint Appliance rebooted unexpectedly

This is a hardware-related issue with Intel-based RPAs. The IPMI event logs show that an SMI timeout occurred shortly before the RPA rebooted.
From log: files/home/kos/kbox/utilities/regulate_reboot/detailed_startup_information.txt
reboot      Tue Dec 6 21:27:46 UTC 2016
From log: processes/usr/bin/ipmitoolselelist (output of "ipmitool sel elist") or files/home/kos/sel/sel.log
59 | 12/06/2016 | 21:16:33 | Unknown SMI Timeout | State Asserted
5a | 12/06/2016 | 21:16:58 | Unknown SMI Timeout | State Asserted
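
The correlation between the SEL entries and the reboot time can be checked with a short script. The following is a minimal sketch, not a supported tool: it assumes the SEL lines use the pipe-delimited layout shown above, that the SEL clock and the reboot timestamp are in the same time zone, and that a 15-minute look-back window is appropriate. The function names (parse_sel_line, smi_timeouts_before) and the window value are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Sample SEL lines copied from the log excerpt above.
SEL_SAMPLE = """\
59 | 12/06/2016 | 21:16:33 | Unknown SMI Timeout | State Asserted
5a | 12/06/2016 | 21:16:58 | Unknown SMI Timeout | State Asserted
"""


def parse_sel_line(line):
    """Parse one pipe-delimited SEL line into (timestamp, sensor, state).

    Returns None for lines that do not match the assumed layout.
    """
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 5:
        return None
    try:
        ts = datetime.strptime(f"{fields[1]} {fields[2]}", "%m/%d/%Y %H:%M:%S")
    except ValueError:
        return None
    return ts, fields[3], fields[4]


def smi_timeouts_before(sel_text, reboot_time, window=timedelta(minutes=15)):
    """Return SMI Timeout events logged within `window` before `reboot_time`.

    Assumption: SEL timestamps and reboot_time share the same time zone.
    """
    hits = []
    for line in sel_text.splitlines():
        parsed = parse_sel_line(line)
        if parsed is None:
            continue
        ts, sensor, state = parsed
        gap = reboot_time - ts
        if "SMI Timeout" in sensor and timedelta(0) <= gap <= window:
            hits.append((ts, sensor, state))
    return hits


# Reboot time taken from detailed_startup_information.txt above.
reboot = datetime(2016, 12, 6, 21, 27, 46)
for ts, sensor, state in smi_timeouts_before(SEL_SAMPLE, reboot):
    print(f"{ts}  {sensor} ({state}), {reboot - ts} before reboot")
```

If the script reports SMI Timeout entries a few minutes before the recorded reboot, that matches the pattern described in this article. An empty result suggests a different cause or a clock offset between the SEL and the system log.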

Affected versions: All versions running on Intel hardware RPAs

According to Intel documentation: "SMI stands for system management interrupt and is an interrupt that gets generated so the processor can service server management events (typically memory or PCI errors, or other forms of critical interrupts). If this interrupt times out the system is frozen." This long timeout caused the RPA to go down.
The affected RPA should recover on its own after approximately 6-10 minutes. If it does not, power cycle the RPA.
Monitor the affected RPA. If the issue persists, the RPA will need to be replaced.

Article Properties
First Published: Tue May 08 2018 02:57:21 GMT
Primary Product: RecoverPoint Gen5 Server
Product: RecoverPoint Gen5 Server, RecoverPoint CL, RecoverPoint EX, RecoverPoint SE, RecoverPoint