RecoverPoint: RPA down for unknown reason

Article Number: 520895
Version: 4
Article Type: Break Fix
Last Published: 24 Mar 2020
Summary:
Issue


RecoverPoint GUI reported "RPA down"
RecoverPoint Appliance rebooted unexpectedly

This is a hardware-related issue with Intel-based RPAs. The IPMI event logs show that an SMI timeout occurred around the time of the RPA reboot.
From log: files/home/kos/kbox/utilities/regulate_reboot/detailed_startup_information.txt
reboot      Tue Dec 6 21:27:46 UTC 2016
 
From log: processes/usr/bin/ipmitoolselelist or files/home/kos/sel/sel.log
59 | 12/06/2016 | 21:16:33 | Unknown SMI Timeout | State Asserted
5a | 12/06/2016 | 21:16:58 | Unknown SMI Timeout | State Asserted
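
The correlation between the two log excerpts above can be checked with a short script. The following is a minimal sketch, not a supported tool: it assumes the SEL timestamps are in the same time zone (UTC) as the reboot record, and the function names and the 30-minute window are illustrative choices, not values from RecoverPoint documentation.

```python
from datetime import datetime, timedelta

# Sample lines copied from the log excerpts in this article.
REBOOT_LINE = "reboot      Tue Dec 6 21:27:46 UTC 2016"
SEL_LINES = [
    "59 | 12/06/2016 | 21:16:33 | Unknown SMI Timeout | State Asserted",
    "5a | 12/06/2016 | 21:16:58 | Unknown SMI Timeout | State Asserted",
]


def parse_reboot_time(line):
    """Parse 'reboot <weekday> <month> <day> <HH:MM:SS> UTC <year>'."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[0] != "reboot" or tokens[5] != "UTC":
        raise ValueError(f"unexpected reboot line: {line!r}")
    # Drop the literal 'UTC' token and parse the rest as a naive timestamp.
    stamp = " ".join(tokens[1:5] + tokens[6:])
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def parse_sel_line(line):
    """Split one 'id | date | time | sensor | state' SEL line into a dict."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) != 5:
        return None
    record_id, date_s, time_s, sensor, state = fields
    # Assumption: SEL timestamps use the same time zone as the reboot record.
    when = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%Y %H:%M:%S")
    return {"id": record_id, "time": when, "sensor": sensor, "state": state}


def smi_timeouts_before_reboot(sel_lines, reboot_line, window_minutes=30):
    """Return (record id, gap) for SMI timeouts logged shortly before the reboot."""
    reboot = parse_reboot_time(reboot_line)
    window = timedelta(minutes=window_minutes)  # hypothetical window size
    hits = []
    for line in sel_lines:
        event = parse_sel_line(line)
        if event is None or "SMI Timeout" not in event["sensor"]:
            continue
        gap = reboot - event["time"]
        if timedelta(0) <= gap <= window:
            hits.append((event["id"], gap))
    return hits


if __name__ == "__main__":
    for record_id, gap in smi_timeouts_before_reboot(SEL_LINES, REBOOT_LINE):
        print(f"SEL record {record_id}: SMI timeout {gap} before reboot")
```

Any SMI timeout returned by this check falls within the chosen window before the recorded reboot, which matches the symptom described in this article.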

Affected versions: All versions running on Intel hardware RPAs

 
Cause
According to Intel documentation: "SMI stands for system management interrupt and is an interrupt that gets generated so the processor can service server management events (typically memory or PCI errors, or other forms of critical interrupts). If this interrupt times out the system is frozen." This long timeout caused the RPA to go down.
Resolution
The affected RPA should recover on its own within about 6-10 minutes. Power cycle the RPA if it does not recover.
Monitor the affected RPA. If the issue persists, the RPA will need to be replaced.
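
While monitoring the RPA, it can help to see whether SMI timeouts recur on different days. The sketch below counts SMI timeout entries per calendar date from SEL lines in the format shown in this article. The function name is hypothetical, and the script does not define how many recurrences justify a replacement; that decision stays with support.

```python
from collections import Counter

# Sample lines in the SEL format shown in this article.
SAMPLE_SEL = [
    "59 | 12/06/2016 | 21:16:33 | Unknown SMI Timeout | State Asserted",
    "5a | 12/06/2016 | 21:16:58 | Unknown SMI Timeout | State Asserted",
]


def smi_timeouts_per_day(sel_lines):
    """Count SMI timeout entries per date string (MM/DD/YYYY)."""
    counts = Counter()
    for line in sel_lines:
        fields = [f.strip() for f in line.split("|")]
        # Expect: id | date | time | sensor | state
        if len(fields) == 5 and "SMI Timeout" in fields[3]:
            counts[fields[1]] += 1
    return dict(counts)


if __name__ == "__main__":
    for day, count in smi_timeouts_per_day(SAMPLE_SEL).items():
        print(f"{day}: {count} SMI timeout event(s)")
```

Entries spread across several dates suggest the problem is recurring rather than a one-time event.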
 
Article Properties
First Published: Tue May 08 2018 02:57:21 GMT
Primary Product: RecoverPoint Gen5 Server
Product: RecoverPoint Gen5 Server, RecoverPoint CL, RecoverPoint EX, RecoverPoint SE, RecoverPoint