Dell Unity: SP Panic after Being Up for More than 240 Days
Summary: Dell Unity XT 480, 680, or 880 Storage Processor (SP) may panic after being up for more than 240 days. (Dell Correctable)
Symptoms
A Dell Unity XT 480, 680, or 880 SP may panic after being up for more than 240 days. Other Dell Unity systems can also experience the issue, but only after a much longer uptime (greater than 730 days).
UDoctor may generate an alert when an SP has been running for more than 240 days on any code version below 5.2.1 (except for version 5.1.3.0.6.439). That alert references this KB article.
Cause
An SP panic may occur due to an integer overflow: a calculation produces a 64-bit result that is then stored in a 32-bit variable.
The issue is most likely to occur on a Unity XT 480, 680, or 880 array running Unity OE versions 5.1.0.0.5.394 through 5.2.0.0.5.173. This is due to changes in those code versions and to how that code interacts with the SP hardware used in those models.
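The general failure mode described above, a wide result stored in a narrower variable, can be illustrated with a minimal sketch. This is not the Unity OE code: the article does not say which counter overflows or in what units, so the tick rate (`TICKS_PER_SECOND`) and the helper names `to_int32` and `ticks_after` are hypothetical and chosen only to show how a value that grows with uptime can wrap once it no longer fits in 32 bits.

```python
# Illustrative only: the real counter, its units, and its tick rate are not
# stated in this article. TICKS_PER_SECOND is a hypothetical value.
TICKS_PER_SECOND = 100
SECONDS_PER_DAY = 86_400


def to_int32(value: int) -> int:
    """Keep the low 32 bits and reinterpret them as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - 0x1_0000_0000 if low >= 0x8000_0000 else low


def ticks_after(days: int) -> int:
    """Full-width tick count after the given number of days of uptime."""
    return days * SECONDS_PER_DAY * TICKS_PER_SECOND


for days in (31, 240, 249):
    full = ticks_after(days)
    stored = to_int32(full)
    # When 'stored' differs from 'full', the 32-bit variable has overflowed.
    print(days, full, stored, "OVERFLOW" if stored != full else "ok")
```

Under these assumed values the stored result stays correct for short uptimes and becomes negative once the full-width value exceeds the signed 32-bit range. Code that later uses such a value as a duration or an index can then fail, which is the class of defect this article describes.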
Resolution
Fix:
The fix first became available in Unity OE version 5.2.1. All current supported Unity OE releases include the fix. See KB article 70507 for a listing of current supported versions.
Workaround:
Proactively reboot the SP before its uptime reaches 240 days to avoid an SP panic. Instructions to reboot an SP are available in the article: Unity: How to Reboot a Storage Processor (User Correctable).
To check how long an SP has been running, connect to the array using SSH with the service account and run the "uptime" command.
The example below shows an uptime of 31 days.
04:30:01 service@xxx spa:~/user# uptime
04:30am up 31 days 3:41, 2 users, load average: 29.21, 29.45, 29.51
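For tracking several arrays, the day count can be pulled from saved "uptime" output and compared with the 240-day threshold. The following is a minimal sketch that assumes the output has the same shape as the example above. The function names `uptime_days` and `days_until_reboot` are hypothetical, and the script only parses text you supply; it does not connect to an array.

```python
import re

REBOOT_THRESHOLD_DAYS = 240  # threshold taken from the workaround in this article


def uptime_days(uptime_output: str) -> int:
    """Return whole days of uptime parsed from `uptime` output.

    Returns 0 when no day field is present (uptime of less than one day).
    """
    match = re.search(r"up\s+(\d+)\s+days?", uptime_output)
    return int(match.group(1)) if match else 0


def days_until_reboot(uptime_output: str,
                      threshold: int = REBOOT_THRESHOLD_DAYS) -> int:
    """Days remaining before the SP reaches the threshold (negative if past it)."""
    return threshold - uptime_days(uptime_output)


sample = "04:30am up 31 days 3:41, 2 users, load average: 29.21, 29.45, 29.51"
print(uptime_days(sample))        # 31
print(days_until_reboot(sample))  # 209
```

A negative result from `days_until_reboot` means the SP has already been up longer than the threshold and should be rebooted or upgraded as described above.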
The UDoctor alert will refresh once per week unless the above Fix or Workaround is implemented. Dell Technologies recommends that customers upgrade to a current supported version of Unity OE code.