Dell Unity: SP Panic after Being Up for More than 240 Days

Summary: A Dell Unity XT 480, 680, or 880 Storage Processor (SP) may panic after being up for more than 240 days. (Dell Correctable)


Symptoms

A Dell Unity XT 480, 680, or 880 SP may panic after being up for more than 240 days. Other Dell Unity systems can experience the issue after a much longer uptime (that is, more than 730 days).

UDoctor may generate an alert on any code version below 5.3 where the SP has been running for more than 240 days, and that alert references this KB article. For more details about the UDoctor alert, see the KB article "Dell Unity: Critical Alert 640003 Occurring on OE 5.2.1 or later, Where Storage Processor (SP) Uptime Panic Fix is already applied".

Cause

An SP panic may occur because of an integer overflow: a calculation produces a 64-bit result that is then stored in a 32-bit variable.
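
The effect can be illustrated with a minimal Python sketch. This is not the Unity OE code: the value and the helper name store_in_u32 are hypothetical, chosen only to show how a result that needs more than 32 bits loses its upper bits when it is stored in a 32-bit variable.

```python
U32_MASK = 0xFFFFFFFF  # largest value an unsigned 32-bit variable can hold


def store_in_u32(value: int) -> int:
    """Emulate storing an integer in an unsigned 32-bit variable.

    Bits above the low 32 are discarded, which is what happens when a
    64-bit result is assigned to a 32-bit variable.
    """
    return value & U32_MASK


# Hypothetical 64-bit result, slightly larger than a 32-bit variable can hold.
wide_result = 2**32 + 5

stored = store_in_u32(wide_result)
print(wide_result)  # 4294967301
print(stored)       # 5
```

The stored value bears no resemblance to the real result, so any code that later relies on it works with incorrect data.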

The issue is most likely to occur on a Unity XT 480, 680, or 880 array running Unity OE versions 5.1.0.0.5.394 through 5.2.0.0.5.173. This is due to changes within that code and the SP hardware used in those models, and how the code interacts with that hardware.

 

NOTE: Older code versions and older Unity models are less likely to see this issue, but it can still occur on them, especially when the SP has been running for far longer than 240 days (that is, more than 730 days).

 

NOTE: The average number of days before an SP restart is triggered by this issue is 275-300 days. The SP restart can occur before 275 days, but it is less likely. Storage Processor restarts due to this issue do not occur prior to 240 days.

Resolution

Fix:
The fix is available in Unity OE version 5.2.1.0.5.013 and later. However, Dell does not recommend upgrading to that specific code version. Instead, Dell strongly recommends that customers upgrade to the latest available code or, if the latest code is not the "target" code, at least to the target code.

The UDoctor utility also flags this issue on Unity OE versions below 5.3. This is because the fix was delivered in version 5.3 and backported to 5.2.1 and later code, so the UDoctor alert can still trigger on the backported code.
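
As an illustration, an installed Unity OE version string can be compared against the first fixed version named above. The sketch below is not a Dell tool. The function names are hypothetical, and it assumes that comparing the dotted fields numerically, left to right, reflects the release order.

```python
FIXED_VERSION = "5.2.1.0.5.013"  # first Unity OE version with the fix, per this article


def parse_version(version: str) -> tuple:
    """Split a dotted Unity OE version string into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def has_uptime_fix(version: str) -> bool:
    """Return True if the version is at or above the first fixed version.

    Assumption: field-by-field numeric comparison matches release order.
    """
    return parse_version(version) >= parse_version(FIXED_VERSION)


# Versions taken from this article.
print(has_uptime_fix("5.1.0.0.5.394"))  # False
print(has_uptime_fix("5.2.0.0.5.173"))  # False
print(has_uptime_fix("5.2.1.0.5.013"))  # True
```

A result of True only indicates that the fix is present. It does not mean the version is the recommended upgrade target.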


Workaround:

Proactively reboot each SP at least once every 240 days to avoid an SP panic. Instructions for rebooting an SP are available in the article: Unity: How to Reboot a Storage Processor (User Correctable).
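
To plan the proactive reboot, the latest reboot date can be derived from the last boot time. The following is a rough sketch only. The function names and the sample dates are hypothetical, and the 240-day limit is the threshold stated in this article.

```python
from datetime import datetime, timedelta

MAX_UPTIME_DAYS = 240  # uptime threshold from this article


def reboot_deadline(last_boot: datetime) -> datetime:
    """Return the latest date by which the SP should be rebooted."""
    return last_boot + timedelta(days=MAX_UPTIME_DAYS)


def days_remaining(last_boot: datetime, now: datetime) -> int:
    """Return the whole days left before the deadline (negative if overdue)."""
    return (reboot_deadline(last_boot) - now).days


# Hypothetical example: SP last booted on 1 January 2025.
boot = datetime(2025, 1, 1)
print(reboot_deadline(boot).date())                # 2025-08-29
print(days_remaining(boot, datetime(2025, 2, 1)))  # 209
```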

To check how long an SP has been running, connect to the array using SSH with the service account and run the "uptime" command.

The example below shows an uptime of 31 days.

04:30:01 service@xxx spa:~/user# uptime
04:30am  up 31 days  3:41,  2 users,  load average: 29.21, 29.45, 29.51
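
When several SPs need to be checked, the day count can be extracted from the uptime output with a short script. This is a minimal sketch, assuming the output format shown above. The function name uptime_days is hypothetical, and the 240-day threshold comes from this article.

```python
import re

PANIC_THRESHOLD_DAYS = 240  # uptime threshold from this article


def uptime_days(uptime_output: str) -> int:
    """Extract the whole days of uptime from `uptime` output.

    Returns 0 when no "N day(s)" field is present, which is how uptime
    reports a system that has been up for less than one day.
    """
    match = re.search(r"up\s+(\d+)\s+days?", uptime_output)
    return int(match.group(1)) if match else 0


# Sample line copied from the example in this article.
sample = "04:30am  up 31 days  3:41,  2 users,  load average: 29.21, 29.45, 29.51"
days = uptime_days(sample)
print(days)                         # 31
print(PANIC_THRESHOLD_DAYS - days)  # 209
```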

The UDoctor alert refreshes every three days until the Fix or the Workaround above is implemented. Dell Technologies recommends that customers implement the Fix, or the Workaround if they cannot implement the Fix. If neither the Fix nor the Workaround can be implemented, the UDoctor check for this condition alone can be disabled.

Log in to the primary SP using SSH and issue the following command:

svc_udoctor --jobs --disable CalculateUptime

This disables the check from running every three days.

Affected Products

Dell EMC Unity Family, Dell EMC Unity All Flash
Article Properties
Article Number: 000200921
Article Type: Solution
Last Modified: 21 May 2025
Version: 23