ECS: Unexpected Reboot Due to an NMI Received on a CPU
Summary: An ECS node reboots unexpectedly because an NMI was received on a CPU.
Symptoms
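An ECS node rebooted unexpectedly several times, and core files were generated on each reboot. The stack trace in the dmesg logs under /var/crash/ shows that the reboots were triggered by an NMI detected on a CPU.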
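NMI stands for non-maskable interrupt. It is the highest-priority interrupt and is raised to signal a non-recoverable hardware error that needs attention.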
2020-03-01-21:06/dmesg.txt:[5200025.129135] Uhhuh. NMI received for unknown reason 3d on CPU 0.
2020-03-01-21:06/dmesg.txt-[5200025.129135] Do you have a strange power-saving mode enabled?

The hardware was checked for issues, and the BIOS version was checked to see whether it is outdated:

sudo bash memory.sh
sudo ipmitool sel list
sudo xdoctor
/usr/share/emc-intel-firmware/flashupdt/flashupdt /i | grep "BIOS Version"
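
The dmesg search described in the Symptoms can be sketched as a small shell helper. This is only an illustration: the function name find_nmi_events is hypothetical and not part of ECS, and it assumes each crash dump sits in its own subdirectory of /var/crash/ with a dmesg.txt file, as the file paths in the log excerpt suggest.

```shell
#!/bin/sh
# Hypothetical helper (not an ECS tool): list NMI messages recorded in crash dumps.
# Assumption: every dump is a subdirectory of the crash directory that
# contains a dmesg.txt file.
find_nmi_events() {
    crash_dir="${1:-/var/crash}"
    # -H always prints the file name; -A1 also prints the line that follows
    # each match, which usually carries the kernel's follow-up message.
    grep -H -A1 'NMI received for unknown reason' "$crash_dir"/*/dmesg.txt
}
```

Running find_nmi_events with no argument searches /var/crash/; passing a directory searches that location instead. Repeated matches across several dump directories would point to a recurring NMI rather than a one-off event.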
Cause
This can be caused by either an operating system issue or a hardware issue.
Resolution
Reimaging the node may be sufficient. However, it is best to physically replace the node in case the issue persists after reimaging.
Affected Products
Elastic Cloud Storage
Products
Elastic Cloud Storage
Article Properties
Article Number: 000081969
Article Type: Solution
Last Modified: 12 Sep 2025
Version: 5