PowerStore: Unexpected node reboot or kernel panic
Summary: To identify the cause of a reboot and provide a full Root Cause Analysis (RCA), several logs are needed.
Symptoms
The most likely event or error code for this issue is: 0x00304404
Description: Node has been physically removed or shut down.
Other possible event codes:
- 0x00307701: XENV is not active.
- 0x00304203: Node has stopped.
- 0x00302b04: The Node has stopped.
- 0x00300D06: The cluster service has stopped.
- 0x0030c601: The appliance has stopped servicing IOs.
A node reboot can trigger other secondary alerts or dial homes, such as:
- Port link failure alerts (Event code 0x00307404). More details in PowerStore Alerts: Port Link Failure.
- Port health state alerts (Event codes: 0x00305302, 0x00305303, 0x00305402, 0x00305403). More details in PowerStore Alerts: Node Port Health States.
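When reviewing a long list of alerts, it can help to sort the codes into the three groups described above. The following is a minimal sketch, not a Dell tool: the tables contain only the codes listed in this article, and the function name classify and the group labels are illustrative assumptions.

```python
# Event codes copied from this article, keyed in lower case.
PRIMARY = {
    "0x00304404": "Node has been physically removed or shut down.",
}
OTHER = {
    "0x00307701": "XENV is not active.",
    "0x00304203": "Node has stopped.",
    "0x00302b04": "The Node has stopped.",
    "0x00300d06": "The cluster service has stopped.",
    "0x0030c601": "The appliance has stopped servicing IOs.",
}
SECONDARY = {
    "0x00307404": "Port link failure",
    "0x00305302": "Port health state",
    "0x00305303": "Port health state",
    "0x00305402": "Port health state",
    "0x00305403": "Port health state",
}


def classify(code):
    """Return the group an event code belongs to (labels are illustrative)."""
    key = code.strip().lower()  # codes appear in mixed case, e.g. 0x00300D06
    if key in PRIMARY:
        return "primary"
    if key in OTHER:
        return "other"
    if key in SECONDARY:
        return "secondary"
    return "unlisted"  # not mentioned in this article
```

For example, classify("0x00300D06") returns "other", and any code not mentioned in this article returns "unlisted".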
Cause
A PowerStore node may reboot unexpectedly due to various reasons.
Each unexpected reboot should be investigated separately.
See the Additional Info section below for details on what is needed for this investigation.
Resolution
A few options exist to check for unexpected node reboots.
Checking alerts and events from the PowerStore Manager (GUI)
Check the events and alerts that could indicate an unexpected node reboot:
- Within PowerStore Manager, check the Monitoring section, and look at the details under the ALERTS and EVENTS tabs.
- Look for timestamps, error or event codes, and messages. To narrow your search, use the filter options within the ALERTS and EVENTS tabs.
Checking for dump files
Check for the existence of system dump files around the time of the errors. Kernel dumps are not included in Data Collects.
Log in to the cluster over ssh and run svc_dc list_dumps
You can also try to find dump files from PowerStore Manager. For more details see PowerStore: How to generate and collect various logs from PowerStore.
To log in to the nodes over ssh, find the cluster or node IP within PowerStore Manager under Settings > Network IPs. Log in with your preferred ssh client using the service user account and the corresponding service user password (defined during the setup of your system).
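If you want to script the dump check, the login and command above can be combined into one ssh invocation. This is a rough sketch under several assumptions: the service account name is taken to be service, the system is assumed to accept a command passed directly to ssh, and the helper names list_dumps_command and list_dumps are made up for this example. The address 192.0.2.10 is a documentation placeholder, not a real cluster IP.

```python
import subprocess


def list_dumps_command(host, user="service"):
    """Build the ssh argument list that runs svc_dc list_dumps on host.

    The default user name "service" is an assumption; use your own
    service account if it differs.
    """
    return ["ssh", f"{user}@{host}", "svc_dc list_dumps"]


def list_dumps(host, user="service"):
    """Run the command and return its output as text.

    ssh prompts for the service user password unless key-based
    authentication is configured.
    """
    result = subprocess.run(
        list_dumps_command(host, user),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling list_dumps("192.0.2.10") with your own cluster or node IP would return whatever svc_dc list_dumps prints. Running the command interactively, as described above, remains the simplest option.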
Checking the uptime on both nodes
Run the command uptime on both nodes. This shows how long each node has been up and helps confirm possible reboots.
This is also useful because some unexpected reboots may not produce a dump file.
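To compare the uptime against an alert timestamp, subtract the uptime from the current time to get the approximate boot time. The sketch below assumes you have already converted the uptime output to seconds by hand; the function names and the sample values are illustrative only.

```python
from datetime import datetime, timedelta


def boot_time(now, uptime_seconds):
    """Approximate boot time: current time minus how long the node has been up."""
    return now - timedelta(seconds=uptime_seconds)


def rebooted_since(now, uptime_seconds, reference):
    """True if the node booted after the reference time (e.g. the last known-good time)."""
    return boot_time(now, uptime_seconds) > reference


# Illustrative values, not taken from a real system.
now = datetime(2023, 8, 16, 12, 0, 0)
last_known_good = datetime(2023, 8, 16, 10, 0, 0)
node_a_uptime = 3600            # up for 1 hour
node_b_uptime = 30 * 86400      # up for 30 days

print(rebooted_since(now, node_a_uptime, last_known_good))
print(rebooted_since(now, node_b_uptime, last_known_good))
```

With these sample values, node A booted at 11:00, after the reference time, so it is flagged as rebooted. Node B is not.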
Other indicators
A gap in the Performance graphs in PowerStore Manager may also indicate a node reboot. Use this for guidance only, and confirm it with more evidence as described above. Performance graphs are available from either Dashboard > PERFORMANCE or Hardware > Appliance X > Performance.
Additional Info
What is needed for Root Cause Analysis (RCA)?
- Support Materials from all the appliances in the cluster. These should be gathered as close to the reboot as possible.
- Dump file
Affected Products
PowerStore
Article Properties
Article Number: 000130141
Article Type: Solution
Last Modified: 16 Aug 2023
Version: 14