Article Number: 530469


OneFS: In rare situations, Mellanox 40G and InfiniBand cards may stop responding to commands

Primary Product: Isilon OneFS 8.1

Product: Isilon OneFS 8.0

Last Published: 02 Sep 2020

Article Type: Break Fix

Publish Status: Online

Version: 7

Article Content

Issue


An internal error in the Mellanox 40Gb or InfiniBand card may cause the card to fail. When the failure occurs, the interface no longer responds to commands such as ifconfig or pciconf. In addition, if the card is configured for an external network, FlexNet and SmartConnect are unable to assign IP addresses to the interface.
 
Signatures of the failure appear in the messages file and include the following entries:

Errors indicating that the driver can no longer post commands:
Note the driver instance number, mlx4_core1:
mlx4_core1: mlx4_cmd_post:cmd_pending failed

... or ...
 
An indication that an internal error was detected:
Note the driver instance number, mlx4_core1:
2018-12-26T16:31:34-08:00 <0.7> isilon-1 /boot/kernel.amd64/kernel: mlx4_core1: Internal error detected:
2018-12-26T16:31:34-08:00 <0.7> isilon-1 /boot/kernel.amd64/kernel: mlx4_core1:   buf[00]: ffffffff
.
.
2018-12-26T16:31:34-08:00 <0.3> isilon-1 /boot/kernel.amd64/kernel: mlx4_en mlx4_core1: Internal error detected, restarting device
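
The two signatures above can be searched for mechanically. The following is a minimal Python sketch, not a Dell-provided tool: the function name find_mlx4_failures and the sample lines are illustrative assumptions, and the patterns are derived only from the log excerpts shown in this article.

```python
import re

# Patterns derived from the log excerpts in this article (assumption: other
# OneFS releases may word these messages differently).
# The capture group is the driver instance, for example "mlx4_core1".
SIGNATURES = [
    re.compile(r"(mlx4_core\d+): mlx4_cmd_post:cmd_pending failed"),
    re.compile(r"(mlx4_core\d+): Internal error detected"),
]


def find_mlx4_failures(lines):
    """Return a dict mapping each driver instance to its matching log lines."""
    hits = {}
    for line in lines:
        for pattern in SIGNATURES:
            match = pattern.search(line)
            if match:
                hits.setdefault(match.group(1), []).append(line.rstrip("\n"))
                break  # count each line once, even if both patterns match
    return hits


# Illustrative input: lines copied from the excerpts above.
sample_lines = [
    "mlx4_core1: mlx4_cmd_post:cmd_pending failed",
    "2018-12-26T16:31:34-08:00 <0.7> isilon-1 /boot/kernel.amd64/kernel: mlx4_core1: Internal error detected:",
    "2018-12-26T16:31:34-08:00 <0.7> isilon-1 /boot/kernel.amd64/kernel: mlx4_core1:   buf[00]: ffffffff",
    "2018-12-26T16:31:34-08:00 <0.3> isilon-1 /boot/kernel.amd64/kernel: mlx4_en mlx4_core1: Internal error detected, restarting device",
]

for driver, matched in find_mlx4_failures(sample_lines).items():
    print(driver, len(matched))
```

To scan a real log, pass an open file object for the node's messages file instead of sample_lines; the file location is not given in this article and would need to be confirmed on the node.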
Cause
This occurs when Cisco BiDi QSFP+ optics are used with this card. The optic can produce up to 3.5 W of power, while the NIC can accept a maximum of only 1.5 W. Because this difference is more than the input rail can handle, the NIC stops functioning, which causes the node to panic.
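
As a quick worked check of the figures above, the following Python sketch computes how far an optic's maximum power exceeds the NIC limit. The function name power_excess_watts is a hypothetical helper introduced for illustration; only the 3.5 W and 1.5 W values come from this article.

```python
def power_excess_watts(optic_max_w, nic_max_w):
    """Return how many watts the optic maximum exceeds the NIC limit (0.0 if within it)."""
    return max(0.0, optic_max_w - nic_max_w)


# Values from this article: BiDi optic up to 3.5 W, NIC limit 1.5 W.
excess = power_excess_watts(3.5, 1.5)
print(f"Excess over NIC limit: {excess:.1f} W")
```

Any optic for which this value is greater than zero exceeds the limit described in this article.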
Resolution
Shut down the node and replace the NIC.

In addition, do not use BiDi optical cables with this NIC, because the power requirements differ. A new NIC with an updated fuse is now available to order. Contact Dell Isilon Technical Support for details.

Another workaround is to use a different cable, as listed in the compatibility guide:
https://support.emc.com/docu44518_Isilon-Supportability-and-Compatibility-Guide.pdf?language=en_US
Notes

Article Properties

First Published

Sat Mar 16 2019 15:19:06 GMT
