Connectrix: Cisco: Ports go into an immediate Error disabled state
Summary: This KB article describes an issue where Connectrix: Cisco ports immediately go into the Error disabled state.
Symptoms
Multiple port flaps, which in certain cases cause high Ternary Content Addressable Memory (TCAM) usage.
Ports transition immediately to the Error disabled state when running no shutdown.
If there is device login, the port goes into the Error disabled state:
edge-(config-if)# no shutdown
fc4/29: (error) Failed to Write to TCAM
Access Control List (ACL) programming fails, leading to zone activation failure or device login failure.
The following syslog messages are seen:
%ZONE-2-ZS_TCAM_SWITCHED_TO_SOFT_ZONING_MODE: %$VSAN 12%$ Switched to soft zoning : Reason: Hard zoning disabled
Review the TCAM entries:
show system internal acl tcam-usage
TCAM Entries:
=============
                Region1     Region2    Region3       Region4    Region5    Region6
Mod Fwd Dir     TOP SYS     SECURITY   ZONING        BOTTOM     FCC DIS    FCC ENA
    Eng         Use/Total   Use/Total  Use/Total     Use/Total  Use/Total  Use/Total
--- --- ------  ----------  ---------  ------------  ---------  ---------  ---------
4   3   INPUT   239/19664   0/9840     35004/49136*  87/19664   0/0        0/0
The port event history then shows this error:
show port internal event-history errors
12) Event:E_DEBUG, length:119, at 424487 usecs after Wed Oct 5 00:50:56 2016
[102] pi_fsm_port_attr_change_init: Ifindex (fc4/26)0x1199000, Err disabled event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23
The same error is logged for other ports:
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/32)0x119f000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/27)0x119a000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/29)0x119c000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
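When many ports are affected, it can help to pull the interface names out of a saved copy of the event history instead of reading it line by line. The following is a minimal Python sketch, not a vendor tool: the function name acl_failure_ports and the regular expression are illustrative assumptions, and the sample text is copied from the output shown in this article.

```python
import re

# Sample lines copied from the event-history output in this article.
EVENT_HISTORY = """\
[102] pi_fsm_port_attr_change_init: Ifindex (fc4/26)0x1199000, Err disabled event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/32)0x119f000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/27)0x119a000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
[102] pi_fsm_ac_cfg_wait_resp_rcvd: Ifindex (fc4/29)0x119c000, Err disabled (0x404e0005) event (PI_FSM_EV_ACL_MODE_PROG_FAILURE)0x23 due to acl mode failure
"""

# Assumed line shape: "Ifindex (fcX/Y)..." followed later by the failure event name.
PATTERN = re.compile(r"Ifindex \((fc\d+/\d+)\).*PI_FSM_EV_ACL_MODE_PROG_FAILURE")


def acl_failure_ports(text):
    """Return interfaces, in first-seen order, that logged an ACL mode programming failure."""
    ports = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match and match.group(1) not in ports:
            ports.append(match.group(1))
    return ports


if __name__ == "__main__":
    print(acl_failure_ports(EVENT_HISTORY))
```

For the sample above, the function returns fc4/26, fc4/32, fc4/27, and fc4/29.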
Cause
Cisco bug CSCuw15428 FC Scale: Port error disabled due to TCAM rewrite fail
ACL programming fails because TCAM usage has almost reached the limit, and the switch reports that no space is available:
show system internal acl tcam-usage
1   1   INPUT   2506/13114  0/3290     65496/68786*  16/13114   0/0        0/0
1   1   OUTPUT  7/4077      0/1645     0/11469       0/4077     6/1651     9/1654
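To judge how close each region is to its limit, the Use/Total pairs can be converted to percentages. The sketch below is illustrative only: it assumes the six data columns map, in order, to the regions TOP SYS, SECURITY, ZONING, BOTTOM, FCC DIS, and FCC ENA from the command header, and the 90% threshold is an arbitrary value chosen for the example, not a documented limit. The function names are hypothetical.

```python
import re

# Rows copied from the tcam-usage output in this article.
SAMPLE = """\
1 1 INPUT 2506/13114 0/3290 65496/68786* 16/13114 0/0 0/0
1 1 OUTPUT 7/4077 0/1645 0/11469 0/4077 6/1651 9/1654
"""

# Assumed column order, taken from the command's header line.
REGIONS = ["TOP SYS", "SECURITY", "ZONING", "BOTTOM", "FCC DIS", "FCC ENA"]
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(INPUT|OUTPUT)\s+(.*)$")


def parse_tcam_usage(text):
    """Parse data rows into dicts; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        usage = {}
        for name, cell in zip(REGIONS, match.group(4).split()):
            used, total = cell.rstrip("*").split("/")
            usage[name] = (int(used), int(total))
        rows.append({
            "module": int(match.group(1)),
            "fwd_eng": int(match.group(2)),
            "direction": match.group(3),
            "usage": usage,
        })
    return rows


def flag_regions(rows, threshold=0.90):
    """List regions whose used/total ratio meets the (hypothetical) threshold."""
    findings = []
    for row in rows:
        for name, (used, total) in row["usage"].items():
            if total > 0 and used / total >= threshold:
                findings.append((row["module"], row["fwd_eng"], row["direction"],
                                 name, f"{used / total:.0%} used"))
    return findings


if __name__ == "__main__":
    for finding in flag_regions(parse_tcam_usage(SAMPLE)):
        print(finding)
```

With the sample rows, only the ZONING region on module 1, forwarding engine 1, INPUT is flagged, at about 95% of its capacity.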
The high value is due to duplicate entries being programmed:
module-1# sh process acltcam fwd-engine 1 input match interface fc 1/15 | inc d80601
00442 44 0 136 d80601 d80601 0 12 - - - - - - - - - | 0 0 0 0 0 0
02802 44 0 136 d80601 d80601 0 12 - - - - - - - - - | 0 0 0 0 0 0
08282 bb 0 136 d80601 6e028d 0 12 - - - - - - - - - | 0 0 0 0 1 0
0854a bb 0 136 d80601 6e028d 0 12 - - - - - - - - - | 0 0 0 0 1 0
08562 bb 0 136 d80601 6e028b 0 12 - - - - - - - - - | 0 0 0 0 1 0
08563 bb 0 136 d80601 6e028b 0 12 - - - - - - - - - | 0 0 0 0 1 0
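Duplicates in this output can be spotted by grouping rows whose contents match apart from the first column. The sketch below assumes that the first field is the entry's TCAM location and that the remaining fields are the entry contents; the article does not document the column meanings, so treat that as an assumption. The function name find_duplicates is hypothetical.

```python
from collections import defaultdict

# Rows copied from the acltcam output in this article.
ACLTCAM_OUTPUT = """\
00442 44 0 136 d80601 d80601 0 12 - - - - - - - - - | 0 0 0 0 0 0
02802 44 0 136 d80601 d80601 0 12 - - - - - - - - - | 0 0 0 0 0 0
08282 bb 0 136 d80601 6e028d 0 12 - - - - - - - - - | 0 0 0 0 1 0
0854a bb 0 136 d80601 6e028d 0 12 - - - - - - - - - | 0 0 0 0 1 0
08562 bb 0 136 d80601 6e028b 0 12 - - - - - - - - - | 0 0 0 0 1 0
08563 bb 0 136 d80601 6e028b 0 12 - - - - - - - - - | 0 0 0 0 1 0
"""


def find_duplicates(text):
    """Group rows by everything after the first field; return groups with more than one row."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        location, contents = fields[0], tuple(fields[1:])
        groups[contents].append(location)
    return [locations for locations in groups.values() if len(locations) > 1]


if __name__ == "__main__":
    for locations in find_duplicates(ACLTCAM_OUTPUT):
        print("duplicate entry at:", ", ".join(locations))
```

For the sample above, every entry appears twice, giving three duplicate pairs: 00442 and 02802, 08282 and 0854a, 08562 and 08563.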
Sometimes, the tcam-usage value goes negative:
show system internal acl tcam-usage
TCAM Entries:
=============
                Region1     Region2    Region3       Region4    Region5    Region6
Mod Fwd Dir     TOP SYS     SECURITY   ZONING        BOTTOM     FCC DIS    FCC ENA
    Eng         Use/Total   Use/Total  Use/Total     Use/Total  Use/Total  Use/Total
--- --- ------  ----------  ---------  ------------  ---------  ---------  ---------
3   3   INPUT   192/9824    0/0        -1126/-1126*  131/9824   0/0        0/0
In version 7.3.0 and later releases, an unexpectedly large value is displayed instead of a negative value:
1 5 INPUT 598/9824 0/0 4294964129/4294964129* 523/9824 0/0 0/0
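One plausible reading, which the article does not confirm, is that the large number is the same negative counter printed as an unsigned 32-bit integer. Under that assumption, the value can be converted back to its signed form as sketched below; the helper name as_signed32 is hypothetical.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer (two's complement)."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x100000000 if value >= 0x80000000 else value


if __name__ == "__main__":
    # ZONING counter from the 7.3.0 example row in this article.
    print(as_signed32(4294964129))  # -3167
    # Ordinary in-range counters are returned unchanged.
    print(as_signed32(9824))        # 9824
```

If the assumption holds, a ZONING counter of 4294964129 corresponds to -3167, which is the same symptom as the negative values seen on earlier releases.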
Resolution
Fix:
Upgrade to NX-OS 6.2(17).
Workaround:
Reload the line card in question.
Additional Information
In the instance identified, the switch was a 9710 running NX-OS 6.2(11c), and the line card in question was a DS-X9448-768K9 (2/4/8/10/16 Gbps Advanced FC Module).