Isilon: SmartConnect Service IP address (SSIP) goes missing on all configured VLAN interfaces after committing a OneFS 8.2 upgrade

Summary: This article describes an issue in which the SSIP goes missing from every interface configured with VLAN tagging once an upgrade to OneFS 8.2 is committed.

Symptoms

Upon committing an upgrade to OneFS 8.2, the SmartConnect Service IP address (SSIP) is no longer available for zone name queries and load balancing on any interface configured with VLAN tagging.
NOTE: This issue does not apply to subnets that have an SSIP configured but VLAN tagging disabled. It also does NOT apply to any other upgrade type or target version; it applies only when upgrading to 8.2.
The following internal reproduction shows how the issue is triggered and which conditions are required.
>> Before the upgrade, the SSIP exists on all interfaces (169.168.1.9, 169.168.10.9, 169.168.20.9):
MN-X410-CLUS-1# ifconfig
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO>
       ether 00:0a:f7:
       inet 169.168.1.20 netmask 0xffffff00 broadcast 169.168.1.255 zone 1
       inet 169.168.1.9 netmask 0xffffff00 broadcast 169.168.1.255 zone 1
       nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
.
.
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet6 fe80::20a:f7: %vlan0 prefixlen 64 scopeid 0x8 zone 1
       inet 169.168.10.20 netmask 0xffffff00 broadcast 169.168.10.255 zone 1
       inet 169.168.10.9 netmask 0xffffff00 broadcast 169.168.10.255 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 10 parent interface: bxe0
vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet6 fe80::20a:f7: %vlan1 prefixlen 64 scopeid 0x9 zone 1
       inet 169.168.20.9 netmask 0xffffff00 broadcast 169.168.20.255 zone 1
       inet 169.168.20.20 netmask 0xffffff00 broadcast 169.168.20.255 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 20 parent interface: bxe0        
>> Began the upgrade to 8.2.0.0; the node now runs 8.2.0.0 and the SSIP is still present on all interfaces:
MN-X410-CLUS-1# uname -a
Isilon OneFS MN-X410-CLUS-1 v8.2.0.0 Isilon OneFS v8.2.0.0 B_8_2_0_0_011(RELEASE): 0x80200500000000B:Thu Jun 20 10:29:21 PDT 2019
    root@sea-build11-01:/b/mnt/obj/b/mnt/src/amd64.amd64/sys/IQ.amd64.release   FreeBSD clang version 3.9.1
(tags/RELEASE_391/final 289601) (based on LLVM 3.9.1) amd64


MN-X410-CLUS-1# ifconfig
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
       ether 00:0a:f7:
       inet 169.168.1.20 netmask 0xffffff00 broadcast 169.168.1.255 zone 1
       inet 169.168.1.9 netmask 0xffffff00 broadcast 169.168.1.255 zone 0
       nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
.
.
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.10.20 netmask 0xffffff00 broadcast 169.168.10.255 zone 1
       inet 169.168.10.9 netmask 0xffffff00 broadcast 169.168.10.255 zone 0
       inet6 fe80::20a:f7: %vlan0 prefixlen 64 scopeid 0x8 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 10 vlanpcp: 0 parent interface: bxe0
       groups: vlan
vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.20.20 netmask 0xffffff00 broadcast 169.168.20.255 zone 1
       inet 169.168.20.9 netmask 0xffffff00 broadcast 169.168.20.255 zone 0
       inet6 fe80::20a:f7: %vlan1 prefixlen 64 scopeid 0x9 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 20 vlanpcp: 0 parent interface: bxe0
       groups: vlan
>> Verified that the VLANs and SSIPs are created and assigned after reboot:
MN-X410-CLUS-1# uname -a
Isilon OneFS MN-X410-CLUS-1 v8.2.0.0 Isilon OneFS v8.2.0.0 B_8_2_0_0_011(RELEASE): 0x80200500000000B:Thu Jun 20 10:29:21 PDT 2019
    root@sea-build11-01:/b/mnt/obj/b/mnt/src/amd64.amd64/sys/IQ.amd64.release   FreeBSD clang version 3.9.1
(tags/RELEASE_391/final 289601) (based on LLVM 3.9.1) amd64


MN-X410-CLUS-1# ifconfig
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
       ether 00:0a:f7:
       inet 169.168.1.20 netmask 0xffffff00 broadcast 169.168.1.255 zone 1
       inet 169.168.1.9 netmask 0xffffff00 broadcast 169.168.1.255 zone 0
       nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
.
.
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.10.20 netmask 0xffffff00 broadcast 169.168.10.255 zone 1
       inet 169.168.10.9 netmask 0xffffff00 broadcast 169.168.10.255 zone 0
       inet6 fe80::20a:f7: %vlan0 prefixlen 64 scopeid 0x8 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 10 vlanpcp: 0 parent interface: bxe0
       groups: vlan
vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.20.20 netmask 0xffffff00 broadcast 169.168.20.255 zone 1
       inet 169.168.20.9 netmask 0xffffff00 broadcast 169.168.20.255 zone 0
       inet6 fe80::20a:f7: %vlan1 prefixlen 64 scopeid 0x9 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 20 vlanpcp: 0 parent interface: bxe0
       groups: vlan
>> Upgrade is in pre-commit state:
MN-X410-CLUS-1# isi upgrade view

Upgrade Status:

Current Upgrade Activity: OneFS upgrade
   Cluster Upgrade State: Ready to commit
   Upgrade Process State: Running
      Upgrade Start Time: 2019-07-13T12:38:25
      Current OS Version: 8.0.0.6_build(117)style(5)
      Upgrade OS Version: 8.2.0.0_build(11)style(5)
        Percent Complete: 100%

Nodes Progress:

     Total Cluster Nodes: 1
       Nodes On Older OS: 0
          Nodes Upgraded: 1
Nodes Transitioning/Down: 0

LNN  Progress  Version  Status 
--------------------------------
1    100%      8.2.0.0  upgraded
>> Committed the upgrade:
MN-X410-CLUS-1# isi upgrade commit
You are about to COMMIT an upgrade, it CANNOT be rolled back after this, are you sure? (yes/[no]): yes
>> The SSIP is now missing on all VLAN interfaces (vlan0, vlan1); the non-VLAN interface (bxe0) is NOT affected:
MN-X410-CLUS-1# ifconfig
bxe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
       ether 00:0a:f7:
       inet 169.168.1.20 netmask 0xffffff00 broadcast 169.168.1.255 zone 1
       inet 169.168.1.9 netmask 0xffffff00 broadcast 169.168.1.255 zone 0
       nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
.
.
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.10.20 netmask 0xffffff00 broadcast 169.168.10.255 zone 1
       inet6 fe80::20a:f7: %vlan0 prefixlen 64 scopeid 0x8 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 10 vlanpcp: 0 parent interface: bxe0
       groups: vlan
vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
       options=303<RXCSUM,TXCSUM,TSO4,TSO6>
       ether 00:0a:f7:
       inet 169.168.20.20 netmask 0xffffff00 broadcast 169.168.20.255 zone 1
       inet6 fe80::20a:f7: %vlan1 prefixlen 64 scopeid 0x9 zone 1
       nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
       media: Ethernet autoselect (10Gbase-SR <full-duplex>)
       status: active
       vlan: 20 vlanpcp: 0 parent interface: bxe0
       groups: vlan
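
The before/after comparison above can also be checked mechanically. The following is a minimal sketch, assuming it is fed the plain `ifconfig` output of a single node (not `isi_for_array` output, which may prefix each line with a node name). The function name `ssip_interfaces` is hypothetical and is not a OneFS tool. It prints the name of every interface that currently holds a given address, so an SSIP that has dropped off a VLAN interface produces no output.

```shell
# Hypothetical helper (not a OneFS tool): read `ifconfig` output on stdin and
# print the name of every interface that carries the IPv4 address given as $1.
ssip_interfaces() {
    awk -v ip="$1" '
        # Interface header, e.g. "vlan0: flags=8843<...>": remember the name.
        /^[A-Za-z0-9_.]+: flags=/ { iface = $1; sub(/:$/, "", iface); next }
        # Address line, e.g. "inet 169.168.10.9 netmask ...": exact match on $2.
        $1 == "inet" && $2 == ip { print iface }
    '
}
```

For example, `ifconfig | ssip_interfaces 169.168.10.9` would print `vlan0` for the pre-commit output shown above and nothing for the post-commit output.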

Cause

Upon upgrading to 8.2, flexnet's configuration file (flx_config.xml) is broken up into smaller, more narrowly scoped pieces. A new folder called nodeinfo is created, in which each node's interface and status information is stored. However, during the upgrade, VLAN information is not fetched, so the VLAN configuration is left out of each node's nodeinfo file. Once the upgrade is committed, SmartConnect attempts to read from the nodeinfo files and cannot find the VLAN details it needs to assign the SSIP.
With debug logging enabled for isi_smartconnect_d, the following errors appear in the logs:
2019-07-15-T12:46:12:DEBUG:0x80c612010:NodeInterfaceGetVlanNic_inlock():nodeinfo.c:1281: Error STATUS_NOT_FOUND (0xc0000225)
2019-07-15-T12:46:12:DEBUG:0x80c612010:NodeInterfaceIsStatus():nodeinfo.c:1385: Error STATUS_NOT_FOUND (0xc0000225)
2019-07-15-T12:46:12:DEBUG:0x80c612010:VIPLoadInterface():vip_coord.c:480: Error STATUS_NETWORK_UNREACHABLE (0xc000023c)
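
To check a captured log for these signatures, a small filter can help. This is a minimal sketch under stated assumptions: the log file location is not given in this article, so the function reads log lines from standard input, and the name `count_ssip_vlan_errors` is hypothetical. It counts only the VLAN lookup failure and the resulting VIP load failure quoted above.

```shell
# Hypothetical helper (not a OneFS tool): read SmartConnect debug log lines on
# stdin and print how many match the two error signatures quoted above.
count_ssip_vlan_errors() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that case
    # from looking like a failure to the caller.
    grep -c -E 'NodeInterfaceGetVlanNic_inlock\(\).*STATUS_NOT_FOUND|VIPLoadInterface\(\).*STATUS_NETWORK_UNREACHABLE' || true
}
```

A non-zero count on a cluster that was just committed to 8.2.0 is consistent with this issue, but it is not proof on its own; confirm with the `ifconfig` comparison as well.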

Resolution

This issue was observed only on upgrades to 8.2.0. It is fixed in 8.2.1.0 and later, so upgrades to releases later than 8.2.0 are not affected. If you suspect you are affected by this issue, contact Support and reference this KB before continuing with the steps below.
  1. To resolve the issue, the flx_config.xml and nodeinfo configurations must be forced to change. This can be triggered by enabling and then disabling SmartConnect debug logging.
To enable debug logging:
# isi_sc_log_level -l debug
To return logging to the info level:
# isi_sc_log_level -l info
Verify that the SSIP has returned and that the processes are running:
# isi_for_array ifconfig | grep <SSIP>
# isi_for_array -s ps auwx | egrep "(smartconnect|dnsiq)" | grep -v grep
NOTE: Restarting the daemons (isi_dnsiq_d, isi_smartconnect_d, and isi_flexnet_d) has not resolved this issue.
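
The log-level toggle above can be wrapped so the sequence is reviewed before anything is run. This is a minimal sketch, not a supported procedure: `ssip_step`, `ssip_workaround`, and the `SSIP_RUN` variable are hypothetical names, and only the two `isi_sc_log_level` invocations come from this article. By default the wrapper prints the commands instead of executing them.

```shell
# Hypothetical wrapper (not a OneFS tool). Dry-run by default: each command is
# printed with a "+ " prefix. Set SSIP_RUN=1 to actually execute the commands.
ssip_step() {
    if [ "${SSIP_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

# Toggle SmartConnect logging to debug and back to info, in that order,
# which is the change the article describes as triggering the rewrite.
ssip_workaround() {
    ssip_step isi_sc_log_level -l debug
    ssip_step isi_sc_log_level -l info
}
```

Calling `ssip_workaround` with no environment change only lists the two commands. After running it for real on a node, use the verification commands listed above to confirm that the SSIP has returned.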

Products

Isilon Gen6, Isilon HD400, Isilon NL410, Isilon X210, Isilon X410
Article Properties
Article Number: 000168627
Article Type: Solution
Last Modified: 14 Dec 2023
Version: 4