Dell Unity: NFS hosts utilizing Netgroup are unable to connect to NFS Share. (Dell Correctable)
Summary: NFS hosts are unable to connect to Unity NFS shares via a netgroup, although NFS works without issue when the Unity NAS Server IP address or hostname is used explicitly. When the same hosts attempt to connect to the Unity NFS share via a netgroup, the mount fails with "server not responding : RPC: Timed out" and access to the NFS share is lost ...
Symptoms
Netgroups were implemented after upgrading Unity to version 4.0.1.8404134.
The NFS host is unable to access Unity NFS shares via netgroups. The same NFS host can attach directly to the Unity NFS shares via the Unity NAS Server IP address or the Unity NAS Server hostname. However, if the same NFS host attempts to attach to the Unity NAS Server NFS share via a netgroup, access to the Unity NFS share is lost.
An Alert appears on the Unity Unisphere GUI, in the 'alerts' section: "The Network Information Service (NIS) configured for the NAS server was unable to provide user mapping information and is not responding. Check the availability of the NIS server, and ensure that the domain name and addresses used for the server are accurate."
A similar alert can be viewed via the Unity CLI:
21:08:58 root@(none) spa:/home/service> uemcli /event/alert/hist show
Storage system address: 127.0.0.1
Storage system port: 443
HTTPS connection
1: ID = alert_804
Time = 2016-09-26 21:07:59.088
Message = NAS server n125d061: There is no NIS server on-line for the domain nb-engr.
Description = "The Network Information Service (NIS) configured for the NAS server was unable to provide user mapping information and is not responding. Check the availability of the NIS server, and ensure that the domain name and addresses used for the server are accurate."
Severity = error
Acknowledged = no
2: ID = alert_803
Time = 2016-09-26 21:07:44.102
Message = NAS server n125d061: There is no NIS server on-line for the domain nb-engr.
Description = "The Network Information Service (NIS) configured for the NAS server was unable to provide user mapping information and is not responding. Check the availability of the NIS server, and ensure that the domain name and addresses used for the server are accurate."
Severity = error
Acknowledged = no
3: ID = alert_802
Time = 2016-09-26 21:07:29.174
Message = NAS server n125d061: There is no NIS server on-line for the domain nb-engr.
Description = "The Network Information Service (NIS) configured for the NAS server was unable to provide user mapping information and is not responding. Check the availability of the NIS server, and ensure that the domain name and addresses used for the server are accurate."
Severity = error
Acknowledged = no
A check of the RPC services on the Unity NAS Server indicates that rpc.mountd, rpc.nfsd, and rpc.portmapper are running:
root@spa:/cores/service> /sbin/rpcinfo -p 10.#.#.##
program vers proto port service
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100003 4 tcp 2049 nfs
100005 1 udp 1234 mountd
100005 2 udp 1234 mountd
100005 3 udp 1234 mountd
100005 1 tcp 1234 mountd
100005 2 tcp 1234 mountd
100005 3 tcp 1234 mountd
100003 3 tcp 2049 nfs
100021 4 tcp 4001 nlockmgr
100021 1 tcp 4001 nlockmgr
100021 2 tcp 4001 nlockmgr
100021 3 tcp 4001 nlockmgr
100024 1 tcp 4000 status
100003 3 udp 2049 nfs
100021 4 udp 4001 nlockmgr
100021 1 udp 4001 nlockmgr
100021 2 udp 4001 nlockmgr
100021 3 udp 4001 nlockmgr
100024 1 udp 4000 status
536870914 2 tcp 4658
536870914 2 udp 4658
824395111 1 udp 39850
824395111 1 tcp 50600
102660 1 tcp 37185
102660 1 udp 52008
Similarly, checking the status of the RPC services from the NFS host, pointing to the Unity NAS Server, also indicates that rpc.mountd, rpc.nfsd, and rpc.portmapper are running on the Unity NAS Server:
bash-2.03# rpcinfo -p 10.#.#.#0
program vers proto port service
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100003 4 tcp 2049 nfs
100005 1 udp 1234 mountd
100005 2 udp 1234 mountd
100005 3 udp 1234 mountd
100005 1 tcp 1234 mountd
100005 2 tcp 1234 mountd
100005 3 tcp 1234 mountd
100003 3 tcp 2049 nfs
100021 4 tcp 4001 nlockmgr
100021 1 tcp 4001 nlockmgr
100021 2 tcp 4001 nlockmgr
100021 3 tcp 4001 nlockmgr
100024 1 tcp 4000 status
100003 3 udp 2049 nfs
100021 4 udp 4001 nlockmgr
100021 1 udp 4001 nlockmgr
100021 2 udp 4001 nlockmgr
100021 3 udp 4001 nlockmgr
100024 1 udp 4000 status
536870914 2 tcp 4658
536870914 2 udp 4658
824395111 1 udp 39850
824395111 1 tcp 50600
102660 1 tcp 37185
102660 1 udp 52008
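The manual inspection of the rpcinfo listings above can be sketched as a small shell helper. This is a minimal sketch, not a Dell tool: the function name check_rpc_services is hypothetical, and the set of required service names (portmapper, nfs, mountd) is taken from the services this article checks. Note that a passing result only shows the services are registered; as this article demonstrates, netgroup mounts can still time out.

```shell
# check_rpc_services (hypothetical helper): reads `rpcinfo -p` output on stdin
# and reports whether portmapper, nfs, and mountd appear in the service column.
check_rpc_services() {
    awk '
        BEGIN { need["portmapper"]; need["nfs"]; need["mountd"] }
        # The fifth column of `rpcinfo -p` output is the service name.
        ($5 in need) { seen[$5] = 1 }
        END {
            missing = 0
            for (s in need) {
                if (!(s in seen)) { print "MISSING " s; missing = 1 }
            }
            if (!missing) print "OK"
            exit missing
        }
    '
}

# Usage (address is a placeholder, not from the original article):
#   rpcinfo -p <NAS server IP> | check_rpc_services
```

The helper exits with a non-zero status when any of the three services is absent, so it can be used in a script condition.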
The NFS host can successfully mount the Unity NAS Server NFS share via the Unity IP address and via the Unity NAS Server hostname, but after the user attempts to connect the NFS host to the Unity NFS share via a netgroup, the mount fails:
bash-2.03# showmount -e 10.#.#.#0
export list for 10.#.#.#0:
/steventestfs (everyone)
stevetestshare (everyone)
/tejasshare n#-e###-a##,10.#.#.##1/255.255.255.255,10.#.#.##/255.255.255.255
bash-2.03# mount 10.#.#.#0:/tejasshare /mwtest/
nfs mount: 10.#.#.#0:/tejasshare: server not responding : RPC: Timed out
nfs mount: retrying: /mwtest
nfs mount: 10.#.#.#0:/tejasshare: server not responding : RPC: Timed out
Cause
Unity is dropping legal YP MATCH responses in the form of "host.byaddr". Unity uses a strict internal firewall that drops these network packets. The Unity NAS Server container does not allow datagrams sent by the NIS server when the NIS server uses a source port that differs from the port value it returns for 'PORTMAP' requests.
Unity's default internal firewall policy is to DROP if no rule matches, so traffic on any random port must match the "-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT" rule to be accepted.
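The behavior described above can be illustrated with a deliberately simplified model. This is only a sketch of the idea, not Unity's firewall code: the function name firewall_verdict and the port numbers in the usage note are hypothetical, and real connection tracking compares the full address and port tuple rather than a single port.

```shell
# firewall_verdict EXPECTED_PORT REPLY_SRC_PORT (hypothetical, simplified model)
# EXPECTED_PORT:  the port the NIS server advertised via PORTMAP, i.e. the
#                 port the request was sent to.
# REPLY_SRC_PORT: the source port the NIS server actually used for its reply.
firewall_verdict() {
    if [ "$1" = "$2" ]; then
        # The reply belongs to the tracked flow, so the
        # RELATED,ESTABLISHED rule accepts it.
        echo "ACCEPT"
    else
        # No rule matches the unexpected source port, so the
        # default DROP policy applies.
        echo "DROP"
    fi
}
```

For example, under this model `firewall_verdict 834 834` prints ACCEPT, while `firewall_verdict 834 40213` prints DROP, which corresponds to the dropped YP MATCH responses.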
Resolution
A Unity software fix is currently being developed and is tracked under AR number 858778. Contact Dell-EMC Technical Support for an update on the code fix.
In the meantime, a workaround is available. The workaround requires a change to the iptables rules.
Connect to the Unity and inject root.
Add a rule on the customer system to accept any UDP packet from the NIS servers. After the rule is added, NIS should work. This is a workaround, not a permanent solution; check AR number 858778 for the software fix.
To add: iptables -A IN_DATA -p udp -s <ip address of the NIS server> -j ACCEPT
Run the above command for each NIS server IP address and confirm that netgroups work. The rule can then be deleted with the following command.
To delete: iptables -D IN_DATA -p udp -s <ip address of the NIS server> -j ACCEPT
Example:
root@spa:/cores/service> iptables -S IN_DATA
-N IN_DATA
-A IN_DATA -p tcp -m tcp --dport 445 -j ACCEPT
....
-A IN_DATA -i eve_br0 -p udp -m multiport --dports 22 -j ACCEPT
-A IN_DATA -s 10.#.#.##/32 -p udp -j ACCEPT
-A IN_DATA -s 10.#.#.##/32 -p udp -j ACCEPT
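Because the rule must be repeated for each NIS server, the add and delete commands above can be generated in a loop. The following is a minimal sketch under stated assumptions: the function name nis_rule_cmds is hypothetical, and the addresses in the usage note are placeholders from a documentation range, not values from this article. The function only prints the commands so they can be reviewed before being run as root.

```shell
# nis_rule_cmds ACTION IP... (hypothetical helper)
# Prints, without executing, one iptables command per NIS server address.
# ACTION is "add" (iptables -A) or "delete" (iptables -D).
nis_rule_cmds() {
    case "$1" in
        add)    flag="-A" ;;
        delete) flag="-D" ;;
        *)      echo "usage: nis_rule_cmds add|delete IP..." >&2; return 1 ;;
    esac
    shift
    for ip in "$@"; do
        # Same rule as in the workaround: accept UDP from this NIS server.
        echo "iptables $flag IN_DATA -p udp -s $ip -j ACCEPT"
    done
}

# Usage with placeholder addresses:
#   nis_rule_cmds add 192.0.2.10 192.0.2.11
#   nis_rule_cmds delete 192.0.2.10 192.0.2.11
```

Printing rather than executing keeps the sketch safe to try on any host; the reviewed output can then be pasted into the root session on the Unity.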