We use SmartConnect on our OneFS 7.1 cluster to balance load across all 70 ESXi servers, with 24 NFS mounts on each server. When we boot an ESXi server, the mounts happen too fast and we receive only a few different addresses for the NFS mounts. It looks like when we ask SmartConnect for an address through a Windows 2k8 DNS delegation, we receive the same address if we ask for two addresses within the same second. TTL is set to 0.
We connected to the EMC training lab and it gives us the same problem at that site!
Does any workaround exist for this?
Which SC balancing policy are you using? Only round-robin guarantees to return
(at the Isilon level) different addresses, no matter how short the interval between queries.
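To illustrate why round-robin is the only policy with that guarantee, here is a minimal sketch (addresses are made up for the example) of a strict-rotation answer pool: every query advances the rotation, so even two queries in the same second get different addresses.

```python
from itertools import cycle

# Hypothetical SmartConnect zone pool -- these addresses are invented
# for illustration, not taken from any real cluster.
POOL = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

_rotation = cycle(POOL)

def round_robin_answer():
    """Return the next address in strict rotation: consecutive queries
    always get different answers, regardless of query timing."""
    return next(_rotation)

answers = [round_robin_answer() for _ in range(8)]
print(answers)
```

A connection-count or throughput policy, by contrast, only re-evaluates its statistics periodically, so a burst of queries between updates can all receive the same "least loaded" node.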
Do you have the chance to capture the network packets to and from the SC interface?
That would help check whether some caching is done outside the Isilon despite ttl=0.
The connection counts are updated only every 5 seconds, which is too slow here as I understand it.
Alain mentioned he was checking out responses within one second...
(DNS answers from SC are NOT counted -- actual mounts are, but outside SC.)
BTW, great new KB article here, very concise explanations and advice:
Thanks Peter, good KB indeed. We are still mapping Isilon to vSphere using individual node IP addresses (back in ESX 4.1, SmartConnect zones were not supported).
As far as I know, for Windows DNS, unlike BIND, a TTL of 0 doesn't work as expected.
The actual minimum TTL is effectively 1 second.
We experienced this problem at multiple sites that use Windows DNS.
So if multiple requests for the SmartConnect zone happen within 1 second, Windows DNS reuses and replies with the same IP to all clients.
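The effect described above can be modeled with a minimal sketch (a simplified model, not Windows DNS internals; addresses are invented): a resolver that treats TTL 0 as TTL 1 hands every query within the same wall-clock second the same cached upstream answer.

```python
from itertools import cycle

# Hypothetical upstream SmartConnect pool (invented addresses).
POOL = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
_upstream = cycle(POOL)

# One-entry cache modeling an effective minimum TTL of 1 second.
_cache = {"second": None, "answer": None}

def windows_dns_answer(now):
    """Model a resolver that rounds TTL 0 up to 1 second: the upstream
    answer is reused for every query arriving in the same second."""
    second = int(now)
    if _cache["second"] != second:
        _cache["second"] = second
        _cache["answer"] = next(_upstream)
    return _cache["answer"]

# Two queries inside the same second get the identical address...
a = windows_dns_answer(now=100.1)
b = windows_dns_answer(now=100.9)
# ...while a query in the next second gets a fresh one.
c = windows_dns_answer(now=101.0)
print(a, b, c)
```

This matches the boot symptom: 24 mounts fired in under a second all resolve to one or two addresses.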
I just found out that the solution to this is not to use a delegation record on the Windows DNS.
You just add normal A records with the same name and all the different IPs on the Windows DNS, and ensure that in the advanced DNS settings "Enable round robin" is turned on, which should be the default.
Done this way, the Windows DNS will hand back nicely distributed IPs, even for lots of concurrent requests.
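A minimal sketch of why the multiple-A-record approach distributes well (a simplified model of round-robin record rotation, with invented addresses): the server returns the full record set rotated by one on each query, and clients typically take the first entry, so first-answer IPs spread evenly even under a burst.

```python
from collections import Counter, deque

# Hypothetical A records sharing one name (invented addresses).
records = deque(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])

def query():
    """Model DNS round robin: return the whole record set, then rotate
    it by one so the next query leads with a different address."""
    answer = list(records)
    records.rotate(-1)
    return answer

# Simulate a burst of 8 back-to-back queries; clients use the first IP.
first_ips = [query()[0] for _ in range(8)]
print(Counter(first_ips))
```

Because the rotation happens per query rather than per cached second, even sub-second bursts spread across all addresses.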
So for special use cases like HPC, do it that way. In any other case, where 1-second caching is not a problem, using the delegation is of course best.