March 27th, 2015 07:00

The suggested best practice is now to have only a single Dynamic SmartConnect Pool for NameNode and DataNode, not separate pools as in earlier documents.

https://www.emc.com/collateral/white-papers/h13926-wp-emc-isilon-hadoop-best-practices-onefs72.pdf

You would use a single pool per Access Zone as needed. Racks can then also be used within pools to control client IP access to nodes.

Everything Big Data at EMC

254 Posts

March 26th, 2015 13:00

IP/SmartConnect Zones exist to help balance the clients across the nodes. With Isilon, all nodes are potential name nodes as well as data nodes. So when clients connect to the name node through a SmartConnect name, each client gets sent to a different node (typically in a round-robin fashion).

Of course, the client expects to get 3 HDFS locations for the data being requested. The client can choose any of the addresses but, in practice, clients start at the first address and move down the list only if necessary. So the name node service on the Isilon acts similarly to SmartConnect in that it returns a rotating list of addresses, each on a different node, again with the idea of balancing the clients. If you had a 5-node cluster, the first name node request might return addresses on nodes 1,2,3. The next one could return 2,3,4, then 3,4,5, then 4,5,1, and so on.

The idea is to keep the clients balanced among the nodes in the cluster. Any Isilon node can access any of the data, so with respect to data access it doesn't really matter to OneFS which node the client connects to; the system just tries to balance the front-end workload as best it can.
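To make the rotation concrete, here is a minimal Python sketch of the 5-node example. It is only an illustration of the pattern described, not how OneFS actually implements it; the function name `rotating_locations` and the `replicas` parameter are made up for this example.

```python
from itertools import count


def rotating_locations(nodes, replicas=3):
    """Yield successive address lists, each starting one node further along.

    Hypothetical helper: models the rotating list described in the post,
    not the actual OneFS name node service.
    """
    n = len(nodes)
    for start in count():
        # Take `replicas` consecutive nodes, wrapping around the cluster.
        yield [nodes[(start + i) % n] for i in range(replicas)]


nodes = [1, 2, 3, 4, 5]
gen = rotating_locations(nodes)
responses = [next(gen) for _ in range(5)]

for r in responses:
    print(r)
# [1, 2, 3]
# [2, 3, 4]
# [3, 4, 5]
# [4, 5, 1]
# [5, 1, 2]
```

Since clients usually start with the first address in the list, five consecutive requests land on five different nodes, which is the balancing effect the rotation is after.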

Hope this helps.
