Note: This topic is part of the Using Hadoop with OneFS - PowerScale Info Hub.
In part 1 of this blog, SmartConnect, Network Pools and HDFS Racks for Hadoop Part 1, we looked at how virtual HDFS racks are defined. A NameNode request is made to a SmartConnect IP pool, and OneFS, if configured to use racks, responds with a DataNode address from the IP pool that matches the client's source IP.
Initially, racks were all about creating location-aware DataNode connections, but with older releases of OneFS it also became useful to split NameNode and DataNode connections into separate pools, which in effect gave better load balancing of DataNode connections even without location awareness. This was accomplished by using racks.
A default rack DataNode pool was created to separate NameNode connections from DataNode connections, with all source IPs in the rack and all PowerScale nodes in the DataNode pool. Separating the NameNode and DataNode connection pools in this way balanced connections more effectively and maximized throughput.
The NameNode pool would be assigned a static IP allocation method to limit IP usage, with no IP failover if interfaces became unavailable. NameNode connections are short lived, so it made no sense to fail over the IP addresses. This IP pool had an assigned SmartConnect name for the Hadoop compute cluster to connect to as fs.defaultFS.
The DataNode pool would be assigned a dynamic IP allocation method, with enough IPs assigned to the pool to ensure equal failover, similar to NFS. No SmartConnect name was required on this pool.
OneFS 8.0.1.x introduced two new HDFS features that change the way OneFS handles HDFS connections, and these changes affect the recommended approaches to IP pool configuration for Hadoop clusters.
The first feature is DataNode load balancing. Previously, the HDFS service on OneFS either used a round-robin scheme to provide an IP address within a SmartConnect zone for each new HDFS client to connect to, or it leveraged racks. This typically worked reasonably well, but in a highly loaded Hadoop cluster there was a chance that new HDFS clients would be connected to a node already heavily loaded with DataNode connections, potentially creating connectivity skew. In 8.0.1, OneFS ensures that a new HDFS client is always given an IP address on the node with the lowest total TCP connection count, giving the client the best chance of using the least loaded node and getting the best performance and throughput. In effect, this removes the requirement for racks to achieve better load balancing when location awareness is not required.
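The difference between the two selection schemes can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the OneFS implementation: the IP addresses, the connection counts, and the function names are all hypothetical.

```python
from itertools import cycle

# Hypothetical snapshot: pool IP -> current TCP connection count on the owning node.
connections = {"10.0.0.11": 42, "10.0.0.12": 7, "10.0.0.13": 19}

def round_robin(ips):
    """Older behavior: hand out IPs in a fixed rotation, ignoring node load."""
    return cycle(sorted(ips))

def least_connections(conn_counts):
    """8.0.1-style behavior: pick the IP whose node has the fewest TCP connections."""
    return min(conn_counts, key=conn_counts.get)

rr = round_robin(connections)
first_rr = next(rr)                    # first IP in the rotation, here the busiest node
best = least_connections(connections)  # IP on the least loaded node
connections[best] += 1                 # the new client now counts against that node
```

With round robin the first client in this snapshot lands on the node that already carries 42 connections, while the least-connections choice sends it to the node carrying 7.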
The second feature is Pipeline Recovery. With HDFS DataNodes, the client can now recover from node or network failures. This feature allows Hadoop clients to continue writing their data to a different node if the current node is unreachable or returns an error. It gives greater resiliency between Hadoop clients and PowerScale DataNodes because, in effect, OneFS responds with three potential DataNodes for the client to connect to, as opposed to one in previous versions of OneFS.
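From the client's point of view, the behavior resembles trying a short list of candidates in order. The Python sketch below shows that pattern under simplifying assumptions: `write_block`, `NodeUnreachable`, and `fake_send` are hypothetical names, and the real HDFS client recovery logic is considerably more involved.

```python
class NodeUnreachable(Exception):
    """Raised by the hypothetical send function when a DataNode cannot be used."""

def write_block(block, datanodes, send):
    """Try each offered DataNode in order; return the one that accepted the block."""
    errors = []
    for node in datanodes:
        try:
            send(node, block)
            return node
        except NodeUnreachable as exc:
            errors.append((node, exc))  # remember the failure and try the next node
    raise RuntimeError(f"all DataNodes failed: {errors}")

# Example: the first offered node is down, so the write moves to the second one.
down = {"10.0.0.11"}

def fake_send(node, block):
    """Stand-in for a network write; fails for nodes marked as down."""
    if node in down:
        raise NodeUnreachable(node)

used = write_block(b"data", ["10.0.0.11", "10.0.0.12", "10.0.0.13"], fake_send)
```

With only one DataNode in the response, as in earlier OneFS versions, the same failure would leave the client with nowhere else to write.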
The implementation of these features provides better load balancing and resiliency within OneFS and allows us to change our recommended approaches to IP pools and the use of racks. We are no longer dependent on racks to marshal separate NameNode and DataNode connections in order to optimize load balancing. Racks are still a completely valid configuration and should still be used to create location-aware client DataNode connections if the cluster architecture calls for rack awareness.
The primary considerations when defining an IP strategy are:
* Any deployed pool strategy depends on an appropriate number of IPs being available for assignment within the pool to meet the requirements of allocation or failover.
In most simple deployments, or where all the compute clients access all PowerScale node interfaces to maximize throughput and performance, it is unlikely that we need location awareness. This is the case when all the PowerScale and compute nodes use the same top-of-rack switches, or when the PowerScale nodes and compute clients are deployed entirely in separate racks. We no longer see any benefit in using racks to load balance DataNode connections.
In this configuration it is now recommended to use a single IP pool for all Hadoop access, with a dynamic allocation strategy and a SmartConnect name. No HDFS virtual racks are required because we are not separating NameNode and DataNode pools by node interface or by allocation policy. The improvements in OneFS load balancing make separate IP pools redundant, and a single pool is simpler and easier to deploy and manage. This single dynamic pool is now optimized for DataNode load balancing.
Single Rack and All Node Access Architectures (No location awareness)
Alternatively, if compute nodes and PowerScale nodes are deployed in a location-aware rack configuration, as in the example illustrated below, racks and multiple pools should be configured to provide rack-aware DataNode interfaces to the compute clients.
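As a rough illustration of what a rack-aware response means, the Python sketch below maps a client's source IP to a virtual rack and returns that rack's DataNode pool addresses. The rack names, subnets, and addresses are invented for the example, and the behavior for a client that matches no rack is an assumption made only to keep the sketch complete.

```python
import ipaddress

# Hypothetical virtual racks: client source range -> DataNode pool IPs in that rack.
RACKS = {
    "rack1": {"clients": ipaddress.ip_network("10.1.0.0/24"),
              "datanode_ips": ["10.1.0.201", "10.1.0.202"]},
    "rack2": {"clients": ipaddress.ip_network("10.2.0.0/24"),
              "datanode_ips": ["10.2.0.201", "10.2.0.202"]},
}

def datanode_ips_for(client_ip):
    """Return (rack name, DataNode IPs) for the rack containing the client's source IP."""
    addr = ipaddress.ip_address(client_ip)
    for name, rack in RACKS.items():
        if addr in rack["clients"]:
            return name, rack["datanode_ips"]
    # Assumption for this sketch only: clients outside every rack get no match.
    return None, []

rack, ips = datanode_ips_for("10.2.0.15")
```

A compute client in the second rack is therefore only ever offered DataNode addresses that sit in the same rack.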
With a two-rack solution, we would implement three IP pools and two racks:
If additional racks are used, we would add further DataNode pools and racks.
With a three-rack solution, we would implement four IP pools and three racks:
In an expansion scenario we may need to look at modifying the IP pool strategy and the use of the HDFS racks to optimize the change in architecture.
Suppose the cluster was initially deployed as a simple solution: a single rack where all nodes are accessed by all compute clients, implemented using a single dynamic pool. If we now add another rack with more PowerScale nodes and more compute, this would prompt a change in the deployed architecture.
Additional Rack Added
The primary purpose of this post is to highlight the change in recommended approaches to IP pool deployments with later versions of OneFS. The introduction of DataNode Load Balancing and Pipeline Write Recovery in OneFS 8.0.1.x and later has shifted the recommended approach when no explicit rack location is enforced.
Recommended HDFS IP Pool Strategy without Location Aware PowerScale or Compute
OneFS 8.0.1.x and later – Single Dynamic IP pool, SmartConnect Name, No racks
Ultimately, all configurations and designs should be evaluated and implemented to meet the requirements of the specific Hadoop and network environment. The networking configuration you select will depend on how your PowerScale and compute infrastructure is connected, as well as the size and scale of the deployment. There are many variables to consider when creating an IP pool strategy, and this post aims to highlight the options and potential configurations.
Article ID: SLN319147
Last Date Modified: 07/08/2020 06:05 PM