In a Hadoop implementation on a PowerScale cluster, PowerScale OneFS serves as the file system for Hadoop compute clients. The Hadoop Distributed File System (HDFS) is supported as a protocol, which Hadoop compute clients use to access data on the HDFS storage layer.
Hadoop compute clients can access the data that is stored on a PowerScale cluster by connecting to any node over the HDFS protocol. All nodes that are configured for HDFS provide NameNode and DataNode functionality, as shown in the following illustration.
Each node boosts performance and expands the cluster's capacity. For Hadoop analytics, the PowerScale scale-out distributed architecture minimizes bottlenecks, rapidly serves Big Data, and optimizes performance.
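As a minimal sketch of how a compute client might be pointed at the cluster over HDFS, the following Python snippet generates a `core-site.xml` fragment that sets the standard Hadoop property `fs.defaultFS`. The host name `hdfs.powerscale.example.com` is hypothetical, and port 8020 is an assumption; use the name and port that are configured for HDFS on your own cluster.

```python
import xml.etree.ElementTree as ET


def build_core_site(hdfs_host: str, port: int = 8020) -> str:
    """Return a minimal core-site.xml document that sets fs.defaultFS.

    hdfs_host and port are placeholders: the host name is hypothetical and
    8020 is an assumed port, not a value taken from this document.
    """
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = f"hdfs://{hdfs_host}:{port}"
    # ET.indent is available in Python 3.9 and later; skip it on older versions.
    if hasattr(ET, "indent"):
        ET.indent(configuration)
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    print(build_core_site("hdfs.powerscale.example.com"))
```

Because every node that is configured for HDFS provides NameNode functionality, a single host name in `fs.defaultFS` can stand for the cluster as a whole rather than for one dedicated NameNode machine.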
How a PowerScale OneFS Hadoop implementation differs from a traditional Hadoop deployment
A Hadoop implementation with OneFS differs from a typical Hadoop implementation in the following ways:
The Hadoop compute and HDFS storage layers are on separate clusters instead of the same cluster.
Instead of storing data within a Hadoop distributed file system, OneFS on a PowerScale cluster fulfills the storage layer functionality. Nodes on the PowerScale cluster function as both a NameNode and a DataNode.
The compute layer is established on a Hadoop compute cluster that is separate from the PowerScale cluster. The Hadoop MapReduce framework and its components are installed on the Hadoop compute cluster only.
HDFS is implemented on OneFS not as a storage layer but as a native, lightweight protocol layer between the PowerScale cluster and the Hadoop compute cluster. Clients from the Hadoop compute cluster connect over HDFS to access data on the PowerScale cluster.
In addition to HDFS, clients from the Hadoop compute cluster can connect to the PowerScale cluster over any protocol that OneFS supports, such as NFS, SMB, FTP, and HTTP.
PowerScale OneFS is the only non-standard implementation of HDFS that allows for multi-protocol access.
PowerScale makes for an ideal alternative storage system to native HDFS by marrying HDFS services with enterprise-grade data management features.
Hadoop compute clients can connect to any node on the PowerScale cluster that functions as a NameNode instead of being routed by a single NameNode.
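The last difference in the list can be illustrated with a toy model that compares how client connections land when there is a single NameNode and when every node can act as one. The round-robin assignment below is an assumption chosen for illustration only; it is not a description of how OneFS actually distributes clients, and all client and node names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def assign_clients(clients, nodes):
    """Assign each client to a node in round-robin order.

    Round-robin is an illustrative policy, not documented OneFS behavior.
    """
    node_cycle = cycle(nodes)
    return {client: next(node_cycle) for client in clients}


def connections_per_node(assignment):
    """Count how many clients are connected to each node."""
    return Counter(assignment.values())


if __name__ == "__main__":
    # Hypothetical compute clients.
    clients = [f"compute-{i}" for i in range(6)]

    # Traditional layout: every client goes through the one NameNode.
    single = assign_clients(clients, ["namenode-1"])
    print(connections_per_node(single))

    # Layout described above: any of three nodes can serve as a NameNode.
    spread = assign_clients(clients, ["node-1", "node-2", "node-3"])
    print(connections_per_node(spread))
```

In this model the single NameNode receives all six connections, while the three-node layout receives two connections per node, which is the kind of bottleneck reduction the list describes.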