November 19th, 2014 08:00

ViPR HDFS

How does ViPR HDFS work?

64 Posts

November 19th, 2014 13:00

ViPR HDFS provides an HDFS-compatible file system. A ViPR client (a .jar file) is installed on the data nodes of an existing Hadoop cluster and registers the ViPR file system (ViPRFS) as a file system that is available to MapReduce jobs as well as Pig and Hive queries. Storage arrays already managed by ViPR can then be accessed through ViPR HDFS.

ViPR Services create a unified pool (bucket) of data. As with ViPR Object, users create buckets which can span file shares that can grow and shrink on demand. The data is distributed across the arrays according to how the virtual storage pool is configured. A bucket exposes either an HDFS interface alone or, optionally, both an Object (S3) and an HDFS interface. In this way, the compute portion of an existing Hadoop cluster communicates with ViPR HDFS, which uses existing data (added to the HDFS bucket) as the target for Big Data applications and queries.
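To make the "registers ... as a file system" step a bit more concrete: Hadoop generally picks a file system implementation based on the scheme of the URI a job reads from or writes to. The snippet below is only a toy model of that lookup in plain Python, not ViPR or Hadoop code. The `viprfs` scheme string, the `ViPRFileSystemPlaceholder` name, the bucket name, and the `resolve_filesystem` helper are all hypothetical; use whatever scheme and class your ViPR client actually registers.

```python
from urllib.parse import urlparse

# Toy registry: URI scheme -> name of the file system implementation.
# "hdfs" maps to Hadoop's stock implementation; the "viprfs" entry is a
# placeholder standing in for whatever the ViPR client .jar registers.
REGISTRY = {
    "hdfs": "DistributedFileSystem",
    "viprfs": "ViPRFileSystemPlaceholder",  # hypothetical name
}


def resolve_filesystem(uri, registry=REGISTRY):
    """Return the implementation name registered for the URI's scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in registry:
        raise ValueError(f"no file system registered for scheme {scheme!r}")
    return registry[scheme]


if __name__ == "__main__":
    # A job pointed at a bucket-style URI resolves to the ViPR entry,
    # while ordinary hdfs:// paths keep using the stock implementation.
    print(resolve_filesystem("viprfs://examplebucket/data/input.csv"))
    print(resolve_filesystem("hdfs://namenode:8020/tmp/input.csv"))
```

The point of the sketch is that the compute side does not change: the same job can target either storage backend just by switching the URI it is given.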

474 Posts

November 19th, 2014 10:00

HDFS is another access protocol/API for the ViPR Services storage engine, in addition to S3, Swift, Atmos, etc. The Hadoop compute nodes can mount the viprhdfs:// file system and directly access the objects stored in ViPR Services.

What specifically would you like to know?
