ECS 3.6.2 Data Access Guide

Configuring Hadoop to use ECS HDFS

Hadoop stores system configuration information in several files, including core-site.xml, hdfs-site.xml and hive-site.xml. The ECS HDFS configuration requires you to edit core-site.xml.

Add or modify several types of properties in the core-site.xml file, including:

  • ECS HDFS Java classes: This set of properties defines the ECS HDFS implementation classes that are contained in the ECS HDFS Client Library.
  • File system location properties: These properties define the file system URI (scheme and authority) to use when running Hadoop jobs, and the IP addresses or FQDNs of the ECS data nodes for a specific ECS file system.
  • Kerberos realm and service principal properties: These properties are required only in a Hadoop environment where Kerberos is present. These properties map Hadoop and ECS HDFS users.

The core-site.xml file resides on each node in the Hadoop cluster. Add the same properties to each instance of core-site.xml.
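
Because every node must carry the same properties, it can be useful to compare copies of core-site.xml gathered from different nodes. The following is a minimal Python sketch of such a check, not part of the ECS tooling. The sample file contents, the examplefs scheme, and the example.ecs.hosts property name are hypothetical placeholders; substitute the actual property names from your ECS HDFS configuration.

```python
import xml.etree.ElementTree as ET


def parse_core_site(xml_text):
    """Return {name: value} for every <property> in a Hadoop configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(reference, other):
    """Return {name: (reference_value, other_value)} for properties that differ.

    A property that is missing on one side is reported with None on that side.
    """
    diffs = {}
    for name in sorted(set(reference) | set(other)):
        ref_value, other_value = reference.get(name), other.get(name)
        if ref_value != other_value:
            diffs[name] = (ref_value, other_value)
    return diffs


# Hypothetical contents of core-site.xml as read from two cluster nodes.
# The property names and values below are placeholders, not real ECS settings.
NODE_A = """<configuration>
  <property><name>fs.defaultFS</name><value>examplefs://bucket.ns.site/</value></property>
  <property><name>example.ecs.hosts</name><value>10.0.0.1,10.0.0.2</value></property>
</configuration>"""

NODE_B = """<configuration>
  <property><name>fs.defaultFS</name><value>examplefs://bucket.ns.site/</value></property>
</configuration>"""

if __name__ == "__main__":
    # Prints the properties on which the two nodes disagree.
    print(diff_properties(parse_core_site(NODE_A), parse_core_site(NODE_B)))
```

An empty result means that the two copies define the same properties with the same values. The sketch only reads and compares configuration; it does not modify any file.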

NOTE: When modifying configuration files, use the management interface (Ambari) rather than editing the files manually. Changes that you make using the Ambari management interface are persisted across the cluster.

NOTE: HDFS now supports automatic trash removal for HDFS files that a user removes. In traditional Hadoop, the fs.trash.interval setting defines the minimum time before removed files are cleaned up from a user's trash directory, and fs.trash.checkpoint.interval defines how long the trash cleanup thread waits between checks for removal candidates. In this release of ECS, the storage administrator manages trash removal on the ECS. Use the cf_client command with the bucket and namespace of a Hadoop file system and the maintenance interval settings to define maintenance intervals for all trash folders.
