Ambari HDP with Isilon 8.0.0.1 and Active Directory Kerberos Implementation


Note: This topic is part of the Using Hadoop with OneFS - Isilon Info Hub.

This article presents a high-level overview of the procedure to Kerberize an Ambari HDP cluster with Isilon against Active Directory. It provides the core tasks needed to complete the setup.

This procedure is based on the following:
  • Isilon 8.0.0.1
  • Ambari 2.2.1.0
  • HDP Stack 2.4.2.0-258
This discussion assumes the following:
  • Isilon Hadoop environment is configured and operational.
  • A dedicated Isilon Access Zone is in use (not the system zone).
  • The Isilon SmartConnect Zone configuration is implemented per best practice for Isilon HDFS access.
  • The Isilon HDFS configuration is correctly configured.
  • Ambari is configured correctly for Isilon integration.
  • Ambari will manage and deploy keytab and krb5.conf files.
  • A simple access model currently exists between Hadoop and Isilon; user UIDs and GIDs are correctly implemented and allow HDFS access to the Isilon HDFS root with UID and GID parity (see Isilon and Hadoop Local User UID Parity).
  • Hadoop jobs and services are fully operational.
If your environment deviates from any of these configurations, an alternative approach to Kerberization may be required, especially regarding keytabs and krb5.conf file management. This article does not address all configurations or requirements. Contact Dell EMC if required.

This article also does not address Linux host Kerberization, Directory Service integration, or the Isilon permissioning model for multiprotocol access following Kerberization.

This article describes how to integrate an existing Ambari HDP cluster with an Isilon into a pre-existing Microsoft Active Directory environment. The high level approach is:
  • Prepare and configure the Active Directory for Isilon – Hadoop integration
  • Prepare the Ambari HDP cluster and hosts for kerberization
  • Integrate the Isilon cluster into Active Directory
  • Kerberize the HDP cluster via the Ambari wizard
  • Complete the integration of Isilon and HDP
  • Test and validate
Assume that there is a preexisting Active Directory environment for Kerberos User Authentication.

To use an existing Active Directory domain for the cluster with the Ambari wizard Kerberos Setup, you must prepare the following:
  • Isilon, Ambari Server and compute cluster hosts have all the required network access to Active Directory and Active Directory services.
  • All DNS name resolution of required Active Directory services is valid.
  • Active Directory secure LDAP (LDAPS) connectivity has been configured.
  • An Active Directory OU User container for principals has been created, for example "OU=Hadoop-Cluster,OU=People,dc=domain,dc=com".
  • Active Directory administrative credentials with delegated control of "Create, delete, and manage user accounts" on the OU User container are implemented.

For additional information, see the Hortonworks security documents.

How Kerberos is implemented here:

Since the Isilon integrated Hadoop cluster is a mix of Linux hosts running the compute services and Isilon running the data services, Ambari cannot complete the Kerberization end-to-end. Because Isilon runs a clustered operating system (OneFS), the Ambari agent cannot fully configure and manage the Kerberization of Isilon, nor does it need to. It can, however, completely deploy and configure the Linux hosts.

Because of this, the Kerberization of the Isilon integrated Hadoop cluster should be considered in the following context:

  • Isilon is Kerberized
  • Ambari Kerberization wizard runs and deploys kerberization to Linux and Hadoop
Since both sets of systems are now fully Kerberized within the same KDC realm, Kerberized user access can occur between the Isilon and Hadoop cluster seamlessly.


Ambari Pre-Configuration

  • Verify that Ambari 2.0 or higher is running.

  • Forward and reverse DNS between all hosts is tested and validated. Test this with dig or ping.

  • All services are running (green) on the Ambari Dashboard.

  • All other Ambari specific Kerberos requirements have been met; NTP, DNS, packages etc.
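The forward and reverse DNS check from the list above can also be scripted rather than run host by host with dig or ping. The following is a minimal sketch using only the Python standard library; the function name and the injectable `forward`/`reverse` parameters are illustrative assumptions, not part of Ambari or OneFS.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare names.

    Returns (ok, detail). The resolver functions are parameters so the
    check can be exercised without touching real DNS.
    """
    try:
        ip = forward(hostname)
        reverse_name = reverse(ip)[0]  # first element is the primary name
    except OSError as exc:  # gaierror and herror are subclasses of OSError
        return False, f"{hostname}: lookup failed ({exc})"
    # Compare case-insensitively and ignore a trailing root dot.
    ok = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ok, f"{hostname} -> {ip} -> {reverse_name}"
```

Calling `check_forward_reverse` for every Ambari host and for the Isilon SmartConnect zone name gives a quick pass/fail list before the wizard is started.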

Before launching the Ambari Kerberization wizard, you must make two configuration changes and then restart all services.

  1. In HDFS -> Custom core-site, set "hadoop.security.token.service.use_ip" to "false" in core-site.xml. If this key does not already exist, create it.


  2. In MapReduce2 -> Advanced mapred-site, add "`hadoop classpath`:" to the beginning of "mapreduce.application.classpath". Note the colon and backticks (but do not copy the quotation marks).

Save the change and restart the service.
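To make the two changes above concrete, the sketch below renders the core-site property as it would appear in a Hadoop XML configuration file and prepends the classpath prefix without duplicating it. The helper names are hypothetical, and Ambari itself applies these values through its UI rather than through code like this.

```python
import xml.etree.ElementTree as ET

# The literal prefix from step 2, including backticks and trailing colon.
CLASSPATH_PREFIX = "`hadoop classpath`:"


def core_site_property(name, value):
    """Render one <property> element in Hadoop's XML configuration layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


def prepend_hadoop_classpath(current_value):
    """Add the prefix to mapreduce.application.classpath only if missing."""
    if current_value.startswith(CLASSPATH_PREFIX):
        return current_value
    return CLASSPATH_PREFIX + current_value


# Step 1: the property that may need creating in Custom core-site.
print(core_site_property("hadoop.security.token.service.use_ip", "false"))
```

Making the prepend idempotent avoids ending up with the prefix twice if the value is edited more than once.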



Isilon OneFS Configuration

This section covers the configuration required for OneFS to respond to requests for secure Kerberized HDFS authenticated by Active Directory.

  • The cluster must be joined correctly to the target Active Directory.

  • The Access Zone the HDFS root lives under is configured for this Active Directory provider.

  • All IP addresses within the required SmartConnect Zone must be added to reverse DNS with the same FQDN used for the cluster delegation. All IP addresses should resolve back to the SmartConnect Zone name. This is required for Kerberos.

Isilon SPNs

OneFS is a clustered file system running on multiple nodes, but it is joined to Active Directory as a single computer object, so the SPN requirements for Kerberized Hadoop access are unique. In addition to the SPN for the cluster name, Hadoop access requires additional SPNs for the SmartConnect Zone name of the Access Zone through which HDFS NameNode access is made.

Review the registered SPNs on the Isilon cluster and add the required SPNs for the SmartConnect Zone name if needed.

# isi auth ads spn list --provider-name=<AD PROVIDER NAME>


The following example illustrates the required SPNs:

  • Isilon Cluster Name - rip2.foo.com – SPN: hdfs/rip2.foo.com

  • Access Zone NN SmartConnect FQDN - rip2-horton1.foo.com – SPNs: hdfs/rip2-horton1.foo.com and HTTP/rip2-horton1.foo.com
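The SPN pattern in the example above can be expressed as a small check that compares the required SPNs against the list reported by the cluster. This is a sketch only: the function names are hypothetical, the registered list is typed in by hand rather than read from `isi`, and the case-insensitive comparison is an assumption about how SPNs are matched.

```python
def required_spns(cluster_fqdn, smartconnect_fqdn):
    """SPNs following the pattern in the example: hdfs for the cluster name,
    hdfs and HTTP for the Access Zone SmartConnect name."""
    return [
        f"hdfs/{cluster_fqdn}",
        f"hdfs/{smartconnect_fqdn}",
        f"HTTP/{smartconnect_fqdn}",
    ]


def missing_spns(registered, required):
    """Return the required SPNs that are absent from the registered list."""
    have = {spn.lower() for spn in registered}  # assumption: case-insensitive
    return [spn for spn in required if spn.lower() not in have]


# Values taken from the example above; the registered list is illustrative.
needed = required_spns("rip2.foo.com", "rip2-horton1.foo.com")
registered = ["hdfs/rip2.foo.com", "HOST/rip2.foo.com"]
print(missing_spns(registered, needed))
```

Any SPN the check reports as missing would then be added for the SmartConnect Zone name as described in the Isilon CLI documentation.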


For additional information on adding or modifying Isilon SPNs in Active Directory, see the Isilon CLI Administrative Guide.


Isilon Hadoop (HDFS) Changes

The following configuration changes are required on the HDFS Access Zone.

  1. Disable simple authentication. This restricts access to Kerberos or delegation token authentication only.

# isi hdfs settings modify --authentication-mode=kerberos_only --zone=rip2-horton1

  2. Create the required proxy users.

Proxy users are required so that service accounts for specific Hadoop services can impersonate other users when executing jobs. Add the required proxy users as needed. Proxy users will be covered in more detail in a later post; also review the Isilon CLI Administrative Guide.

  3. Increase the HDFS log level.

# isi hdfs log-level modify --set=verbose

This completes the Isilon Hadoop Active Directory setup:

  • Isilon is joined to Active Directory and the provider is online

  • HDFS Access Zone has the Active Directory provider added

  • SPNs are correctly configured

  • HDFS service is configured for kerberos_only

  • DNS configuration is valid


Kerberize Ambari Wizard

The following outlines the steps to run the Ambari Kerberization wizard and any customization required to allow the wizard to integrate with Isilon upon completion. It is suggested to stop all user activity on the Hadoop cluster prior to executing a Kerberization task.

1. Enable Kerberization

From the Ambari web UI, select Admin and then Kerberos.

Then select ‘Enable Kerberos’.

Proceed past the warning.


2. Getting Started

At the ‘Get Started’ screen, select ‘Existing Active Directory’ as the type of KDC you plan to use. In order to proceed, select all the check boxes to confirm that you have met and completed all the prerequisites. This document does not include direction on setting up and completing these requirements. For additional information on meeting these prerequisites, consult the Hortonworks Security Guide for Ambari and the Microsoft documentation for Active Directory information and configuration guidance.

Once you have met and selected the checkboxes for all the prerequisites, the wizard can continue.

The Ambari Kerberos wizard will request information about the KDC, LDAP URL, realm, Active Directory OU and the delegated Ambari user account. This account will be used to bind to Active Directory and create all the principals that Ambari requires in Active Directory.

3. Configure Kerberos

Enter the required information about the KDC and Test the KDC Connection.

  • KDC Host – An Active Directory Domain Controller

  • Realm Name – The name of the Kerberos realm you are joining

  • LDAP URL – The LDAP URL of the Active Directory Domain Controller; specify port 636 to use secure LDAP (LDAPS).

  • Container DN – The OU that delegated access was granted on

  • Domains – (optional) A comma-separated list of domain names to map server host names to realm names

  • Kadmin host – An Active Directory Domain Controller

  • Admin Principal – The Active Directory User account with delegated rights

  • Admin password – Password for the Admin Principal
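Before entering the values above into the wizard, it can help to sanity-check them. The sketch below is not part of Ambari; the function name, the rules it applies (uppercase realm by convention, an ldaps URL on port 636, a container DN beginning with an OU), and the host name `dc1.foo.com` are all illustrative assumptions based on the prerequisites described in this article.

```python
from urllib.parse import urlparse


def check_wizard_inputs(realm, ldap_url, container_dn):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if realm != realm.upper():
        problems.append("realm is conventionally uppercase for Active Directory")
    parsed = urlparse(ldap_url)
    if parsed.scheme != "ldaps":
        problems.append("LDAP URL should use the ldaps scheme for secure LDAP")
    if parsed.port != 636:
        problems.append("LDAP URL should specify port 636")
    if not container_dn.lower().startswith("ou="):
        problems.append("container DN should start with the delegated OU")
    return problems


# Hypothetical values; substitute your own domain controller and OU.
print(check_wizard_inputs(
    "FOO.COM",
    "ldaps://dc1.foo.com:636",
    "OU=Hadoop-Cluster,OU=People,dc=foo,dc=com",
))
```

The wizard's own 'Test KDC Connection' step remains the authoritative check; this only catches obvious typing mistakes beforehand.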


The Advanced kerberos-env settings should be reviewed, but no changes are required. As of OneFS 8.0.0.0, AES-256 encryption is supported.


The Advanced krb5-conf setting should be reviewed, but no changes are required.




Once all configuration settings have been addressed, click Next to proceed with the wizard.

The wizard will deploy and configure the Kerberos clients on all hosts.

Although the Kerberos client and configuration are not actually pushed to Isilon at this stage, the wizard will appear to do so and will report success.

Ensure that the Kerberos clients deploy and test successfully.

Click Next to continue.

4. Ambari Principals customization for Isilon Integration

A number of changes need to be made to the principals that will be used and created by the Kerberization wizard:

Ambari creates user principals in the form ${username}-${cluster_name}@${REALM}, then uses hadoop.security.auth_to_local in core-site.xml to map the principals to just ${username} on the file system.

Isilon does not honor the mapping rules, so you must remove the -${cluster_name} suffix from all principals in the "Ambari Principals" section. Isilon strips off the @${REALM}, so no aliasing is necessary. The following applies to an Ambari cluster running HDFS, YARN, MapReduce2, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Ambari Metrics and Spark.

Make the following modifications in the "General" tab:

  • Smokeuser Principal Name: ${cluster-env/smokeuser}-${cluster_name}@${REALM} => ${cluster-env/smokeuser}@${REALM}

  • Spark.history.kerberos.principal: ${spark-env/spark_user}-${cluster_name}@${REALM} => ${spark-env/spark_user}@${REALM}

  • HBase user principal: ${hbase-env/hbase_user}-${cluster_name}@${REALM} => ${hbase-env/hbase_user}@${REALM}

  • HDFS user principal: ${hadoop-env/hdfs_user}-${cluster_name}@${REALM} => ${hadoop-env/hdfs_user}@${REALM}

Additional principals will require updating if these services are running:

  • Storm principal name: ${storm-env/storm_user}-${cluster_name}@${REALM} => ${storm-env/storm_user}@${REALM}

  • accumulo_principal_name: ${accumulo-env/accumulo_user}-${cluster_name}@${REALM} => ${accumulo-env/accumulo_user}@${REALM}

  • trace.user: tracer-${cluster_name}@${REALM} => tracer@${REALM}

(Modified principals are marked with an orange reset arrow.)
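The edit made to each principal in the "General" tab is the same mechanical change: drop the cluster-name suffix and leave the rest of the template alone. The sketch below illustrates that transformation; the `-${cluster_name}` placeholder spelling is an assumption about how Ambari names the variable, so adjust it to whatever your wizard displays.

```python
# Assumption: the suffix appears in the wizard as a literal placeholder
# of this form. Change this constant if your Ambari version shows it differently.
CLUSTER_SUFFIX = "-${cluster_name}"


def strip_cluster_suffix(principal_template):
    """Remove the cluster-name suffix from a principal template, if present."""
    return principal_template.replace(CLUSTER_SUFFIX, "", 1)


before = "${cluster-env/smokeuser}-${cluster_name}@${REALM}"
print(before, "=>", strip_cluster_suffix(before))
```

Templates that do not contain the suffix are returned unchanged, so the helper can be applied to every principal in the list.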

5. Ambari Created User Principals

Ambari creates user principals, some of which differ from their UNIX usernames. Again, since Isilon does not honor the mapping rules, you must modify the principal names to match their UNIX usernames. Make the following modifications to the principal and keytab names in the "Advanced" tab:

HDFS > dfs.namenode.kerberos.principal = nn/_HOST@${REALM} => hdfs/_HOST@${REALM}
HDFS > dfs.namenode.keytab.file = ${keytab_dir}/nn.service.keytab => ${keytab_dir}/hdfs.service.keytab

HDFS > dfs.secondary.namenode.kerberos.principal = nn/_HOST@${REALM} => hdfs/_HOST@${REALM}
HDFS > dfs.secondary.namenode.keytab.file = ${keytab_dir}/nn.service.keytab => ${keytab_dir}/hdfs.service.keytab

HDFS > dfs.datanode.kerberos.principal = dn/_HOST@${REALM} => hdfs/_HOST@${REALM}
HDFS > dfs.datanode.keytab.file = ${keytab_dir}/dn.service.keytab => ${keytab_dir}/hdfs.service.keytab

MapReduce2 > mapreduce.jobhistory.principal = jhs/_HOST@${REALM} => mapred/_HOST@${REALM}
MapReduce2 > mapreduce.jobhistory.keytab = ${keytab_dir}/jhs.service.keytab => ${keytab_dir}/mapred.service.keytab

YARN > yarn.nodemanager.principal = nm/_HOST@${REALM} => yarn/_HOST@${REALM}
YARN > yarn.nodemanager.keytab = ${keytab_dir}/nm.service.keytab => ${keytab_dir}/yarn.service.keytab

YARN > yarn.resourcemanager.principal = rm/_HOST@${REALM} => yarn/_HOST@${REALM}
YARN > yarn.resourcemanager.keytab = ${keytab_dir}/rm.service.keytab => ${keytab_dir}/yarn.service.keytab

Falcon > *.dfs.namenode.kerberos.principal = nn/_HOST@${REALM} => hdfs/_HOST@${REALM}
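The renames listed above follow one rule: the short service name at the start of the principal, and at the start of the keytab file name, is replaced by the UNIX username that owns the service. The sketch below captures that rule as a lookup table. The helper names are hypothetical, and the keytab directory in the example is an assumed location used only for illustration.

```python
# Short service name -> UNIX username, taken from the list above.
SERVICE_TO_UNIX_USER = {
    "nn": "hdfs",
    "dn": "hdfs",
    "jhs": "mapred",
    "nm": "yarn",
    "rm": "yarn",
}


def rename_principal(principal):
    """Rewrite 'nn/_HOST@REALM' style principals to use the UNIX username."""
    service, sep, rest = principal.partition("/")
    return SERVICE_TO_UNIX_USER.get(service, service) + sep + rest


def rename_keytab(path):
    """Rewrite '<dir>/nn.service.keytab' style paths the same way."""
    directory, sep, filename = path.rpartition("/")
    short, dot, suffix = filename.partition(".")
    return directory + sep + SERVICE_TO_UNIX_USER.get(short, short) + dot + suffix


print(rename_principal("jhs/_HOST@${REALM}"))
# Assumed keytab directory, for illustration only.
print(rename_keytab("/etc/security/keytabs/jhs.service.keytab"))
```

Principals and keytabs whose short name is not in the table pass through unchanged, which matches the intent of editing only the entries listed.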

The changes to the HDFS and MapReduce2 principals are illustrated below.





After configuring the appropriate principals, press "Next". At the "Confirm Configuration" screen, press Next.


6. Confirm Configuration

Review the configuration and click Next to proceed. Exiting the wizard at this point will remove all configuration and customizations, and they will need to be re-entered.



Download the CSV file and review it. The CSV file contains all the principals and keytabs that Ambari will create in Active Directory. The list contains principals and keytabs for Isilon, but these keytabs will not be distributed to the Isilon cluster. Isilon Kerberization has already occurred and is implemented through joining Active Directory.
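Reviewing the downloaded CSV by eye works for small clusters; for larger ones a short script can pull out the entries that belong to the Isilon host. The sketch below assumes the file has a header row with columns named `host` and `principal name`. Those column names and the sample rows are assumptions, so check them against the header of your own file and pass different names if needed.

```python
import csv
import io


def principals_for_host(csv_text, host,
                        host_col="host", principal_col="principal name"):
    """Return the principals listed for one host in the wizard's CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = host.strip().lower()
    return [
        row[principal_col]
        for row in reader
        if row[host_col].strip().lower() == wanted
    ]


# Hypothetical sample in the assumed layout; real files have more columns.
sample = (
    "host,description,principal name\n"
    "rip2-horton1.foo.com,NameNode,hdfs/rip2-horton1.foo.com@FOO.COM\n"
    "rip2-horton1.foo.com,SPNEGO,HTTP/rip2-horton1.foo.com@FOO.COM\n"
    "hdp-node1.foo.com,NodeManager,yarn/hdp-node1.foo.com@FOO.COM\n"
)
print(principals_for_host(sample, "rip2-horton1.foo.com"))
```

The resulting list identifies the principals Ambari will create for the Isilon name, which are the ones whose keytabs are never distributed to the cluster.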


7. Stop Services

This step stops all the Hadoop services in Ambari on all hosts, and all user activity will stop.

Once all services have stopped successfully, click Next to proceed.


8. Kerberize Cluster

The Kerberization wizard now Kerberizes the Ambari services, creates the principals in Active Directory and distributes the keytabs.



Following the creation of principals, you can view all the Active Directory principals in the Hadoop OU.



Note: The UPN and sAMAccountName differ in Active Directory; this does not present any problems in a simple installation. Complex custom installations may require additional configuration to enable Isilon multiprotocol functionality to operate correctly. More on this in later posts.



Kerberization of the cluster completes successfully!



During the Kerberos deployment, Ambari created principals in Active Directory for the Isilon cluster that are not required. These must be removed from Active Directory.

Remove the following users from the Hadoop OU:

  • hdfs/<isilon-clustername>

  • HTTP/<isilon-clustername>


9. Start and Test Services

The wizard will now attempt to start all the Kerberized Hadoop services on Ambari.


If some services fail to start, they can always be restarted. It is common to see some failures. Review the startup logs of the service and monitor /var/log/hdfs.log on Isilon while services are starting to see what is happening.

If some services do fail, move on and troubleshoot each service independently.

On completion of the Kerberos wizard, you can see the configuration in Ambari.



A few services need restarting.



On restart of these services, the cluster and all hdfs services are running and the cluster is green.



This completes the Kerberos deployment of the Hadoop services. Ambari has Kerberized the Hadoop cluster and Isilon is a valid Active Directory provider. You can now test and validate that Kerberos authentication is operational against the Isilon HDFS data.


Test and Validate Hadoop Services

To validate the newly Kerberized cluster, run a few simple tests.


1. No Kerberos Ticket Test

The cluster is now Kerberized and Isilon is enforcing kerberos_only access to the HDFS root. If you attempt to run any simple Hadoop commands without a valid Kerberos ticket, they will fail. This is a good test to validate that simple authentication is no longer accepted.




2. Valid Kerberos Ticket Test

Get a Kerberos ticket for your test user using a kinit command: $ kinit <ad user name>




Execute a simple HDFS directory listing: $ hadoop fs -ls /


3. Execute a simple file system write

Create a simple file on the Isilon Hadoop root: $ hadoop fs -touchz /user/hdpuser3/This_file_testing_Kerberos.txt
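The ticket, listing and write tests above can be chained so that the run stops at the first failure. The following is a minimal sketch, assuming the MIT Kerberos `klist -s` option (silent check of the credential cache via exit status) and the `hadoop` client on the PATH; the function names and the injectable `run` parameter are illustrative, and a ticket must already have been obtained with kinit.

```python
import subprocess


def validation_commands(test_dir):
    """Commands mirroring the manual tests: ticket check, listing, write."""
    return [
        ["klist", "-s"],  # exit status 0 only if a valid ticket is cached
        ["hadoop", "fs", "-ls", "/"],
        ["hadoop", "fs", "-touchz", f"{test_dir}/This_file_testing_Kerberos.txt"],
    ]


def run_validation(test_dir, run=subprocess.run):
    """Run each command in order; stop at the first non-zero exit status.

    Returns a list of (command string, succeeded) tuples for the steps run.
    """
    results = []
    for argv in validation_commands(test_dir):
        succeeded = run(argv).returncode == 0
        results.append((" ".join(argv), succeeded))
        if not succeeded:
            break
    return results
```

For example, `run_validation("/user/hdpuser3")` on a cluster host returns one entry per step executed; a failure on the first entry indicates there is no valid ticket, which is the expected result for the no-ticket test.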



4. Run a simple yarn job without a valid Kerberos ticket

You will see many Kerberos errors.



5. Run a simple yarn job that accesses the file system

In the output, you see the delegation token used to execute the kerberized job.




If you see issues with running Kerberized jobs, you can increase the kerberos logging to show more details, as described in Having Kerberos Authentication Issue, DEBUG it.



Article ID: SLN319437

Last Date Modified: 02/27/2020 05:07 PM
