This topic is part of the Using Hadoop with OneFS - Isilon Info Hub.
Also see the Isilon Cloudera Kerberos Installation Guide.
This article presents an overview of the procedure to Kerberize a Cloudera CDH cluster with Isilon against an Active Directory. It provides the core tasks needed to complete the setup.
The article is based on the following versions:
The procedure assumes that the Isilon Hadoop environment is configured and operational, as follows:
- A dedicated Isilon Access Zone is in use (not the system zone).
- The Isilon SmartConnect Zone configuration is implemented per best practice for Isilon HDFS access.
- The Isilon HDFS configuration is correctly configured.
- Cloudera Manager is configured correctly for Isilon integration.
- Cloudera Manager will manage and deploy keytab and krb5.conf files.
- A simple access model currently exists between Hadoop and Isilon; user UID & GID are correctly implemented and allow HDFS access to the Isilon HDFS root with UID & GID parity.
- Hadoop jobs and services are fully operational.
If your environment deviates from any of these configurations, an alternative approach to Kerberization may be required, especially with regard to the management of keytabs and krb5.conf files. This procedure does not address all configurations or requirements; contact Dell EMC support if required. It also does not address Linux host Kerberization, Directory Service integration, or the Isilon permissioning model for multiprotocol access following Kerberization.
This procedure integrates an existing Cloudera CDH cluster with an Isilon cluster into a pre-existing Microsoft Active Directory environment. The high-level approach is:
- Prepare and configure the Active Directory for Isilon – Hadoop integration.
- Prepare the Cloudera cluster and Linux hosts for Kerberization.
- Integrate the Isilon cluster into Active Directory.
- Kerberize the CDH cluster via the Cloudera Manager Enable Kerberos wizard.
- Complete the integration of Isilon and Cloudera.
- Test and validate Kerberized services.
This procedure is based on using a pre-existing Active Directory environment for Kerberos User Authentication. To use an existing Active Directory domain for the cluster with the Cloudera Kerberos wizard, make sure all of the following conditions are met:
- Isilon, Cloudera Manager and the compute cluster hosts have all the required network access to Active Directory and AD services.
- All DNS name resolution of required Active Directory Services is valid.
- Active Directory secure LDAP (LDAPS) connectivity is configured.
- Active Directory OU User container for principals is created. For example: "OU=Hadoop--Cluster,OU=People,dc=domain,dc=com".
- Active Directory administrative credentials with delegated control of "Create, delete, and manage user accounts" on the OU User container are implemented.
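The LDAPS and OU prerequisites can be spot-checked from the Cloudera Manager host with a base-scope search against the OU. The sketch below only assembles the ldapsearch command so that it can be reviewed before it is run. It assumes ldapsearch from the openldap-clients package is available and that the host trusts the domain controller's certificate. The function name, the domain controller name, and the bind account are hypothetical placeholders; the OU is the example from the list above.

```shell
#!/bin/sh
# Build an ldapsearch command that reads the OU entry over LDAPS (port 636).
# All argument values are placeholders; substitute your own.
ldaps_check_cmd() {
    dc_host=$1      # domain controller FQDN (hypothetical)
    bind_user=$2    # delegated AD account in UPN form (hypothetical)
    ou_dn=$3        # OU that will hold the Hadoop principals
    printf 'ldapsearch -H ldaps://%s:636 -x -D "%s" -W -b "%s" -s base "(objectClass=*)"\n' \
        "$dc_host" "$bind_user" "$ou_dn"
}

# Example (prints the command; run the printed line to perform the check):
# ldaps_check_cmd dc1.domain.com cm-admin@domain.com \
#     "OU=Hadoop--Cluster,OU=People,dc=domain,dc=com"
```

If the printed command returns the OU entry after prompting for the password, LDAPS connectivity, name resolution, and the bind credentials are all working for that account.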
For additional information, see the Cloudera security documents here: http://www.cloudera.com/documentation/enterprise/latest/topics/cm_sg_intro_kerb.html
How Kerberos is Implemented
Because the Isilon-integrated Hadoop cluster is a mix of Linux hosts running the compute services and Isilon running the data services, Cloudera cannot complete the Kerberization end-to-end. Isilon runs a clustered operating system (OneFS), so Cloudera's ssh-based remote management cannot configure and manage the Kerberization of Isilon completely, nor does it need to. However, it can still completely deploy and configure the Linux hosts. You can therefore consider the Kerberization of the Isilon-integrated Hadoop cluster in the following context:
- Isilon is Kerberized.
- The Cloudera Kerberization wizard runs and deploys Kerberization to the Linux hosts and Hadoop services.
Since both sets of systems are then fully Kerberized within the same KDC realm, Kerberized user access can occur between the Isilon and Hadoop clusters seamlessly.
1. Ensure that all of these conditions are true:
- Cloudera 5.x or higher is running.
- Forward and reverse DNS between all hosts is tested and validated. Test with dig or ping.
- All services are running (green) on the Cloudera Manager Dashboard.
- All other Cloudera specific Kerberos requirements are met, such as NTP, DNS, packages, and so on.
2. Make the following configuration customization and then restart all services.
- In the Isilon Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml property for the Isilon service, set the value of the hadoop.security.token.service.use_ip property to FALSE.
- You may need to create the key.
Preparing Hosts for Kerberization
For Kerberization to operate on all Hadoop hosts, the required client libraries must be installed. Install the OpenLDAP client libraries on the Cloudera Manager Server. Install the Kerberos client libraries on all hosts. For more information, see the Cloudera document Enabling Kerberos Authentication Using the Wizard.
On a RHEL or CentOS system, the following commands install the required libraries.
# yum install krb5-workstation
# yum install krb5-libs
# yum install openldap-clients
On other operating systems, install the equivalent packages.
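Before running the wizard it can help to confirm that the packages are actually present on every host. The following is a minimal sketch, not part of the original procedure: the function name is hypothetical, and the package query command is passed in as a parameter so that the same logic works with rpm -q on RHEL or CentOS or with an equivalent query on another distribution.

```shell
#!/bin/sh
# Print the name of each package that the query command reports as missing.
# Usage: missing_packages "<query command>" <package>...
missing_packages() {
    query_cmd=$1
    shift
    for pkg in "$@"; do
        # $query_cmd is intentionally unquoted so "rpm -q" splits into two words.
        if ! $query_cmd "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Example on RHEL or CentOS (prints nothing when all packages are installed):
# missing_packages "rpm -q" krb5-workstation krb5-libs openldap-clients
```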
Isilon OneFS Configuration
This section describes the configuration required for OneFS to respond to requests for secure Kerberized HDFS authenticated by Active Directory.
- The cluster must be joined correctly to the target Active Directory as a Provider.
- The Access Zone the HDFS root lives under is configured for this Active Directory provider.
- All IP addresses within the required SmartConnect Zone must be added to reverse DNS with the same FQDN as the cluster delegation. All IP addresses must resolve back to the SmartConnect Zone name. This is required for Kerberos.
OneFS is a clustered file system running on multiple nodes but is joined to Active Directory as a single Computer Object. The SPN requirements for Kerberized Hadoop access are unique. They require additional SPNs for the Access Zone through which the HDFS NameNode access occurs.
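The reverse DNS requirement can be checked with a short loop over the pool addresses. This is a sketch under stated assumptions: the function name and the RESOLVER override are hypothetical, the lookup uses dig +short -x by default, and the IP addresses in the example are placeholders for the addresses in your SmartConnect pool.

```shell
#!/bin/sh
# Check that every SmartConnect pool IP reverse-resolves to the zone FQDN.
# RESOLVER defaults to "dig +short -x"; it can be overridden for testing.
# Usage: check_reverse_dns <smartconnect-fqdn> <ip>...
check_reverse_dns() {
    expected=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    shift
    rc=0
    for ip in "$@"; do
        # dig prints PTR targets with a trailing dot; strip it and
        # lowercase the name before comparing.
        name=$(${RESOLVER:-dig +short -x} "$ip" | head -n 1 | sed 's/\.$//' | tr 'A-Z' 'a-z')
        if [ "$name" = "$expected" ]; then
            echo "OK   $ip -> $name"
        else
            echo "FAIL $ip -> ${name:-<no PTR record>}"
            rc=1
        fi
    done
    return $rc
}

# Example (placeholder addresses):
# check_reverse_dns rip2-cd1.foo.com 10.0.0.1 10.0.0.2 10.0.0.3
```

The function returns a non-zero status if any address fails, so it can gate later steps in a script.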
Review the registered SPNs on the Isilon cluster and add the required SPNs for the SmartConnect Zone name, if needed.
# isi auth ads spn list --provider-name=<AD PROVIDER NAME>
The following example illustrates the required SPNs:
- Isilon Cluster Name is rip2.foo.com
- SPN is: hdfs/rip2.foo.com
- Access Zone NN SmartConnect FQDN is rip2-cd1.foo.com
- SPNs are: hdfs/rip2-cd1.foo.com and HTTP/rip2-cd1.foo.com
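The pattern in this example (one hdfs/ and one HTTP/ SPN for the SmartConnect FQDN) can be turned into a quick comparison against the registered list. The sketch below is illustrative only: both function names are hypothetical, and it assumes nothing about the layout of the SPN listing other than that each registered SPN appears somewhere in the text. It uses a case-insensitive substring match, so review the result by eye if your FQDNs are prefixes of one another.

```shell
#!/bin/sh
# Print the SPNs expected for a given SmartConnect zone FQDN.
required_spns() {
    printf 'hdfs/%s\nHTTP/%s\n' "$1" "$1"
}

# Read a registered-SPN listing on stdin and print each required SPN
# that does not appear in it.
missing_spns() {
    fqdn=$1
    listing=$(cat)
    for spn in $(required_spns "$fqdn"); do
        printf '%s\n' "$listing" | grep -qiF "$spn" || echo "$spn"
    done
}

# Example: pipe the output of the SPN list command shown earlier into
# the check (prints nothing when both SPNs are registered):
# isi auth ads spn list --provider-name=<AD PROVIDER NAME> | missing_spns rip2-cd1.foo.com
```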
For additional information about adding or modifying Isilon SPNs in Active Directory, see the Isilon CLI Administration Guide.
Isilon Hadoop (HDFS) Changes
The following configuration changes are required on the HDFS Access Zone.
1. Disable simple authentication. This restricts HDFS access to Kerberos or delegation token authentication only.
# isi hdfs settings modify --authentication-mode=kerberos_only --zone=rip2-cd1
2. Create the required Proxy Users.
Proxy users are required so that specific Hadoop services can impersonate service accounts when executing jobs. Add the required proxy users. For information, see the Isilon CLI Administration Guide.
3. Increase the HDFS log level:
# isi hdfs log-level modify --set=verbose
This completes the Isilon Hadoop Active Directory setup. The setup includes:
- Isilon joined to Active Directory, with the Provider Online.
- The Active Directory provider is added to the HDFS Access Zone.
- SPNs are correctly configured.
- HDFS Service is configured for kerberos_only.
- DNS configuration is valid.
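The authentication mode item in this checklist can be confirmed by inspecting the zone's HDFS settings. The sketch below assumes that a view counterpart to the isi hdfs settings modify command used earlier is available and that its output contains a line labelled "Authentication Mode"; both are assumptions to verify against your OneFS release, and the function name is hypothetical.

```shell
#!/bin/sh
# Read HDFS settings output on stdin and succeed only if the
# authentication mode is kerberos_only. The "Authentication Mode" label
# is an assumption about the output layout; adjust it if yours differs.
is_kerberos_only() {
    grep -i 'Authentication Mode' | grep -q 'kerberos_only'
}

# Example (assumed command; run on the Isilon cluster for the HDFS zone):
# isi hdfs settings view --zone=rip2-cd1 | is_kerberos_only && echo "kerberos_only is set"
```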
Enable Kerberos on Cloudera
Now you can Kerberize the Cloudera cluster. We recommend suspending all client and user activity on the Hadoop cluster before executing any Kerberization tasks.
On the Dashboard, select Security.
Select Enable Kerberos.
If you followed all procedures so far, all the prerequisites are satisfied. Check all the boxes, and click Continue.
- Add the KDC Server Host FQDN
- Add the Security Realm (the AD domain)
- Add additional encryption types (OneFS 8.0.x supports AES-256)
- Modify the OU for the delegated Cloudera OU to be used for principals
We recommend managing host krb5.conf files through Cloudera Manager.
Select the Manage krb5.conf files through Cloudera Manager check box.
Accept the defaults.
Since Cloudera Manager creates and manages all the principals, an AD OU with a delegated administrative account is used.
Enter the credentials for the AD user with delegated access to the OU in the AD domain.
You can leave the default ports.
Select "Yes, I am ready to restart the cluster now" and Continue.
The kerberization wizard starts.
The wizard creates the required Principals in Active Directory.
Kerberos enablement continues. Services attempt to restart.
The Hue service will fail, which causes the wizard to halt. This is a known issue that requires a workaround.
Since the failure of the Hue service prevents the wizard from completing, you must do the following:
- Open another browser session to the CM Dashboard: http://<CM-URL>:7180 and review the state of the services.
- You will likely see services in an unhealthy state.
- Address each of the services individually, starting or restarting them as needed. Monitor the log files to determine the action. Some services may just need to be restarted manually.
- When all services except Hue are started, you can close the browser session that is running the Kerberization wizard.
All services are Kerberized and the cluster is operational (except Hue).
Getting the Hue Service started
See the following document: Getting the Hue Service Started on Kerberized Cloudera with Isilon
The procedure for Kerberizing Cloudera with Isilon is complete. Address any configuration issues or alarms as needed.
Now you can test the configuration. For some basic Kerberos testing methodology, see Ambari HDP with Isilon 22.214.171.124 and Active Directory Kerberos Implementation. That document describes the following tests to validate that the cluster and Isilon are correctly Kerberized.
- Test without a valid ticket, then obtain a valid ticket.
- Browse the Hadoop root.
- Write a file.
- Run simple Yarn or TeraGen jobs.
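The tests listed above can be run by hand from any Hadoop client host. The sketch below strings them together; it is an illustration rather than the referenced document's procedure. The function names, the test principal, the target paths under /tmp, and the location of the MapReduce examples jar (the default shown assumes a parcel-based CDH installation) are all assumptions to adjust for your cluster. Setting DRY_RUN=1 prints the commands without executing them.

```shell
#!/bin/sh
# Print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Walk through the checks listed above.
# Usage: kerberos_smoke_test <principal> [path-to-examples-jar]
kerberos_smoke_test() {
    principal=$1
    # Assumed default jar location for a parcel-based CDH install.
    jar=${2:-/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar}
    run kdestroy                                          # start with no ticket
    run hdfs dfs -ls /                                    # should fail without a ticket
    run kinit "$principal"                                # obtain a ticket (prompts for a password)
    run hdfs dfs -ls /                                    # browse the Hadoop root
    run hdfs dfs -put /etc/hosts "/tmp/krb-test-$$"       # write a file
    run yarn jar "$jar" teragen 10000 "/tmp/teragen-$$"   # run a simple TeraGen job
}

# Example (hypothetical principal):
# kerberos_smoke_test hadoopuser@DOMAIN.COM
```

The first listing is expected to be rejected and the second to succeed; if the first one succeeds, simple authentication is probably still enabled on the HDFS Access Zone.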