Note: This topic is part of the Using Hadoop with OneFS - PowerScale Info Hub.
This article provides a high-level overview of how to use the Active Directory provider for Kerberized Hadoop access with the SFU-rfc2307 extension in AD.
This article considers one configuration methodology used within OneFS to facilitate Kerberized Hadoop on PowerScale. A cornerstone of the implementation is Active Directory's ability to provide UNIX identities for users in addition to their normal SIDs, through additional schema attributes that comply with RFC 2307. Using these features, you can simplify user mapping and identity management on PowerScale from a permissions management perspective. The RFC 2307 extension is not the only way to achieve this, but it provides an elegant and simple solution.
The following discussion includes considerations for implementing Kerberized Hadoop with AD.
You can enable RFC 2307 SFU support on the Active Directory provider using the CLI. Note that in some OneFS versions, the assume default domain switch is missing from the CLI. In that case, look for it in a maintenance release (MR).
After enabling these features, validate that lookups work for both the short and the long form of a user name.
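The short-name and long-name check can be expressed as a small consistency test. The following Python sketch is illustrative only and is not part of OneFS: the function names, the `EXAMPLE` domain, the `hdfs` account, and the UID value are all hypothetical, and the long form is assumed to be `DOMAIN\user`. The `resolve` argument stands in for whatever lookup you actually use to obtain a UID for a name.

```python
def name_forms(domain, user):
    """Return the short and long (DOMAIN\\user) forms of an account name."""
    return [user, f"{domain}\\{user}"]


def lookups_agree(resolve, domain, user):
    """Check that every name form resolves, and resolves to the same UID.

    `resolve` is a caller-supplied function that maps a name to a UID
    and returns None when the name is unknown.
    """
    uids = [resolve(name) for name in name_forms(domain, user)]
    return None not in uids and len(set(uids)) == 1


if __name__ == "__main__":
    # Hypothetical lookup table standing in for a real identity lookup.
    directory = {"hdfs": 10001, "EXAMPLE\\hdfs": 10001}
    print(lookups_agree(directory.get, "EXAMPLE", "hdfs"))
```

If the two forms resolve to different UIDs, or one of them does not resolve at all, the check fails, which is the condition the validation step is meant to catch.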
By enabling the Active Directory provider with SFU support for RFC 2307, you maintain a consistent user and identity mapping between the users executing Hadoop jobs and PowerScale. This lets you implement a standard PowerScale permission model based on the OneFS permission model with POSIX file permissions. Without SFU-rfc2307 support, PowerScale would need to map users to a different LDAP provider that can supply UNIX UIDs and GIDs for them.
For more information about the permissioning model, see the following series of multiprotocol articles:
What advantage does enabling SFU-rfc2307 offer? It provides UIDs and GIDs from Active Directory for your AD user accounts. The access token then contains the directory-service-based UID and GID along with the SID. You can set permissions directly against these AD identities to support full multiprotocol access.
|User's UNIX ID in Active Directory||User's Access Token in PowerScale|
The token validates that the Active Directory provider is pulling the correct information from Active Directory and that the UNIX identities are present.
AD is providing the correct UID for the users running jobs. The on-disk permission is based on UIDs and GIDs (as you can see in the token). Also, the permission model is based on POSIX authoritative permissions and is easily managed with existing tools, such as chown and chmod.
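Because the permissions are plain POSIX owner, group, and mode bits, they can be inspected and changed with standard tooling. The following Python sketch runs against a temporary local file rather than a OneFS path; the helper names and the file name are hypothetical. It shows the programmatic equivalent of chmod and reads back the owning UID and GID. Changing ownership (the equivalent of chown) usually requires elevated privileges, so it is not exercised here.

```python
import os
import stat
import tempfile


def describe_posix_permissions(path):
    """Return the owner UID, group GID, and symbolic mode string for a path."""
    info = os.stat(path)
    return {
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mode": stat.filemode(info.st_mode),
    }


def set_mode(path, mode):
    """Apply a numeric POSIX mode (the equivalent of chmod) and report the result."""
    os.chmod(path, mode)
    return describe_posix_permissions(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file standing in for Hadoop job output.
        sample = os.path.join(workdir, "job-output.txt")
        with open(sample, "w") as handle:
            handle.write("sample\n")
        # Owner read/write, group read, no access for others.
        print(set_mode(sample, 0o640))
```

On a client where the data is reached through a POSIX mount, the reported UID and GID would be the same numeric identities that Active Directory supplies for the account.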
Article ID: SLN319144
Last Date Modified: 07/08/2020 06:01 PM