Friday, March 21, 2014

Hadoop GroupMapping – LDAP Integration

By Vinay Shukla
LDAP provides a central source for maintaining users and groups within an enterprise. There are two ways to use LDAP groups within Hadoop. The first is to use OS-level configuration to read LDAP groups. The second is to explicitly configure Hadoop to use LDAP-based group mapping.
Here is an overview of the steps to configure Hadoop explicitly to use groups stored in LDAP.
  • Modify core-site.xml to point to LDAP for group mapping
  • Restart the HDFS NameNode & YARN ResourceManager
  • Verify LDAP-based group mapping
Prerequisites: You have access to the LDAP server and know its connection details.

Step 1: Modify core-site.xml to point to LDAP for group mapping

Back up your core-site.xml before modifying it. Below is a sample configuration to add to core-site.xml. Provide the bind user, bind password, and other properties specific to your LDAP server, and make sure the object classes and the user and group filters match the values used in your directory.
[xml]
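<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.bind.user</name>
  <value>cn=Manager,dc=hadoop,dc=apache,dc=org</value>
</property>
<!--
<property>
  <name>hadoop.security.group.mapping.ldap.bind.password.file</name>
  <value>/etc/hadoop/conf/ldap-conn-pass.txt</value>
</property>
-->
<property>
  <name>hadoop.security.group.mapping.ldap.bind.password</name>
  <value>hadoop</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.url</name>
  <value>ldap://localhost:389/dc=hadoop,dc=apache,dc=org</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.base</name>
  <value></value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
  <value>(&amp;(|(objectclass=person)(objectclass=applicationProcess))(cn={0}))</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
  <value>(objectclass=groupOfNames)</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
  <value>member</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
  <value>cn</value>
</property>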
[/xml]
Although the group mapping configuration supports reading the bind password from a file, that property is commented out in the example above because of a bug (HADOOP-10249).
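Before restarting anything, it can help to confirm that the filters in the sample configuration actually match entries in your directory. The sketch below assumes the OpenLDAP ldapsearch client is installed and reuses the URL, base DN, and bind DN from the sample. LDAP_USER and LDAP_USER_DN are hypothetical variables introduced here for illustration, and the two-step lookup (find the user entry, then find groups whose member attribute holds that user's DN) reflects my understanding of how LdapGroupsMapping queries the directory rather than a statement of its exact implementation.

```shell
#!/bin/sh
# Connection details copied from the sample configuration; replace with your own.
LDAP_URL="ldap://localhost:389"
BASE_DN="dc=hadoop,dc=apache,dc=org"
BIND_DN="cn=Manager,dc=hadoop,dc=apache,dc=org"

# User lookup: the configured user filter with {0} replaced by the user name.
user_filter() {
  printf '(&(|(objectclass=person)(objectclass=applicationProcess))(cn=%s))' "$1"
}

# Group lookup: the configured group filter ANDed with member=<user DN>.
group_filter() {
  printf '(&(objectclass=groupOfNames)(member=%s))' "$1"
}

# Query only when ldapsearch is installed and the (hypothetical) variables are set:
#   LDAP_USER    = short user name, for example alice
#   LDAP_USER_DN = the DN printed by the first search
if command -v ldapsearch >/dev/null 2>&1 && [ -n "${LDAP_USER:-}" ]; then
  # -x simple bind, -W prompt for the bind password, -b search base
  ldapsearch -x -H "$LDAP_URL" -D "$BIND_DN" -W -b "$BASE_DN" \
    "$(user_filter "$LDAP_USER")" dn
  if [ -n "${LDAP_USER_DN:-}" ]; then
    ldapsearch -x -H "$LDAP_URL" -D "$BIND_DN" -W -b "$BASE_DN" \
      "$(group_filter "$LDAP_USER_DN")" cn
  fi
fi
```

If the first search returns no entry, the user filter or object classes probably need adjusting. If the second returns nothing, check the group object class and the member attribute.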

Step 2: Restart Hadoop

Follow the instructions in the Hortonworks Data Platform documentation to restart the HDFS NameNode & YARN ResourceManager.

Step 3: Verify LDAP group mapping

Run the hdfs groups command. It fetches the groups for the current user from LDAP. Note that with LDAP group mapping configured, HDFS permissions can use groups defined in LDAP for access control.
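The verification step might look like the sketch below. It assumes hdfs groups prints one line per user in the form "user : group1 group2". The in_group helper and the EXPECTED_GROUP variable are hypothetical names introduced here, and the helper splits on spaces, so it will not handle group names that themselves contain spaces.

```shell
#!/bin/sh
# in_group GROUP: read "user : group1 group2" lines on stdin and succeed
# if GROUP appears as one of the space-separated group names.
in_group() {
  # Strip the "user :" prefix, put each group on its own line,
  # then look for an exact, literal whole-line match.
  sed 's/^[^:]*: *//' | tr ' ' '\n' | grep -qxF -- "$1"
}

if command -v hdfs >/dev/null 2>&1; then
  # Groups of the current user, as resolved through the configured mapping.
  hdfs groups
  # EXPECTED_GROUP (hypothetical) names an LDAP group you know the user is in.
  if [ -n "${EXPECTED_GROUP:-}" ]; then
    hdfs groups | in_group "$EXPECTED_GROUP" && echo "found $EXPECTED_GROUP"
  fi
fi
```

hdfs groups also accepts one or more user names as arguments, which is handy for checking a user other than the one you are logged in as.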

Conclusion

Since Hadoop offers two ways to use groups in LDAP, a basic question is when to use each. The OS-based group mapping is a Linux/Unix method and won’t work on Windows. The explicit group mapping covered in this post works on both Linux & Windows.
Let me know if you run into any issues with the steps in this post or have any comments on it. In the next post I will cover configuring the OS to read group information from LDAP.