Best Practices for Enterprise User Management in Hadoop Environment

Post on 22-Jan-2018


1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

BEST PRACTICES FOR ENTERPRISE USER MANAGEMENT IN HADOOP ENVIRONMENT

Sailaja Polavarapu, Sr. Software Engineer, Hortonworks

Dataworks Summit 2017 Munich

Don Bosco Durai, Cofounder & Chief Security Architect, Privacera


Don Bosco Durai

⬢ Cofounder and Chief Security Architect at Privacera

⬢ Committer on Apache Ranger and Apache Ambari

⬢ Contributor to most security-related Apache projects


Sailaja Polavarapu

⬢ Apache Ranger contributor since 2015

⬢ Apache Ranger Committer

⬢ Contributed major improvements to the Usersync module in Ranger

⬢ Currently working on the Hortonworks Security Team

⬢ Contact: spolavarapu@apache.org


Agenda

◆ Authentication and Users in Hadoop

◆ Integrating Ranger with AD/LDAP

◆ Common Use cases

◆ LDAP connection check tool

◆ Best practices

◆ Demo


Most commonly asked question

If I have Ranger, do I need Kerberos?


Why Authenticate Users?

Authentication

Authorization

Auditing


Service Types

Infrastructure: HDFS, Oozie, Storm, YARN, Hive Server, HBase, Zookeeper, Kafka

Apps: Zeppelin, Ambari Views, Ambari Admin, Ranger, Atlas, LogSearch


Infrastructure - Kerberos

[Diagram: Users, a Master Node (YARN Resource Manager, Hive Server, HDFS Name Node), and two worker nodes, Node 1 and Node 2 (each with a YARN Node Manager, an HDFS Data Node, and Linux processes), with the interactions numbered 1 to 6.]


Apps - Username & Password

[Diagram labels: Portals, Notebooks/Viewer, Hive Server2, Zeppelin, Ambari Views, HDFS, Ambari, Atlas, Ranger, BI Tools, Spark]


Knox - Gateway & SSO

Services: Ambari, WebHDFS (HDFS), Templeton (HCatalog), Stargate (HBase), Oozie, Hive/JDBC, Yarn RM, Storm

Services UIs: Name Node UI, Job History UI, Oozie UI, HBase UI, Yarn UI, Spark UI, Ambari UI, Ranger Admin Console


Authentication and User Source

Hive JDBC: LDAP/Kerberos

Web Apps (Zeppelin, Ranger, Ambari, Atlas): LDAP

CLI/API (HDFS, Hive Beeline, HBase, etc.): Kerberos


User/Group Synchronization in Ranger

[Diagram: Ranger UserSync, Ranger Admin, Database, and AD/LDAP, connected by a "Sync Users/Groups" flow.]


User sources

⬢ AD/LDAP

– Syncs users and groups from LDAP Organizational Units (OUs)

⬢ Unix Native Users

– Syncs users and groups from the /etc/passwd and /etc/group files

⬢ File Sources

– Syncs users and groups from a file specified in the configuration

– Supports several file formats, such as CSV and JSON


Integrating Ranger with AD/LDAP

⬢ Understanding your deployment

– What kind of directory server is in use: Active Directory, OpenLDAP, etc.?

– Is the communication between the Hadoop cluster and the directory server secure or unsecure?

– Do you have at least a read-only LDAP user for binding?

– Are there any firewall restrictions on communication between Hadoop and the directory server?

– Is Centrify being used as an LDAP proxy?

– Does your AD have spaces or special characters in usernames?


Requirements for User Management

⬢ Gathering details of the directory server structure

– AD/LDAP URL and bind credentials

– Are there specific OU(s) for Hadoop users and groups?

– How many users and groups are in the domain and/or in the OUs?

– What filters for user search and/or group search should be configured to limit the users and groups synced to Hadoop?

– Which attributes are available on the directory server for users and groups, such as uid, sAMAccountName, memberof, objectclass, etc.?

– Are authorization policies to be configured at the user level or the group level?


Sample Active Directory Server Structure

DC=ad01,DC=hadoop,DC=com

  OU=Hadoop Users

    sAMAccountName=jdoe cn=John Doe

    sAMAccountName=bhall cn=Bob Hall

    sAMAccountName=asmith cn=Andy Smith

    sAMAccountName=acaroll cn=Ashley Caroll

  OU=Hadoop Groups

    cn=hdp_testing

    cn=dev_ops

    cn=hdp_admin

(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))


Use Case

⬢ Sync all the users that belong to any of the groups “hdp_testing”, “hdp_admin”, or “dev_ops”


User based Search

⬢ Filter based on “memberof” attribute of the user


(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)

(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)

(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))
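A filter like this gets error-prone to type by hand as the number of groups grows. The following is a minimal sketch, not part of Ranger, that assembles the same OR filter from a list of group names. The helper names `escape_filter_value` and `member_of_filter` are hypothetical, and the escaping covers only the characters that RFC 4515 reserves in filter values; group names that contain DN-special characters such as commas would need DN escaping as well.

```python
def escape_filter_value(value):
    """Escape characters that are reserved in an LDAP filter value (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def member_of_filter(group_names, groups_base_dn):
    """Build an OR filter matching users whose memberof lists any given group."""
    clauses = [
        "(memberof=cn={},{})".format(escape_filter_value(name), groups_base_dn)
        for name in group_names
    ]
    return "(|" + "".join(clauses) + ")"


# Group names and base DN are the ones used on the slides.
user_filter = member_of_filter(
    ["hdp_testing", "hdp_admin", "dev_ops"],
    "ou=Hadoop Groups,dc=hortonworks,dc=com",
)
print(user_filter)
```

The result is a single-line filter, which is the form a search-filter field would normally take.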


sAMAccountName

(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))

OU=Hadoop Users,dc=hortonworks,dc=com


Group based Search

⬢ Filter based on the group name or “cn” attribute of the group

(|(cn=hdp_*)(cn=dev_*))
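To preview which groups a wildcard filter of this shape would select, the matching can be approximated offline. This is a rough sketch under stated assumptions: it uses Python's `fnmatch` as a stand-in for LDAP substring matching (adequate here because the patterns contain only `*`), it treats `cn` comparison as case-insensitive, which is typical for directory servers but should be confirmed for yours, and the group names `finance` and `HDP_Audit` are invented for illustration.

```python
from fnmatch import fnmatchcase

GROUP_PATTERNS = ("hdp_*", "dev_*")  # the cn patterns from the slide's filter


def matches_group_filter(cn, patterns=GROUP_PATTERNS):
    """Return True if the group name matches any pattern, ignoring case."""
    name = cn.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


# "finance" and "HDP_Audit" are hypothetical groups added for contrast.
candidate_groups = ["hdp_testing", "hdp_admin", "dev_ops", "finance", "HDP_Audit"]
synced = [g for g in candidate_groups if matches_group_filter(g)]
print(synced)
```

A real directory search remains the authority on what matches; this only helps reason about naming conventions before the filter is configured.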


cn

OU=Hadoop Groups,dc=hortonworks,dc=com

member

(|(cn=dev_*)(cn=hdp_*))
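The values shown on the two configuration slides can be collected into one place, as in the sketch below. The slides show only the values, not the field names, so the property keys here are an assumption: they are the names recalled from Ranger's usersync configuration (ranger-ugsync-site) and should be checked against the Ranger version in use. The `render_properties` helper is hypothetical.

```python
# Property names are assumptions; verify them against your Ranger release.
USER_FILTER = (
    "(|(memberof=cn=hdp_testing,ou=Hadoop Groups,dc=hortonworks,dc=com)"
    "(memberof=cn=hdp_admin,ou=Hadoop Groups,dc=hortonworks,dc=com)"
    "(memberof=cn=dev_ops,ou=Hadoop Groups,dc=hortonworks,dc=com))"
)

usersync_settings = {
    # User based search (values from the user search slide)
    "ranger.usersync.ldap.user.nameattribute": "sAMAccountName",
    "ranger.usersync.ldap.user.searchfilter": USER_FILTER,
    "ranger.usersync.ldap.user.searchbase": "OU=Hadoop Users,dc=hortonworks,dc=com",
    # Group based search (values from the group search slide)
    "ranger.usersync.group.nameattribute": "cn",
    "ranger.usersync.group.searchbase": "OU=Hadoop Groups,dc=hortonworks,dc=com",
    "ranger.usersync.group.memberattributename": "member",
    "ranger.usersync.group.searchfilter": "(|(cn=dev_*)(cn=hdp_*))",
}


def render_properties(settings):
    """Render a dict as key=value lines in insertion order."""
    return "\n".join("{}={}".format(key, value) for key, value in settings.items())


print(render_properties(usersync_settings))
```

Keeping the user and group settings side by side makes it easier to see that the user filter refers to groups under the same OU that the group search base points at.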


LDAP connection check tool

⬢ Command line tool

⬢ Used for

– Discovering various LDAP attributes

– Validating the LDAP settings in Ranger, Ambari, or HDFS LDAP Group Mapping

– Retrieving the total number of users and/or groups

⬢ Available as part of the Ranger installation

⬢ Requires basic information such as the LDAP URL and bind credentials, supplied through either

– a command line interface, or

– a template properties file updated with the values specific to the setup


Tool usage

⬢ usage: run.sh

 -a         ignore authentication properties
 -d <arg>   {all|users|groups}
 -h         show help.
 -i <arg>   Input file name
 -o <arg>   Output directory
 -r <arg>   {all|users|groups}

⬢ All of the above parameters are optional
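When the tool is driven from a script, the argument list can be assembled and validated before anything is executed. The sketch below uses only the flags listed in the usage text above. The helper `build_tool_command`, the script path `./run.sh`, and the file names `input.properties` and `output` are assumptions for illustration. The usage text does not say what `-d` and `-r` stand for; given the tool's stated purposes they presumably correspond to discovering attributes and retrieving users/groups, but that is an inference.

```python
import shlex

ALLOWED_SCOPES = ("all", "users", "groups")


def build_tool_command(script="./run.sh", d=None, r=None,
                       input_file=None, output_dir=None, ignore_auth=False):
    """Assemble an argument list for run.sh; every option is optional."""
    cmd = [script]
    if ignore_auth:
        cmd.append("-a")
    for flag, scope in (("-d", d), ("-r", r)):
        if scope is not None:
            if scope not in ALLOWED_SCOPES:
                raise ValueError("{} expects one of {}".format(flag, ALLOWED_SCOPES))
            cmd += [flag, scope]
    if input_file is not None:
        cmd += ["-i", input_file]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


# Hypothetical file and directory names.
cmd = build_tool_command(r="users", input_file="input.properties", output_dir="output")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the script.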


CLI option for the Ldap tool

⬢ A CLI prompt is provided when no input file is specified:

Ldap url [ldap://ldap.example.com:389]:
Bind DN [cn=admin,ou=users,dc=example,dc=com]:
Bind Password:
User Search Base [ou=users,dc=example,dc=com]:
User Search Filter [cn=user1]:
Sample Authentication User [user1]:
Sample Authentication Password:


Demo


Best practices and Strategies

⬢ Use LDAP/AD for application service authentication

⬢ Use Ranger for authorization

⬢ Verify that the truststore certificates are updated across the system when SSL is used

⬢ Use the LDAP connection check tool to

– discover LDAP configuration attributes

– verify the number of users and groups to be synced to Ranger

⬢ Verify that case conversion and special characters in user and group names are handled uniformly across the Hadoop environment

– Matching rules must be used in core-site.xml as well as in Ranger
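The last point is easy to underestimate: if the sync step lowercases names while Hadoop resolves them unchanged, a policy written for the synced name will not apply to the authenticated user. The sketch below illustrates the mismatch in plain Python. It does not call any Ranger or Hadoop API; the function names, the `case` parameter, and the sample name `JDoe` are hypothetical.

```python
def normalize_name(name, case="none"):
    """Apply one case-conversion rule to a user or group name."""
    if case == "lower":
        return name.lower()
    if case == "upper":
        return name.upper()
    if case == "none":
        return name
    raise ValueError("unknown case rule: {}".format(case))


def policy_applies(policy_user, directory_name, sync_case, resolve_case):
    """Check whether a policy written for the synced name matches the resolved name."""
    synced = normalize_name(directory_name, sync_case)
    resolved = normalize_name(directory_name, resolve_case)
    return policy_user == synced and synced == resolved


# Same rule on both sides: the policy for "jdoe" applies.
consistent = policy_applies("jdoe", "JDoe", sync_case="lower", resolve_case="lower")

# Sync lowercases, but name resolution keeps the original case: no match.
inconsistent = policy_applies("jdoe", "JDoe", sync_case="lower", resolve_case="none")

print(consistent, inconsistent)
```

The same reasoning applies to special characters such as spaces: whatever substitution one component applies, the other components need to apply it too.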


user@ranger.apache.org
