Secure Hadoop Clusters on Windows Platform Hadoop Users Group Bucharest Jan 29/2015 Remus Rusanu

Secure Hadoop clusters on Windows platform

Jul 18, 2015


Remus Rusanu
Transcript
Page 1: Secure Hadoop clusters on Windows platform

Secure Hadoop Clusters on Windows Platform

Hadoop Users Group Bucharest, Jan 29/2015

Remus Rusanu

Page 2: Secure Hadoop clusters on Windows platform

about:me

• SQL Server engine developer since 2001

• Worked on HDInsight service (Azure Hadoop offering) and on PDW appliance Hadoop region

• Hive contributor: vectorized execution engine HIVE-4160

• Hadoop contributor: Windows secure YARN containers YARN-2190

• @rusanu

• Stack Overflow user 105929

Page 3: Secure Hadoop clusters on Windows platform

Integrate Hadoop with Windows Security

• Integrate your cluster with the existing Active Directory domain

• Integrated security
  • Use Windows domain users
  • No need for local users, local passwords

• Single sign-on
  • Only provide password when opening OS session

• Group membership provided from AD groups

Page 4: Secure Hadoop clusters on Windows platform

Benefits

• Group membership based access control
  • Domain\HadoopUsers: granted access to Hadoop cluster
  • Domain\NewHire is added to HadoopUsers
  • Domain\NewHire has access to Hadoop cluster

• Centralized password control
  • Only administer the Active Directory domain

• Integrate with the rest of the enterprise that uses AD

Page 5: Secure Hadoop clusters on Windows platform

What can leverage AD based Access Control

• HDFS

• M/R queues

• HTTP interfaces (Web UI)

• Hadoop ecosystem stack
  • Oozie proxy (Hadoop super)

Page 6: Secure Hadoop clusters on Windows platform

Secure Hadoop clusters

• “Kerberized” cluster
  • Users are authenticated using Kerberos
  • Services authenticate each other using Kerberos

• Data encryption in transit
  • RPC, block transfer, HTTP
  • No data encryption at rest

• Permission control for containers (tasks)
  • Containers cannot access service (NM) files

• Process isolation
  • Containers cannot access each other's files

• SecureMode: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html

Page 7: Secure Hadoop clusters on Windows platform

Windows and Secure Hadoop Clusters

• Windows does support integrated authentication and single sign-on with a Kerberized cluster
  • Can be a Linux Kerberized cluster too, with proper KDC configuration

• Requires the allowtgtsessionkey key in the registry
  • See KB308339: Registry Key to Allow Session Keys to Be Sent in Kerberos Ticket-Granting-Ticket: http://support.microsoft.com/kb/308339
  • Not allowed for LUA, see KB2627903: Access to Session Keys not possible using a restricted Token: http://support.microsoft.com/kb/2627903

• This solves the problem of authenticating the user at the cluster periphery (job submit)

Page 8: Secure Hadoop clusters on Windows platform

Securing Hadoop Services

• Same as Linux configuration
  • hadoop.security.authentication: kerberos
  • hadoop.security.authorization: true
  • hadoop.http.filter.initializers: org.apache.hadoop.security.AuthenticationFilterInitializer
  • hadoop.http.authentication.type: kerberos
  • Etc. Refer to the Secure Mode guide for your installation.

• Use ktpass.exe to obtain keytab files for NT domain users
  • https://technet.microsoft.com/en-us/library/cc753771.aspx

• Configure the KDC in krb5.ini for the realm (domain)

• Enable AES128 and AES256 for the user accounts in AD
  • msDS-SupportedEncryptionTypes

• Hadoop services are not required to run as the service accounts; Hadoop will use principal names and keytab files anyway
  • I recommend it nonetheless; it is confusing otherwise

• Java runtime must contain the Unlimited Strength JCE policy files
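The Kerberos settings listed on this slide normally live in core-site.xml. As a rough illustration only, the sketch below renders those four properties into Hadoop's `<configuration>`/`<property>` layout. The helper name `render_site_xml` and the dictionary `SECURE_CORE_SITE` are made up for this example; a real deployment needs the additional principal and keytab settings described in the Secure Mode guide.

```python
import xml.etree.ElementTree as ET

# The four properties named on the slide, written in lower case
# (Hadoop property names and Java class names are case-sensitive).
SECURE_CORE_SITE = {
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
    "hadoop.http.filter.initializers":
        "org.apache.hadoop.security.AuthenticationFilterInitializer",
    "hadoop.http.authentication.type": "kerberos",
}


def render_site_xml(properties):
    """Render a dict of settings as a Hadoop *-site.xml document string.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print when the running Python supports it (3.9+).
    if hasattr(ET, "indent"):
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml(SECURE_CORE_SITE))
```

The generated text can be compared against an existing core-site.xml to spot capitalization mistakes in property names, which Hadoop silently ignores.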

Page 9: Secure Hadoop clusters on Windows platform

Group Membership Provider

• LDAP provider already works
  • hadoop.security.group.mapping: org.apache.hadoop.security.LdapGroupsMapping

• HDFS access control

• M/R queues access control

• Add LDAP_MATCHING_RULE_IN_CHAIN
  • hadoop.security.group.mapping.ldap.search.attr.member: member:1.2.840.113556.1.4.1941:
  • This rule is limited to filters that apply to the DN. It is a special "extended" match operator that walks the chain of ancestry in objects all the way to the root until it finds a match
  • See https://msdn.microsoft.com/en-us/library/aa746475%28v=vs.85%29.aspx
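To make the effect of the matching rule concrete, here is a minimal sketch of the kind of LDAP filter that results when the member attribute is set to `member:1.2.840.113556.1.4.1941:`. It assumes the group filter is `(objectClass=group)` and that the user's DN is substituted after the `=` sign; the function names and the example DN are hypothetical, and the exact filter Hadoop builds may differ between versions.

```python
# OID of LDAP_MATCHING_RULE_IN_CHAIN, as given on the slide.
MATCHING_RULE_IN_CHAIN = "1.2.840.113556.1.4.1941"

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def escape_filter_value(value):
    """Escape a value so it can be embedded safely in an LDAP filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def nested_group_filter(user_dn, base_filter="(objectClass=group)"):
    """Build a filter matching every group that contains user_dn,
    directly or through nested group membership.

    Assumption: base_filter mirrors a typical group search filter.
    """
    member_attr = f"member:{MATCHING_RULE_IN_CHAIN}:"
    return f"(&{base_filter}({member_attr}={escape_filter_value(user_dn)}))"


if __name__ == "__main__":
    # Hypothetical DN, for illustration only.
    print(nested_group_filter("CN=NewHire,OU=Staff,DC=example,DC=com"))
```

Without the OID, the same filter would match only groups that list the user as a direct member, so users who get access through nested AD groups would appear to have no Hadoop groups.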

Page 10: Secure Hadoop clusters on Windows platform

Windows Secure Container Executor

• Windows platform equivalent of LinuxContainerExecutor
  • http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/SecureContainer.html
  • YARN-2190

• Leverages the S4U (Service for User) Kerberos extension
  • A process that has the SE_TCB (Trusted Computing Base) privilege can impersonate an arbitrary user without providing a password for said user

• Creates an isolated environment for a container and then launches the container impersonating the user

Page 11: Secure Hadoop clusters on Windows platform

Configuring WSCE

• Requires a privileged NT service:
  • %HADOOP_HOME%\bin\winutils /service
  • Must run as LocalSystem
  • Equivalent of LinuxContainerExecutor’s container executor binary with setuid set and owned by root

• Requires %HADOOP_HOME%\etc\hadoop\wsce_site.xml
  • impersonate.allowed: users allowed to be impersonated
  • impersonate.denied: users explicitly forbidden from being impersonated

• Very powerful: launch a process as arbitrary user
  • Validates that wsce_site.xml is writable only by Administrators
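The interplay of the two impersonation lists can be pictured with a small sketch. This is not the winutils implementation: it assumes both settings are comma-separated lists of plain user names, that names compare case-insensitively, and that an explicit deny overrides an allow. Group entries and any wildcard handling are left out, and the function names are hypothetical.

```python
def parse_principals(value):
    """Split a comma-separated setting into a set of lower-cased names."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def may_impersonate(user, allowed, denied):
    """Return True when user is listed in allowed and not listed in denied.

    Assumption: a deny entry always wins over an allow entry.
    """
    name = user.strip().lower()
    return (name in parse_principals(allowed)
            and name not in parse_principals(denied))


if __name__ == "__main__":
    # Hypothetical values for impersonate.allowed / impersonate.denied.
    allowed = "hadoopuser1, hadoopuser2, administrator"
    denied = "administrator"
    for candidate in ("HadoopUser1", "Administrator", "stranger"):
        print(candidate, may_impersonate(candidate, allowed, denied))
```

Keeping privileged accounts on the denied list is the point of having both settings: a broad allow entry can then never be used to launch a container as an administrator.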

Page 12: Secure Hadoop clusters on Windows platform

Demo

Page 13: Secure Hadoop clusters on Windows platform

//TODO

• Forests, domain trust etc.
  • Currently works only with one single domain
  • Hadoop infrastructure is modeled after the Linux security model and does not support “domain\user”

• Delegation
  • S4U extension does not support delegation
  • Container cannot access resources outside the node host
    • E.g. Sqoop accessing SQL Server under Integrated Security won’t work

• Deployment/configuration support
  • Ambari (Hortonworks)

Page 14: Secure Hadoop clusters on Windows platform

Q & A