An Approach for Multi-tenant Applications with Apache Knox
Larry McCay, Architect and Manager for Security Infra - Hortonworks
Sumit Gupta, Technical Lead for Knox - Hortonworks
April 5th 2017 – DataWorks Summit Munich
© Hortonworks Inc. 2011 – 2017. All Rights Reserved
Disclaimer
This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately never be developed.
Product capabilities are based on information that is publicly available within the Apache Software Foundation websites (“Apache”). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.
Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Since this document may contain an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
Agenda
Apache Knox Overview
Topologies
Identity Assertion and Authorization
Multi-tenant Applications
– What are they?
– What are the concerns?
Loanscore SaaS Application
– Overview, Requirements, Design
Loanscore via Knox
– Design
Demo
Q&A
Apache Knox
Apache Knox History and Community Growth
Mar 2013: Entered Incubator
Oct 2013: 0.1.0 - 0.3.0 Incubator Releases
Feb 2014: Graduates to Apache TLP
Apr 2014: 0.4.0 TLP Release
Nov 2014: 0.5.0
May 2015: 0.6.0
Dec 2015: 0.7.0
Feb 2016: 0.8.0
Apr/Aug 2016: 0.9.0/0.9.1
Nov 2016: 0.10.0
Dec 2016: 0.11.0
Mar 2017: 0.12.0
TBD: 1.0.0 Target Release Date
• Committers: 17
• Contributors from: Hortonworks, IBM, CGI, Uber, Oracle, Blue Talon
Apache 0.12.0/HDP 2.6
• Client SDK/DSL Improvements
• Apache Zeppelin Proxying
• YARN RM UI HA Support
• Knox Token Service
• Solr API and UI
Apache 0.11.0
• LDAP Improvements
• Hadoop Group Lookup Support
• Phoenix Server Support (Avatica)
• Management UI
• Metrics
@apache_knox
Apache Knox Overview
Proxying Services
The primary goal of the Apache Knox project is to provide access to Apache Hadoop via proxying of HTTP resources.
Authentication Services
Authentication for REST API access as well as WebSSO flow for UIs. LDAP/AD, Header based PreAuth, Kerberos, SAML, OAuth are all available options.
Client DSL/SDK Services
Client development can be done by scripting with the DSL or by using the KnoxShell classes directly as an SDK.
[Diagram: Apache Knox services, in three panels (Authentication Services, Proxying Services, Client DSL/SDK Services). Labels shown: WebSSO, Authentication and Federation providers, KnoxSSO/Token, SAML, OAuth, LDAP/AD, SPNEGO, Header Based; HTTP Proxying Services, UIs, REST APIs, WebSockets, Hadoop UIs, Hive, Ambari, HBase, WebHCat, WebHDFS, YARN, YARN RM, Ranger, Zeppelin, Oozie, Phoenix, Gremlin; Groovy based DSL, KnoxShell SDK, Token Sessions, REST API Classes, SQL/DB.]
Knox Topologies
Which services to proxy
– For instance: Hive, WebHDFS, WebHCat, HBase, etc.
Unique URLs per topology
– For instance: https://localhost:8443/gateway/TOPOLOGY/webhdfs/v1
Separate Hadoop clusters
– For example: dev.xml and prod.xml
Different access requirements for the same cluster (through providers)
– For example: token.xml and basic.xml
Tenant specific access to the Knox services
– For example: acme1.xml and acme2.xml
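As a minimal sketch of the URL-per-topology idea, the Python below derives a topology name from its descriptor file name (dev.xml becomes dev) and builds the matching gateway URL from the example on this slide. The helper names `topology_name` and `topology_url` are hypothetical and exist only for illustration; they are not part of Knox.

```python
# Host, port and "gateway" path are taken from the example URL on the slide.
GATEWAY_BASE = "https://localhost:8443/gateway"


def topology_name(descriptor_file: str) -> str:
    """Derive the topology name from its descriptor file, e.g. dev.xml -> dev."""
    if not descriptor_file.endswith(".xml"):
        raise ValueError(f"not a topology descriptor: {descriptor_file}")
    return descriptor_file[: -len(".xml")]


def topology_url(topology: str, service_path: str) -> str:
    """Build the gateway URL under which a topology exposes a service."""
    return f"{GATEWAY_BASE}/{topology.strip('/')}/{service_path.strip('/')}"


if __name__ == "__main__":
    # One URL space per cluster (dev, prod) or per tenant (acme1, acme2).
    for descriptor in ("dev.xml", "prod.xml", "acme1.xml", "acme2.xml"):
        print(topology_url(topology_name(descriptor), "webhdfs/v1"))
```

Because each descriptor yields its own URL prefix, the same WebHDFS service path can be offered with different providers or to different tenants without the paths colliding.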
Identity Assertion and Authorization
Establish the effective identity
Can alter the effective identity through:
– Principal mapping
– Regular expressions
– Concatenation of prefixes, suffixes
Establishes security context for service level authorization checks through:
– The principal and group mapping or transforms described above
– Group lookup
Service Level Authorization for the effective user
– Simple ACL based authorization provider
– Ranger Knox plugin
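The three ways of altering the effective identity, followed by an authorization check against it, can be emulated in a few lines of plain Python. This is a sketch of the concepts only: it does not call Knox, the function names are hypothetical, and the ACL check is a deliberately simplified stand-in rather than the semantics of the Knox ACL provider or the Ranger plugin.

```python
import re
from typing import Dict, Set


def map_principal(user: str, mapping: Dict[str, str]) -> str:
    """Principal mapping: replace a known principal, pass others through."""
    return mapping.get(user, user)


def regex_transform(user: str, pattern: str, template: str) -> str:
    """Regular expression transform: rewrite the principal if it matches."""
    match = re.fullmatch(pattern, user)
    return match.expand(template) if match else user


def concat(user: str, prefix: str = "", suffix: str = "") -> str:
    """Concatenation: add a prefix and/or suffix to the principal."""
    return f"{prefix}{user}{suffix}"


def is_authorized(user: str, groups: Set[str],
                  allowed_users: Set[str], allowed_groups: Set[str]) -> bool:
    """Simplified ACL (an assumption): allow a listed user or a listed group."""
    return user in allowed_users or bool(groups & allowed_groups)


if __name__ == "__main__":
    print(map_principal("guest", {"guest": "hdfs_guest"}))
    print(regex_transform("bob@goodloans.com", r"(.*)@(.*?)\..*", r"\1_\2"))
    effective = concat("bob", suffix="_goodloans")
    print(effective, is_authorized(effective, {"originators"},
                                   set(), {"originators"}))
```

The point of the ordering is that authorization always runs against the effective identity, so a policy written for bob_goodloans never applies to a bob from another tenant.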
Multi-tenant Applications
What is a Multi-tenant Application?
Shared Infrastructure
– Deployment
– Application
– Data
Account Context
– Users have accounts within an Organization’s Account
– Each organization is a tenant
Multi-tenancy Concerns
Data Protection
– Tenants cannot view, modify or delete each other’s data
– Tenant admins may only affect tenant specific settings
– Application admins cannot access tenant data
Authentication
– Users authenticate using their typical or chosen usernames
– Security context must include tenant membership (username ‘bob’ is too ambiguous)
– Only authenticated and authorized users may access the system
– Authentication provider flexibility
• Application managed providers
• Tenant specific provider integrations
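The data protection and authentication concerns above can be pictured as a security context that always carries tenant membership. The sketch below is one possible model, not the application's actual design: the `SecurityContext` class, its fields and the two check functions are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecurityContext:
    username: str
    tenant: Optional[str]       # None models an application admin with no tenant
    tenant_admin: bool = False

    @property
    def qualified_name(self) -> str:
        """A bare 'bob' is ambiguous, so qualify the username with the tenant."""
        return f"{self.username}_{self.tenant}" if self.tenant else self.username


def can_access_data(ctx: SecurityContext, data_tenant: str) -> bool:
    """Only members of the owning tenant may touch that tenant's data."""
    return ctx.tenant is not None and ctx.tenant == data_tenant


def can_change_settings(ctx: SecurityContext, settings_tenant: str) -> bool:
    """Tenant admins may only affect their own tenant's settings."""
    return ctx.tenant_admin and ctx.tenant == settings_tenant


if __name__ == "__main__":
    bob = SecurityContext("bob", "goodloans")
    other_bob = SecurityContext("bob", "unwise")
    print(bob.qualified_name, other_bob.qualified_name)
    print(can_access_data(bob, "goodloans"), can_access_data(other_bob, "goodloans"))
```

Two users named bob in different tenants produce different qualified names, and an application admin (tenant set to None) fails every tenant data check.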
Loanscore SaaS Application
Loanscore SaaS Application
Continually improve risk assessment with a central risk model, analytics and machine learning with tenant specific thresholds
Application Provides
– machine learning capabilities
– models for scoring risk
– small businesses and individuals can be scored
– configurable datasources (e.g. Yelp)
Tenants are Lending Institutions
– Users are employees of the lending institution (e.g. an originator)
– Tenant specific authentication integrations
– Tenants have their own configuration/settings
– Tenants get their own sub-domain and branding
Loanscore SaaS Application
[Diagram: Loanscore SaaS. Components: Loan Scoring Business Logic and Branding; Tenant Specific Authentication (login form, LDAP, SAML, etc.) against a SAML IdP or corporate AD/LDAP; User Disambiguation for access to Hadoop (bob -> bob_goodloans); Hadoop Access (Kerberos + doas).]
Authentication
The application must account for authentication configuration per tenant. This covers different LDAP search bases within a shared LDAP server, tenant specific LDAP servers, and IdP integrations.
Business Logic of the App
The business logic and branding of the application for each tenant.
User Disambiguation
The effective security context for backend interactions must contain the tenant affiliation for authorization policy to be enforced properly.
Hadoop Access Patterns
REST API calls to Hadoop services generally require Kerberos+doas for secure clusters.
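To make the doas part of this access pattern concrete, the sketch below builds a WebHDFS request URL that asks Hadoop to act on behalf of the disambiguated user. The namenode address and the /tenants/goodloans directory are assumptions made up for the example, and `webhdfs_url` is a hypothetical helper. The Kerberos (SPNEGO) negotiation that a secure cluster also requires is not shown.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base: str, path: str, op: str, doas: str) -> str:
    """Build a WebHDFS URL that impersonates `doas` for operation `op`."""
    query = urlencode({"op": op, "doas": doas})
    # quote() leaves "/" intact, so the HDFS path keeps its structure.
    return f"{base.rstrip('/')}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Hypothetical namenode address and tenant directory.
    print(webhdfs_url("http://namenode.example.com:50070",
                      "/tenants/goodloans", "LISTSTATUS", "bob_goodloans"))
```

Without Knox, the application itself has to hold Kerberos credentials and append the doas parameter on every call, which is the burden the later slides move into the gateway.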
Loanscore SaaS Application v2.0
[Diagram: Loanscore SaaS v2.0. Components: Loan Scoring Business Logic and Branding; Tenant Specific Authentication (login form, LDAP, SAML, etc.) against a SAML IdP or corporate AD/LDAP; User Disambiguation for access to Hadoop (bob -> bob_goodloans); Hadoop Access (Kerberos + doas).]
Loanscore SaaS with Knox Services
[Diagram: Loanscore SaaS behind Knox. Loanscore SaaS: Business Logic, Branding, Knox Client SDK. Knox Authentication and Proxying Services: Tenant Specific Authentication (SAML IdP, corporate AD/LDAP), User Disambiguation (identity assertion), KnoxSSO, Proxying. Hadoop access: kerberos+doas or simple auth.]
Proxying Services
By proxying the app through Apache Knox, the gateway is able to require authentication before the user accesses the actual application. Hadoop API access is also proxied through Knox, and the dispatch within the gateway handles the Kerberos+doas and user disambiguation requirements.
Authentication Services
The authentication or federation provider within the proxying topology for the tenant may contain the actual authentication configuration or may redirect to KnoxSSO for a WebSSO flow.
Client SDK
The backend of the application may consume Hadoop REST APIs via the KnoxShell client classes.
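One way to picture a per-tenant proxying topology is to generate its descriptor. The sketch below assembles an XML document in the general shape of a Knox topology (a gateway section with providers, then services) using only the Python standard library. Treat the details as assumptions to be checked against the Knox User's Guide: the provider names `SSOCookieProvider` and `Concat`, the parameter names `sso.authentication.provider.url` and `concat.suffix`, the KnoxSSO URL layout and the namenode address are all illustrative, and `tenant_topology` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def add_provider(gateway: ET.Element, role: str, name: str,
                 params: Optional[Dict[str, str]] = None) -> None:
    """Append a <provider> element with optional <param> children."""
    provider = ET.SubElement(gateway, "provider")
    ET.SubElement(provider, "role").text = role
    ET.SubElement(provider, "name").text = name
    ET.SubElement(provider, "enabled").text = "true"
    for key, value in (params or {}).items():
        param = ET.SubElement(provider, "param")
        ET.SubElement(param, "name").text = key
        ET.SubElement(param, "value").text = value


def tenant_topology(tenant: str, webhdfs_backend: str) -> str:
    """Return a topology descriptor (as XML text) for one tenant."""
    topology = ET.Element("topology")
    gateway = ET.SubElement(topology, "gateway")
    # Federation provider that redirects to the tenant's KnoxSSO topology.
    # The URL layout is an assumption for illustration.
    add_provider(gateway, "federation", "SSOCookieProvider", {
        "sso.authentication.provider.url":
            f"https://{tenant}.loanscore.com/gateway/{tenant}-sso/api/v1/websso",
    })
    # Identity assertion appends the tenant name: bob -> bob_<tenant>.
    add_provider(gateway, "identity-assertion", "Concat",
                 {"concat.suffix": f"_{tenant}"})
    service = ET.SubElement(topology, "service")
    ET.SubElement(service, "role").text = "WEBHDFS"
    ET.SubElement(service, "url").text = webhdfs_backend
    return ET.tostring(topology, encoding="unicode")


if __name__ == "__main__":
    print(tenant_topology("goodloans", "http://namenode.example.com:50070/webhdfs"))
```

Generating descriptors from a template keeps every tenant's topology structurally identical, so onboarding a tenant changes only the tenant name and its authentication settings.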
Loanscore SaaS with Knox Services
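[Diagram: request flow for tenant Goodloans. Tenant URLs: https://goodloans.loanscore.com and https://unwise.loanscore.com. Knox topologies: goodloans-sso.xml (KnoxSSO, authenticating against a SAML IdP or corporate AD/LDAP with username: bob, password: ***) and goodloans.xml (proxying with user disambiguation). The Loanscore application calls Hadoop with doas=bob_goodloans.]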
1. Goodloans originator bob navigates to the Goodloans loanscore app URL
2. Since he has yet to authenticate, he is redirected to the KnoxSSO topology for Goodloans
3. He is authenticated against the identity provider configured for Goodloans. He provides his username and password (bob:***)
4. Upon successful authentication he is redirected back to the loanscore application and granted access
5. The user principal propagated to the loanscore app has been disambiguated by adding the tenant name to the end of the username (bob_goodloans) in the identity assertion provider
6. The loanscore app adds a file to a tenant specific directory within HDFS using KnoxShell SDK classes
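The six steps can be traced with a small simulation. This is a sketch of the decision logic only, not of Knox itself: every function name is hypothetical, the tenant is assumed to be the first label of the host name, and the /tenants/<tenant> directory layout is invented for the example.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def tenant_from_url(url: str) -> str:
    """Step 1: the tenant is taken from the sub-domain of the app URL."""
    host = urlparse(url).hostname or ""
    return host.split(".")[0]


def sso_topology(tenant: str) -> str:
    """Step 2: each tenant has its own KnoxSSO topology, e.g. goodloans-sso."""
    return f"{tenant}-sso"


def effective_user(username: str, tenant: str) -> str:
    """Step 5: identity assertion appends the tenant name to the username."""
    return f"{username}_{tenant}"


def tenant_directory(tenant: str) -> str:
    """Step 6: hypothetical tenant specific HDFS directory layout."""
    return f"/tenants/{tenant}"


def handle_request(url: str, authenticated_user: Optional[str] = None) -> Dict[str, str]:
    """Redirect unauthenticated users to SSO; otherwise grant tenant-scoped access."""
    tenant = tenant_from_url(url)
    if authenticated_user is None:
        return {"action": "redirect", "topology": sso_topology(tenant)}
    return {
        "action": "grant",
        "doas": effective_user(authenticated_user, tenant),
        "hdfs_dir": tenant_directory(tenant),
    }


if __name__ == "__main__":
    url = "https://goodloans.loanscore.com"
    print(handle_request(url))          # steps 1-2: not yet authenticated
    print(handle_request(url, "bob"))   # steps 4-6: after a successful login
```

The same username arriving at https://unwise.loanscore.com resolves to bob_unwise and a different directory, which is what keeps the two tenants apart.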
Demo
Q&A