An Approach for Multi-tenant Applications with Apache Knox. Larry McCay, Architect and Manager for Security Infra - Hortonworks. Sumit Gupta, Technical Lead for Knox - Hortonworks. April 5th 2017 – DataWorks Summit Munich
An Approach for Multi-Tenancy Through Apache Knox

Jan 22, 2018

Transcript
Page 1: An Approach for Multi-Tenancy Through Apache Knox

An Approach for Multi-tenant Applications with Apache Knox

Larry McCay
Architect and Manager for Security Infra - Hortonworks

Sumit Gupta
Technical Lead for Knox - Hortonworks

April 5th 2017 – DataWorks Summit Munich

Page 2: An Approach for Multi-Tenancy Through Apache Knox

2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved

Disclaimer

This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately never be developed.

Product capabilities are based on information that is publicly available within the Apache Software Foundation websites (“Apache”). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery.

This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.

Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Since this document may contain an outline of general product development plans, customers should not rely upon it when making purchasing decisions.

Page 3: An Approach for Multi-Tenancy Through Apache Knox

Agenda

Apache Knox Overview

Topologies

Identity Assertion and Authorization

Multi-tenant Applications
– What are they?
– What are the concerns?

Loanscore SaaS Application
– Overview, Requirements, Design

Loanscore via Knox
– Design

Demo

Q&A

Page 4: An Approach for Multi-Tenancy Through Apache Knox

Apache Knox

Page 5: An Approach for Multi-Tenancy Through Apache Knox

Apache Knox History and Community Growth

Mar 2013: Entered Incubator
Oct 2013: 0.1.0 - 0.3.0 Incubator Releases
Feb 2014: Graduates to Apache TLP
Apr 2014: 0.4.0 TLP Release
Nov 2014: 0.5.0
May 2015: 0.6.0
Dec 2015: 0.7.0
Feb 2016: 0.8.0
Apr/Aug 2016: 0.9.0/0.9.1
Nov 2016: 0.10.0
Dec 2016: 0.11.0
Mar 2017: 0.12.0
TBD: 1.0.0 Target Release Date

• Committers: 17

• Contributors from: Hortonworks, IBM, CGI, Uber, Oracle, Blue Talon

Apache 0.12.0/HDP 2.6

• Client SDK/DSL Improvements

• Apache Zeppelin Proxying

• YARN RM UI HA Support

• Knox Token Service

• Solr API and UI

Apache 0.11.0

• LDAP Improvements

• Hadoop Group Lookup Support

• Phoenix Server Support (Avatica)

• Management UI

• Metrics

@apache_knox

Page 6: An Approach for Multi-Tenancy Through Apache Knox

Apache Knox Overview

Proxying Services

The primary goal of the Apache Knox project is to provide access to Apache Hadoop via proxying of HTTP resources.

Authentication Services

Authentication for REST API access as well as a WebSSO flow for UIs. LDAP/AD, header based PreAuth, Kerberos, SAML and OAuth are all available options.

Client DSL/SDK Services

Client development can be done by scripting with the DSL or by using the KnoxShell classes directly as an SDK.

[Diagram: Apache Knox services]

Authentication Services: WebSSO; authentication and federation providers (SAML, OAuth, LDAP/AD, SPNEGO, header based); KnoxSSO/Token

Proxying Services: HTTP proxying of UIs, REST APIs and WebSockets for Hive, Ambari, HBase, WebHCat, WebHDFS, Hadoop UIs, YARN, YARN RM, Ranger, Zeppelin, Oozie, Phoenix, Gremlin, SQL/DB

Client DSL/SDK Services: Groovy based DSL; KnoxShell SDK; token sessions; REST API classes

Page 7: An Approach for Multi-Tenancy Through Apache Knox

Knox Topologies

Which services to proxy
– For instance: Hive, WebHDFS, WebHCat, HBase, etc.

Unique URLs per topology
– For instance: https://localhost:8443/gateway/TOPOLOGY/webhdfs/v1

Separate Hadoop clusters
– For example: dev.xml and prod.xml

Different access requirements for the same cluster (through providers)
– token.xml and basic.xml

Tenant specific access to the Knox services
– acme1.xml and acme2.xml
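
As a rough illustration of "unique URLs per topology", the sketch below builds the service URL for a given topology name, following the pattern shown on this slide. The helper name topology_url is hypothetical and not part of Knox. It only shows how the topology name (the file name without .xml) becomes a path segment of the gateway URL.

```python
from urllib.parse import quote


def topology_url(gateway_base: str, topology: str, service_path: str) -> str:
    """Join a gateway base URL, a topology name and a service path.

    Hypothetical helper for illustration only; Knox itself derives the
    topology segment from the name of the deployed topology file.
    """
    base = gateway_base.rstrip("/")
    path = service_path.strip("/")
    return f"{base}/{quote(topology, safe='')}/{path}"


# Gateway address taken from the slide; one topology per tenant.
GATEWAY = "https://localhost:8443/gateway"
for name in ("acme1", "acme2"):
    print(topology_url(GATEWAY, name, "webhdfs/v1"))
```

Each tenant therefore reaches the same backing service through its own URL, and each URL is governed by the providers configured in that tenant's topology.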

Page 8: An Approach for Multi-Tenancy Through Apache Knox

Identity Assertion and Authorization

Establish the effective identity

Can alter the effective identity through:
– Principal mapping
– Regular expressions
– Concatenation of prefixes, suffixes

Establishes security context for service level authorization checks through:
– The principal and group mapping or transforms described above
– Group lookup

Service Level Authorization for the effective user
– Simple ACL based authorization provider
– Ranger Knox plugin
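
The transforms listed above can be sketched in a few lines. This is a conceptual model only, not the Knox provider code: the function names, the e-mail style input and the ACL shape are all assumptions made for illustration.

```python
import re
from typing import Iterable, Set


def concat(principal: str, prefix: str = "", suffix: str = "") -> str:
    """Concatenation transform: add a prefix and/or suffix to the principal."""
    return f"{prefix}{principal}{suffix}"


def regex_map(principal: str, pattern: str, template: str) -> str:
    """Regular expression transform: rewrite the principal if it matches."""
    match = re.fullmatch(pattern, principal)
    return match.expand(template) if match else principal


def is_authorized(user: str, groups: Iterable[str],
                  acl_users: Set[str], acl_groups: Set[str]) -> bool:
    """Simple ACL check: allow if the user or any of its groups is listed."""
    return user in acl_users or bool(set(groups) & acl_groups)


# Suffix concatenation, as in the bob -> bob_goodloans example of this talk.
print(concat("bob", suffix="_goodloans"))

# Hypothetical regex mapping from an e-mail style login to the same identity.
print(regex_map("bob@goodloans.com", r"(.*)@goodloans\.com", r"\g<1>_goodloans"))

# The effective identity, not the raw login, is what the ACL sees.
print(is_authorized("bob_goodloans", ["originators"], set(), {"originators"}))
```

The point of the ordering is that authorization runs against the effective identity, so policy can be written per tenant rather than per raw username.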

Page 9: An Approach for Multi-Tenancy Through Apache Knox

Multi-tenant Applications

Page 10: An Approach for Multi-Tenancy Through Apache Knox

What is a Multi-tenant Application?

Shared Infrastructure

– Deployment

– Application

– Data

Account Context

– Users have accounts within an Organization’s Account

– Each organization is a tenant

Page 11: An Approach for Multi-Tenancy Through Apache Knox

Multi-tenancy Concerns

Data Protection

– Tenants cannot view, modify or delete each other’s data

– Tenant admins may only affect tenant specific settings

– Application admins cannot access tenant data

Authentication

– Users authenticate using their typical or chosen usernames

– Security context must include tenant membership (username ‘bob’ is too ambiguous)

– Only authenticated and authorized users may access the system

– Authentication provider flexibility

• Application managed providers

• Tenant specific provider integrations

Page 12: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS Application

Page 13: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS Application

Continually improve risk assessment with a central risk model, analytics and machine learning with tenant specific thresholds

Application Provides

– machine learning capabilities

– models for scoring risk

– small businesses and individuals can be scored

– configurable data sources (e.g. Yelp)

Tenants are Lending Institutions

– Users are employees of the lending institution (e.g. an originator)

– Tenant specific authentication integrations

– Tenants have their own configuration/settings

– Tenants get their own sub-domain and branding

Page 14: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS Application

[Diagram: Loanscore SaaS]

– Loan Scoring Business Logic and Branding

– Tenant Specific Authentication (Login form, LDAP, SAML, etc.)

– User Disambiguation for access to Hadoop (bob -> bob_goodloans)

– Hadoop Access (Kerberos + doas)

– External identity providers: SAML IdP, Corp AD/LDAP

Authentication

The application must account for authentication configuration per tenant. This covers different LDAP search bases within a shared LDAP, tenant specific LDAP servers, or IdP integrations.

Business Logic of the App

The business logic and branding of the application for each tenant.

User Disambiguation

The effective security context for backend interactions must contain the tenant affiliation for authorization policy to be enforced properly.

Hadoop Access Patterns

REST API calls to Hadoop services generally require kerberos+doas for secure clusters.
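
To make the kerberos+doas pattern concrete, here is a minimal sketch of the request URL an application would build when calling WebHDFS directly on behalf of a tenant user. WebHDFS accepts a doas query parameter for proxy users. The host name, port and HDFS path below are assumptions, and the Kerberos (SPNEGO) handshake that must accompany the request on a secure cluster is not shown.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base: str, hdfs_path: str, op: str, doas: str) -> str:
    """Build a WebHDFS REST URL that impersonates `doas`.

    `hdfs_path` is expected to be absolute (start with "/"). The caller
    still has to authenticate as a trusted proxy user; this only builds
    the URL.
    """
    query = urlencode({"op": op, "doas": doas})
    return f"{base.rstrip('/')}/webhdfs/v1{quote(hdfs_path)}?{query}"


# Hypothetical NameNode address and tenant directory.
print(webhdfs_url("http://namenode.example.com:50070",
                  "/tenants/goodloans", "LISTSTATUS", "bob_goodloans"))
```

Without a gateway, every application has to carry this logic itself, together with the keytab handling and the per-tenant user mapping.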

Page 15: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS Application v2.0

[Diagram: Loanscore SaaS v2.0]

– Loan Scoring Business Logic and Branding

– Tenant Specific Authentication (Login form, LDAP, SAML, etc.)

– User Disambiguation for access to Hadoop (bob -> bob_goodloans)

– Hadoop Access (Kerberos + doas)

– External identity providers: SAML IdP, Corp AD/LDAP

Page 16: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS with Knox Services

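[Diagram: Loanscore SaaS with Knox Services]

– Loanscore SaaS: Business Logic, Branding, Knox Client SDK

– Knox Authentication and Proxying Services: Tenant Specific Authentication, User Disambiguation (identity assertion), KnoxSSO, Proxying

– External identity providers: SAML IdP, Corp AD/LDAP

– Hadoop access from Knox: kerberos+doas or simple auth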

Proxying Services

By proxying the app through Apache Knox, the gateway is able to require authentication before the user reaches the actual application. Hadoop API access is also proxied through Knox, and the dispatch within the gateway handles the kerberos+doas and user disambiguation requirements.

Authentication Services

The authentication or federation provider within the proxying topology for the tenant may contain the actual authentication configuration or may redirect to KnoxSSO for a WebSSO flow.

Client SDK

The backend of the application may consume Hadoop REST APIs via the KnoxShell client classes.

Page 17: An Approach for Multi-Tenancy Through Apache Knox

Loanscore SaaS with Knox Services

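[Diagram: Loanscore via Knox]

– Tenant URLs: https://goodloans.loanscore.com and https://unwise.loanscore.com

– Login: username: bob, password: ***

– KnoxSSO topology: goodloans-sso.xml (SAML IdP, Corp AD/LDAP)

– Proxying topology: goodloans.xml (user disambiguation)

– Loanscore Application, with Hadoop access as doas=bob_goodloans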

1. Goodloans originator bob navigates to the goodloans loanscore app URL

2. Since he has yet to authenticate, he is redirected to the KnoxSSO topology for goodloans

3. He is authenticated against the identity provider configured for goodloans. He provides his username and password (bob:***)

4. Upon successful authentication he is redirected back to the loanscore application and granted access

5. The user principal propagated to the loanscore app has been disambiguated by adding the tenant name to the end of the username (bob_goodloans) in the identity assertion provider

6. Loanscore app adds a file to a tenant specific directory within HDFS using KnoxShell SDK classes
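
The six steps can be modelled end to end in a few lines. This is a sketch of the data flow only, with no real authentication or HDFS call. The topology names follow the goodloans-sso.xml and goodloans.xml files on this slide, while the /tenants root directory, the class name and the function names are assumptions.

```python
from dataclasses import dataclass

APP_DOMAIN = "loanscore.com"


@dataclass(frozen=True)
class TenantContext:
    tenant: str
    sso_topology: str    # KnoxSSO topology used for login (steps 2-3)
    proxy_topology: str  # proxying topology with identity assertion (step 5)


def context_for(host: str) -> TenantContext:
    """Derive the tenant and its topologies from the sub-domain (step 1)."""
    suffix = "." + APP_DOMAIN
    tenant = host[: -len(suffix)] if host.endswith(suffix) else ""
    if not tenant or "." in tenant:
        raise ValueError(f"not a tenant host: {host!r}")
    return TenantContext(tenant, f"{tenant}-sso", tenant)


def effective_user(username: str, ctx: TenantContext) -> str:
    """Disambiguate the login by appending the tenant name (step 5)."""
    return f"{username}_{ctx.tenant}"


def tenant_directory(ctx: TenantContext, root: str = "/tenants") -> str:
    """Hypothetical tenant specific HDFS directory for uploads (step 6)."""
    return f"{root}/{ctx.tenant}"


ctx = context_for("goodloans.loanscore.com")
print(ctx.sso_topology, effective_user("bob", ctx), tenant_directory(ctx))
```

The same login name on https://unwise.loanscore.com resolves to a different effective user, which is what lets authorization policy keep the two tenants apart.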

Page 18: An Approach for Multi-Tenancy Through Apache Knox

Demo

Page 19: An Approach for Multi-Tenancy Through Apache Knox

Q&A