1 © Hortonworks Inc. 2011 – 2017 All Rights Reserved Hortonworks Confidential. For Internal Use Only. AUTOMATIC DETECTION, CLASSIFICATION, AND AUTHORIZATION OF SENSITIVE PERSONAL DATA IMPACTED BY GDPR Srikanth Venkat – Senior Director, Product Management, Hortonworks Subra Ramesh – VP, Products & Engineering, Dataguise
Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Jan 22, 2018

Transcript
Page 1: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

1 © Hortonworks Inc. 2011 – 2017 All Rights Reserved Hortonworks Confidential. For Internal Use Only.

AUTOMATIC DETECTION, CLASSIFICATION, AND AUTHORIZATION OF SENSITIVE PERSONAL DATA IMPACTED BY GDPR

Srikanth Venkat – Senior Director, Product Management, Hortonworks

Subra Ramesh – VP, Products & Engineering, Dataguise

Page 2: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

2© Hortonworks Inc. , and Dataguise Inc. 2011 – 2017. All Rights Reserved

Agenda

GDPR Overview

GDPR Personal Data – what it requires

GDPR – Controller vs. Processor Requirements

Addressing GDPR requirements

– DgSecure: Detection, Element-level Protection, Monitoring

– Hortonworks HDP: Apache Ranger (Security & Privacy) and Apache Atlas (Data Inventory/Classification)

Integration of DgSecure Detection with Atlas-Ranger for Automatic Authorization Control over GDPR Personal Data

Demo

Page 3: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

General Data Protection Regulation

Framework for the digital transformation economy

– Data = business asset, new currency, innovation accelerator

– Personal data leveraged throughout connected ecosystems

GDPR harmonizes and extends EU Data Protection Directive 95/46/EC

Expands the definition of protected data

Expands data subject rights

Page 4: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Overview of GDPR Framework

[Diagram: the GDPR framework. Actors shown: Data Protection Authority (supervising authority), Data Controller (organisations), Data Subject (individuals), Data Processor, Third Countries, Third Parties, European Data Protection Board (consistency mechanism), EU Courts, National Courts. Relationship labels: Duties, Rights, Inform?, Disclosure?, Is Data Handling Secure?, Guarantees?, Advisory and Enforcement, Complaint/Resolution.]

Page 5: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

GDPR Data Privacy

Sources:

1. ec.europa.eu/justice/data-protection/reform/files/regulation_oj_en.pdf

2. http://www.consilium.europa.eu/en/infographics/data-protection-regulation-infographics/

Page 6: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Rights & Obligations under GDPR

Controller Obligations

– Clear Consent

– Clear Detailed Privacy notices

– Breach Notification (72 hours)

– Appointment of Data Protection Officer (250+, or high risk processing)

– Privacy by Design & Other considerations

―Lawful basis, Fair processing, & Specify Purposes

―Adequate, relevant, not excessive

―Data Accuracy, Retention, and Appropriate Security

– International Transfer adequacy
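
The 72-hour breach notification window listed above is simple date arithmetic. The sketch below is only an illustration: the function name and the sample timestamp are hypothetical, and it assumes the clock starts when the controller becomes aware of the breach.

```python
from datetime import datetime, timedelta, timezone

# Notification window from the slide: 72 hours.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at):
    """Return the latest notification time, given when the breach became known."""
    return aware_at + NOTIFICATION_WINDOW


# Hypothetical moment at which the controller became aware of a breach.
aware_at = datetime(2018, 5, 28, 9, 30, tzinfo=timezone.utc)
deadline = notification_deadline(aware_at)
print(deadline.isoformat())  # 2018-05-31T09:30:00+00:00
```

Using timezone-aware timestamps avoids ambiguity when teams in several countries handle the same incident.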

Page 7: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Rights & Obligations under GDPR

Individual Rights

– Access to data

– Remedy from supervisory body/court

―Compensation for Damage

―Compensation for Distress

―Rectification

– Objection (for direct marketing)

– Erasure (right to be forgotten)

– Data Portability

– Restrict data processing (put on hold)

– Automated decisions and profiling

Page 8: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Broad Scope of GDPR

NOT ONLY data controllers or processors that are within the European Union

BUT ALSO

– ANY processing of ANY personal data of individuals in the EU, when the processing relates to offering them goods or services or to monitoring their behavior within the EU

Source: ec.europa.eu/justice/data-protection/reform/files/regulation_oj_en.pdf

Page 9: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Apache Ranger: Comprehensive Extensible Authorization

⬢ Comprehensive coverage across Hadoop ecosystem components

⬢ Plugins reside within each component they protect

⬢ Extensible plugin model: plugins for authorizing other sources can be built

Page 10: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Apache Ranger - Intuitive and Granular Policy Management

⬢ Simple Intuitive UI for Policy Editing and Setup

⬢ Fine-grained specificity by resource type, user context, tags, and operation

⬢ Supports Access, Tag-Based, Dynamic Data Masking, and Row Filtering Policy Types
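
To make the tag-based policy type more concrete, here is a minimal sketch that builds a policy document in the general shape of Ranger's policy JSON. The service name `cl1_tag`, the `PII` tag, and the `gdpr_officers` group are hypothetical, and the field names are an assumption to verify against the Ranger release in use. Nothing is sent to a server.

```python
import json


def build_tag_policy(service, tag, groups, accesses):
    """Return a dict shaped like a Ranger tag-based access policy.

    Assumption: field names follow Ranger's policy JSON model;
    verify them against your Ranger release before use.
    """
    return {
        "service": service,
        "name": f"{tag}-access",
        "policyType": 0,  # 0 = access policy in Ranger's policy model
        "isEnabled": True,
        "resources": {
            "tag": {"values": [tag], "isExcludes": False, "isRecursive": False}
        },
        "policyItems": [
            {
                "groups": list(groups),
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }


policy = build_tag_policy(
    service="cl1_tag",          # hypothetical tag service name
    tag="PII",                  # hypothetical classification
    groups=["gdpr_officers"],   # hypothetical group
    accesses=["hive:select"],   # component-qualified access type
)
print(json.dumps(policy, indent=2))
```

Because the policy is keyed on the tag rather than on a table or path, it applies to any resource that later receives the same tag.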

Page 11: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Apache Ranger Audits - Data Access

⬢ Comprehensive scalable audit logging

⬢ Audits for:

⬢ Resource Access Events with user context

⬢ Policy Edits/Creation/Deletion

⬢ User session information

⬢ Component plugin policy sync operations
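
One way resource access audits can feed monitoring is to filter them for denied attempts on sensitive data. The sketch below uses hypothetical in-memory records; the field names (`user`, `resource`, `allowed`, `tags`) are illustrative and are not Ranger's audit schema.

```python
from collections import Counter

# Hypothetical audit events; real Ranger audit records use their own schema.
events = [
    {"user": "alice", "resource": "hr.employees", "allowed": True, "tags": ["PII"]},
    {"user": "bob", "resource": "hr.employees", "allowed": False, "tags": ["PII"]},
    {"user": "bob", "resource": "web.clicks", "allowed": True, "tags": []},
    {"user": "bob", "resource": "hr.salaries", "allowed": False, "tags": ["PII"]},
]


def denied_sensitive_access(events, sensitive_tags):
    """Yield events where access to a sensitively tagged resource was denied."""
    for event in events:
        if not event["allowed"] and sensitive_tags.intersection(event["tags"]):
            yield event


denied = list(denied_sensitive_access(events, {"PII"}))
per_user = Counter(event["user"] for event in denied)
print(per_user)  # Counter({'bob': 2})
```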

Page 12: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Atlas: Metadata Truth in Hadoop

[Diagram labels: Atlas Metadata at the center; Structured (Traditional RDBMS, MPP Appliances), Metadata, Kafka, Storm, Sqoop, Hive, Falcon, Ranger, Custom, Partners.]

Metadata-driven governance services for Hadoop and enterprise big data ecosystems

Data Lineage/Provenance

– Along the entire data lifecycle, with integrated cross-component lineage

Data Classification

– Supports classification of data assets using tags (e.g. PII, PHI, PCI) and attributes

Metadata Catalog Search

– Free text search on metadata

– Advanced search using DSL

Integrations across the Hadoop ecosystem, through a common metadata store

– OOtB real-time metadata and lineage ingestion with Hive, Sqoop, Storm/Kafka

– APIs for custom metadata ingestion

– Apache Ranger integration for classification-based security

Key Benefits:

Modern Data Lakes need new ways to govern because:

• Cost – Traditional staff ratio to data size not possible

• Diversity – Only way to manage velocity of new datasets

• Agility – Quick change based on tags / taxonomy
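
As an illustration of classification by tags, the sketch below builds two request bodies: one defining a classification type and one attaching it to an entity. The shapes are an assumption modeled on Atlas's v2 REST JSON, and the `PII` name and description are hypothetical. The exact endpoints and fields should be checked in the Atlas REST documentation; no request is made here.

```python
import json


def classification_typedef(name, description):
    """Body for creating a classification type (assumed Atlas v2 typedefs shape)."""
    return {
        "classificationDefs": [
            {
                "name": name,
                "description": description,
                "superTypes": [],
                "attributeDefs": [],
            }
        ]
    }


def classification_assignment(name):
    """Body for attaching a classification to one entity (assumed v2 shape)."""
    return [{"typeName": name}]


typedef_body = classification_typedef("PII", "Personally identifiable information")
assign_body = classification_assignment("PII")
print(json.dumps(typedef_body, indent=2))
print(json.dumps(assign_body))
```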

Page 13: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

HDP – Security & Governance

Industry First: Dynamic Tag-based Security Policies

[Diagram: Atlas tracks metadata and lineage for entities in the data lake (Hive tables, HDFS files, HBase tables; streams, pipelines, feeds) and holds tags, assets, and entities in its metastore. Ranger manages access policies and audit logs; an Atlas client subscribes to a topic and gets metadata updates. Ranger components shown: PDP, Resource Cache. Policy types shown: Classification, Prohibition, Time, Location.]

Page 14: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Dataguise: Company Background

[Timeline, 2007 to 2017]

2007-2010: “Breakthrough” Masking Technology

2011-2013: Pioneers of Hadoop Data Protection

2014: The “Essential” Solution for Data Protection in Hadoop

2015: Magic Quadrant “Visionary” for Data Masking

2015: Recommended for Data-Centric Security

2015: Recommended for Protecting Big Data in Hadoop

2016: Cloud Platform Coverage

2017: Gartner Market Guide for Data Masking

Page 15: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DGSECURE PRODUCT

Page 16: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure Operation Sequence

1. Define the Policy

2. Discover the Sensitive Data

3. Secure Data

4. Monitor and Reporting

Page 17: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Visualization: Enterprise-wide Data Security Posture

Page 18: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Enable Access Control based on Sensitivity Classification

Set up DgSecure to run on a periodic basis to scan for sensitive data and generate classification information

– DgSecure will continuously update Atlas with tags as and when it finds sensitive information.

Set up Ranger Policies based on Sensitive Tags

Ranger Policies will kick in at the time any user tries to access the data, for example, in a Hive Query
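
The three steps above can be sketched end to end in a few lines. This is a toy, in-memory simulation under stated assumptions: the regular expressions stand in for DgSecure detection, a dictionary stands in for tags held in Atlas, and another dictionary stands in for Ranger tag-based policies. All names, patterns, and groups are hypothetical, and none of this reflects how the products are implemented.

```python
import re

# Illustrative patterns only; a production detector needs far more than this.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def scan_columns(table):
    """Return {column: set of tags} for columns with values matching a pattern."""
    tags = {}
    for column, values in table.items():
        found = {tag for tag, rx in PATTERNS.items() for v in values if rx.fullmatch(v)}
        if found:
            tags[column] = found
    return tags


def is_allowed(user_groups, column, tag_store, policies):
    """Deny if the column carries any tag that none of the user's groups is granted."""
    for tag in tag_store.get(column, ()):
        if not user_groups & policies.get(tag, set()):
            return False
    return True


table = {
    "email": ["ann@example.com", "raj@example.org"],
    "phone": ["+44 20 7946 0958"],
    "country": ["DE", "FR"],
}
tag_store = scan_columns(table)  # stand-in for tags held in Atlas
policies = {"EMAIL": {"privacy_team"}, "PHONE": {"privacy_team"}}  # stand-in for Ranger

print(is_allowed({"analysts"}, "email", tag_store, policies))      # False
print(is_allowed({"analysts"}, "country", tag_store, policies))    # True
print(is_allowed({"privacy_team"}, "phone", tag_store, policies))  # True
```

The point of the design is that the policy never names a column: a newly scanned column that receives the `EMAIL` tag is covered without any policy change.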

Page 19: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure – Atlas/Ranger Integration Flow

1. DgSecure Detection

2. Atlas Populated with Sensitivity Tags

3. Ranger Policies based on tags

4. Access Control based on Sensitivity

Page 20: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure Integration with Atlas/Ranger

[Diagram: DgSecure, with its DgSecure Repository, runs Detection against the data store (Hadoop, Hive, S3, Blob Storage) and supplies Atlas Tags to Atlas; Ranger applies ACL Enforcement on the data store (Hadoop, Hive, S3, Blob Storage).]

Page 21: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Demo – DgSecure + Atlas/Ranger

Page 22: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Key Takeaways: DgSecure + HDP can help with GDPR

Detection of Sensitive Data

– Structured, Unstructured Data, Context Information used, Machine Learning capabilities

Protection of Sensitive Data at Element Level

– Masking or Encryption options in Hadoop

– At Rest Protection (Masking or Encryption)

Monitoring – Raise Alerts on (Attempted) Access to Sensitive Data

– Breach Notification Requirement

Access Control Integration

– Via Atlas/Ranger integration, Ranger Tag-Based Policies

Reporting – Visualization of Enterprise-Level Data Exposure
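
Element-level protection means transforming individual field values rather than whole files. The sketch below shows two simple options, partial masking and a salted one-way token. Both functions, the salt, and the sample record are hypothetical illustrations, not DgSecure's algorithms, and a salted hash is a simplified stand-in rather than encryption.

```python
import hashlib
from functools import partial


def mask_value(value, keep_last=4, fill="*"):
    """Replace all but the last `keep_last` characters with `fill`."""
    if len(value) <= keep_last:
        return fill * len(value)
    return fill * (len(value) - keep_last) + value[-keep_last:]


def pseudonymize(value, salt):
    """Deterministic one-way token, so equal inputs still map to equal outputs."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


record = {"name": "Ann Lee", "iban": "DE89370400440532013000", "country": "DE"}

# Hypothetical per-field choice of protection; untagged fields pass through.
protectors = {
    "name": partial(pseudonymize, salt="demo-salt"),
    "iban": mask_value,
}
protected = {key: protectors.get(key, lambda v: v)(value) for key, value in record.items()}

print(protected["iban"])     # ******************3000
print(protected["country"])  # DE
```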

Page 23: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Thank You

Page 24: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure Policy

Page 25: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure Hive Task

Page 26: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

DgSecure Detection Results (Hive)

Page 27: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Sensitive Data Tags in Atlas

Page 28: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

Ranger Tag-Based Policies

Page 29: Automatic Detection, Classification and Authorization of Sensitive Personal Data Impacted By GDPR

For more information, check out other relevant sessions:

Apache Atlas: Governance for your data, 4:10p, Wednesday, April 5th 2017

Bridle Your Flying Islands And Castles In The Sky: Built-in Governance And Security For The Cloud, 11:30am, Thursday, April 6th 2017

BoF sessions – Security and Governance, 5:50p, Thursday, April 6th 2017

Hortonworks

www.hortonworks.com

Dataguise

www.dataguise.com