Database Access Control & Privacy: Is There A Common Ground? Surajit Chaudhuri, Raghav Kaushik and Ravi Ramamurthy Microsoft Research

Feb 24, 2016

Transcript
Page 1: Database Access Control  &  Privacy: Is There A Common Ground?

Database Access Control & Privacy: Is There A Common Ground?

Surajit Chaudhuri, Raghav Kaushik and Ravi Ramamurthy
Microsoft Research

Page 2: Database Access Control  &  Privacy: Is There A Common Ground?

Data Privacy

Databases Have Sensitive Information
- Health care database: Patient PII, disease information
- Sales database: Customer PII
- Employee database: Employee level, salary

Data analysis carries the risk of privacy breach [FTDB 2009]
- Latanya Sweeney's identification of the governor of MA from medical records
- AOL search logs
- Netflix prize dataset

Focus of this paper:
- What is the implication of data privacy concerns on the DBMS?
- Do we need any more than access control?

Page 3: Database Access Control  &  Privacy: Is There A Common Ground?

Data Publishing

Patients [FTDB2009]

Name   Age  Gender  Zipcode  Disease
Ann    28   F       13068    Heart disease
Bob    21   M       13068    Flu
Carol  24   F       13068    Viral disease
…      …    …       …        …

Patients-Anonymized

Age      Gender  Zipcode  Disease
[20-29]  F       1****    Heart disease
[20-29]  M       1****    Flu
[20-29]  F       1****    Viral disease
…        …       …        …

Queries: Q1, ..., Qn

Techniques: K-Anonymity, L-Diversity, T-Closeness
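The generalization shown in the Patients-Anonymized table can be sketched in a few lines. This is a minimal illustration, not the algorithm from [FTDB2009]: the function names `generalize` and `is_k_anonymous` are made up for this example, and the only rules applied are the two visible on the slide (age bucketed into decades, zipcode truncated to its first digit).

```python
from collections import Counter

def generalize(record):
    """Coarsen quasi-identifiers the way the Patients-Anonymized table does."""
    low = (record["age"] // 10) * 10
    zipcode = record["zipcode"]
    return {
        "age": f"[{low}-{low + 9}]",                    # 28 -> "[20-29]"
        "gender": record["gender"],
        "zipcode": zipcode[0] + "*" * (len(zipcode) - 1),  # "13068" -> "1****"
        "disease": record["disease"],                   # sensitive value kept; name dropped
    }

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# The three rows visible on the slide.
patients = [
    {"name": "Ann", "age": 28, "gender": "F", "zipcode": "13068", "disease": "Heart disease"},
    {"name": "Bob", "age": 21, "gender": "M", "zipcode": "13068", "disease": "Flu"},
    {"name": "Carol", "age": 24, "gender": "F", "zipcode": "13068", "disease": "Viral disease"},
]

anonymized = [generalize(p) for p in patients]
```

On these three rows alone, the result is 3-anonymous over (age, zipcode) but not 2-anonymous once gender is included, because Bob is the only male record; the elided rows of the slide's table would decide the real answer.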

Page 4: Database Access Control  &  Privacy: Is There A Common Ground?

Privacy-Aware Query Answering

(Same Patients [FTDB2009] and Patients-Anonymized tables as on Page 3.)

Queries: Q1, ..., Qn

Techniques: Differential Privacy, Privacy-Preserving OLAP

Page 5: Database Access Control  &  Privacy: Is There A Common Ground?

Data Publishing Vs Query Answering

- Jury is still out
- Data Publishing
  - No impact on DBMS
  - De-identification algorithms over published data are getting increasingly sophisticated
- Need to take a hard look at the query answering paradigm
  - Potential implications for DBMS
- "An interactive, query-based approach is generally superior from the privacy perspective to the 'release-and-forget' approach" [CACM'10]

Page 6: Database Access Control  &  Privacy: Is There A Common Ground?

Is "Privacy-Aware" = (Fine-Grained) Access Control (FGA)?

- Every user is allowed to view only a subset of the data (authorization view)
- Subset defined using a predicate
- Queries are (logically) rewritten to go against the subset

  Select *
  From Patients
  Where Patients.Physician = userID()
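The logical rewriting against an authorization view can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: SQLite has no `userID()` function, so a bound parameter stands in for it; the helper name `run_with_fga`, the naive string rewrite, and the drug values in the Bob and Carol rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (Name TEXT, Disease TEXT, Drug TEXT, Physician TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Heart disease", "Lipitor", "Grey"),
        ("Bob", "Flu", "Tamiflu", "Stevens"),          # hypothetical row
        ("Carol", "Viral disease", "Tamiflu", "Grey"),  # hypothetical row
    ],
)

# Authorization view for Patients; the "?" parameter stands in for userID().
AUTH_VIEW = "(SELECT * FROM Patients WHERE Physician = ?) AS Patients"

def run_with_fga(conn, user_id, query):
    """Rewrite the query to read the user's authorization view, then run it.

    Assumes the query mentions the table exactly once, as "FROM Patients";
    a real DBMS would rewrite the parsed query, not the SQL text.
    """
    rewritten = query.replace("FROM Patients", "FROM " + AUTH_VIEW, 1)
    return conn.execute(rewritten, (user_id,)).fetchall()

rows_for_grey = run_with_fga(conn, "Grey", "SELECT Name FROM Patients ORDER BY Name")
```

Here physician Grey gets back only Ann and Carol, and a physician with no patients gets an empty result; the user's query text never has to mention the predicate.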

Page 7: Database Access Control  &  Privacy: Is There A Common Ground?

Is "Privacy-Aware" = (Fine-Grained) Access Control (FGA)?

- Every user is allowed to view only a subset of the data (authorization view)
- Subset defined using a predicate
- Queries are (logically) rewritten to go against the subset

Original query:

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug) > 3
  Group by Drug

Rewritten query:

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
    and auth(Patients) and auth(Drugs)
  Group by Drug

Page 8: Database Access Control  &  Privacy: Is There A Common Ground?

Authorization is "Black and White"

Query: Count the number of cancer patients

Access control offers only the two extremes of the utility-privacy tradeoff:
- Grant access to cancer patients (return accurate count): favors utility
- Deny access to cancer patients: favors privacy

Page 9: Database Access Control  &  Privacy: Is There A Common Ground?

Beyond "Black and White": Differential Privacy [SIGMOD09]

Query: Count the number of cancer patients

- Perturb the output of the aggregate computation (requires no change in the execution engine)
- Need to set parameters: ε, budget

Baggage
- Non-deterministic
- Per-query privacy parameter
- Overall privacy budget
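Perturbing a count can be illustrated with the Laplace mechanism, a standard way to make a counting query ε-differentially private. The slide does not say which mechanism [SIGMOD09] uses, so treat this as an assumed stand-in; the function name `noisy_count` is hypothetical.

```python
import random

def noisy_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise of scale 1/epsilon.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so scale 1/epsilon is the usual choice.
    The difference of two independent Exponential(rate=epsilon) draws
    is Laplace-distributed with mean 0 and scale 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

rng = random.Random()                      # unseeded: answers differ from run to run
first = noisy_count(100, epsilon=0.5, rng=rng)
second = noisy_count(100, epsilon=0.5, rng=rng)
```

The two answers to the same query generally differ, which is the "non-deterministic" item in the baggage list; a smaller ε gives stronger privacy and noisier answers.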

Page 10: Database Access Control  &  Privacy: Is There A Common Ground?

Seeking Common Ground

- Access Control
  - Supports full generality of SQL
  - "Black and White"
- Differential Privacy Algorithms
  - A principled way to go beyond "black and white"
  - Known mechanisms do not support full generality of SQL
- Data analysis involves aggregation but also joins and sub-queries
  - Can we get the best of both worlds?
- Differential Privacy = Computation on unauthorized data
  - What is the implication on privacy guarantees?

Page 11: Database Access Control  &  Privacy: Is There A Common Ground?

What Does "Best of Both Worlds" Look Like?

FGA Policy:
- Each physician can see:
  - Records of their patients
- Analyst can see:
  - Records of drugs manufactured by their employer
  - No patient records

Patients

Name  Disease        Drug     Physician
Ann   Heart disease  Lipitor  Grey
…     …              …        …

Drugs

Drug     Company
Lipitor  Pfizer
…        …

Side-Effects

Drug     Side-Effect
Lipitor  Muscle
Lipitor  Liver
…        …

Analysts

Name         Employer
JoeAnalyst   Pfizer
JaneAnalyst  Merck
…            …

Page 12: Database Access Control  &  Privacy: Is There A Common Ground?

FGA

Patients

Name  Disease  Drug  Physician
…     …        …     Grey
…     …        …     Grey
…     …        …     Stevens
…     …        …     Stevens
…     …        …     Yang

User = Grey

Query issued:

  Select *
  From Patients

Query after rewriting:

  Select *
  From Patients
  Where Physician = userID()

Page 13: Database Access Control  &  Privacy: Is There A Common Ground?

Differential Privacy

Patients

Name  Disease        Drug  Physician
…     Heart Disease  …     …
…     Flu            …     …
…     Cancer         …     …
…     Cancer         …     …
…     AIDS           …     …

User = JaneAnalyst

Query issued:

  Select count(*)
  From Patients
  Where Disease = 'Cancer'

Query answered:

  Select count(*) + Noise
  From Patients
  Where Disease = 'Cancer'

Page 14: Database Access Control  &  Privacy: Is There A Common Ground?

Mix And Match: FGA + Differential Privacy

For each drug with more than 3 side-effects, count the number of patients who have been prescribed that drug.

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug) > 3
  Group by Drug

(Tables Patients, Drugs, Side-Effects and Analysts as on Page 11.)

Page 15: Database Access Control  &  Privacy: Is There A Common Ground?

Architecture That Will Fail To Mix And Match

[Diagram. Components: DBMS (Execution Engine, Authorization Subsystem); Differential Privacy API. Labels: Q, Policy, Results, AggQ, Result(AggQ), Result(AggQ) + Noise.]

Page 16: Database Access Control  &  Privacy: Is There A Common Ground?

Architecture That Will Fail To Mix And Match

[Diagram. Components: DBMS (Execution Engine, Authorization Subsystem); Wrapper; Differential Privacy API. Labels: Q, Policy, Results, AggQ, Result(AggQ), Result(AggQ) + Noise.]

Page 17: Database Access Control  &  Privacy: Is There A Common Ground?

Authorization-Aware Data Privacy

[Diagram. Components: DBMS (Execution Engine, Authorization Aware Privacy Subsystem). Labels: Q, Policy, Results.]

Page 18: Database Access Control  &  Privacy: Is There A Common Ground?

Query Rewriting

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug) > 3
  Group by Drug

(Tables Patients, Drugs, Side-Effects and Analysts as on Page 11.)

- Non-aggregation: Authorization
- What about aggregation?

Page 19: Database Access Control  &  Privacy: Is There A Common Ground?

Query Rewriting

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug) > 3
  Group by Drug

(Tables Patients, Drugs, Side-Effects and Analysts as on Page 11.)

Page 20: Database Access Control  &  Privacy: Is There A Common Ground?

Query Rewriting

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
    and auth(Patients) and auth(Drugs)
  Group by Drug

(Tables Patients, Drugs, Side-Effects and Analysts as on Page 11.)

Authorized Groups
- For each authorized group, find noisy count

Page 21: Database Access Control  &  Privacy: Is There A Common Ground?

Query Rewriting

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug and auth(Side-Effects)) > 3
    and auth(Patients) and auth(Drugs)
  Group by Drug

(Tables Patients, Drugs, Side-Effects and Analysts as on Page 11.)

Authorized Groups
For each authorized group, find:
(1) Noisy count on unauthorized subset
(2) Accurate count on authorized subset
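One possible reading of the two-part count per authorized group is sketched below. It is an illustration under assumptions, not the authors' rewriting: the function name `hybrid_group_counts`, the dictionary-based rows, the "OtherDrug" row, and the use of Laplace noise of scale 1/ε are all hypothetical choices.

```python
import random
from collections import defaultdict

def hybrid_group_counts(patients, authorized_groups, is_authorized, epsilon, rng):
    """For each authorized group return (accurate_count, noisy_count).

    accurate_count covers the rows the user is authorized to see;
    noisy_count is the count of the remaining rows plus Laplace noise
    of scale 1/epsilon. Groups outside authorized_groups are not reported.
    """
    accurate = defaultdict(int)
    hidden = defaultdict(int)
    for row in patients:
        drug = row["drug"]
        if drug not in authorized_groups:
            continue                      # group itself is not visible to this user
        if is_authorized(row):
            accurate[drug] += 1
        else:
            hidden[drug] += 1
    result = {}
    for drug in authorized_groups:
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        result[drug] = (accurate[drug], hidden[drug] + noise)
    return result

patients = [
    {"drug": "Lipitor", "physician": "Grey"},
    {"drug": "Lipitor", "physician": "Stevens"},
    {"drug": "Lipitor", "physician": "Yang"},
    {"drug": "OtherDrug", "physician": "Grey"},   # hypothetical drug outside the user's groups
]

# Physician Grey: own patients are counted exactly, the rest only with noise.
grey_counts = hybrid_group_counts(
    patients, {"Lipitor"}, lambda row: row["physician"] == "Grey",
    epsilon=0.5, rng=random.Random(),
)
```

For an analyst, who may see no patient records, the same function with `is_authorized` always returning `False` yields an accurate part of 0 and a purely noisy count.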

Page 22: Database Access Control  &  Privacy: Is There A Common Ground?

Class of Queries

  Select Drug, count(*)
  From Patients right outer join Drugs on Drug
  Where (Select count(*)
         From Side-Effects
         Where Drug = Drugs.Drug) > 3
  Group by Drug

Query features: foreign key join, predicate, grouping, aggregation

- Rewriting: Go to unauthorized data for final aggregation
- Principled rewriting for arbitrary SQL: open problem

Page 23: Database Access Control  &  Privacy: Is There A Common Ground?

Our Privacy Guarantee: Relative Differential Privacy

- Differential Privacy
  - Intuition: A computation is differentially private if its behavior is similar for any two databases D1 and D2 that differ in a single record
- Relative Differential Privacy
  - Intuition: A computation is differentially private relative to an authorization policy if its behavior is similar for any two databases D1 and D2 that differ in a single record and both result in the same authorization views
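The two intuitions can be written out as follows. The first statement is the standard definition of ε-differential privacy for a randomized computation M. The second is only a sketch of how the relative version might be formalized from the wording on the slide; the symbol V_P(D), for the authorization views that policy P induces on database D, is notation assumed here, and the paper's formal definition may differ.

```latex
% Standard epsilon-differential privacy:
% for all D_1, D_2 differing in a single record and all S \subseteq \mathrm{Range}(M),
\[
  \Pr[M(D_1) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D_2) \in S].
\]

% Sketch of differential privacy relative to an authorization policy P:
% the same inequality, required only for pairs D_1, D_2 that differ in a
% single record and satisfy V_P(D_1) = V_P(D_2),
\[
  V_P(D_1) = V_P(D_2)
  \;\Longrightarrow\;
  \Pr[M(D_1) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D_2) \in S].
\]
```

Under this reading, a change to a record the user is already authorized to see alters the authorization views, so such pairs of databases are exempt from the guarantee; this matches returning accurate counts on the authorized subset.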

Page 24: Database Access Control  &  Privacy: Is There A Common Ground?

Noisy View

  Create noisy view DrugCounts(Drug, PatientCnt) as
    (Select Drug, count(*)
     From Patients right outer join Drugs on Drug
     Where (Select count(*)
            From Side-Effects
            Where Drug = Drugs.Drug) > 3
     Group by Drug)

- Named
- Non-deterministic
- Rewriting is authorization aware
- Can be part of grant-revoke statements just like regular views

Page 25: Database Access Control  &  Privacy: Is There A Common Ground?

Noisy View Examples

  Select count(*)
  From Patients
  Where Disease = 'Cancer'

  Select Disease, count(*)
  From Patients
  Group by Disease

  Select Category, count(*)
  From Patients join DiseaseCategory on Disease
  Group by Category

Page 26: Database Access Control  &  Privacy: Is There A Common Ground?

Noisy View Architecture

[Diagram. Components: DBMS (Execution Engine, Authorization Aware Privacy Subsystem; Tables, Views, Noisy Views). Labels: Q, Policy, Results. Annotations: "Enforce authorization", "Rewrite as we saw before".]

  Select Drug, Side-Effect, Cnt
  From DrugCounts, Side-Effects
  Where DrugCounts.Drug = Side-Effects.Drug

Page 27: Database Access Control  &  Privacy: Is There A Common Ground?

Differential Privacy Parameters [SIGMOD09]

- Need to set parameters: ε, budget

Page 28: Database Access Control  &  Privacy: Is There A Common Ground?

Noisy View Architecture: Differential Privacy Parameters

[Diagram. Components: DBMS (Execution Engine, Authorization Aware Privacy Subsystem; Tables, Views, Noisy Views). Labels: (Q, ε); Auth. Policy, Privacy Budget; Results.]

- Fall back to access control after budget exhausted
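The interplay of a per-query ε with an overall budget, including the fallback to plain access control, might look like the sketch below. The class `PrivacyBudget` and the function `answer_count` are hypothetical names, the accounting assumes the usual sequential composition (the ε values of answered queries add up), and Laplace noise is again an assumed mechanism.

```python
import random

class PrivacyBudget:
    """Tracks how much of the overall privacy budget is left."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def try_spend(self, epsilon):
        """Deduct epsilon and return True, or return False if it no longer fits."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            return False
        self.remaining -= epsilon
        return True

def answer_count(authorized_count, unauthorized_count, epsilon, budget, rng):
    """Answer a count query given (Q, epsilon) and the user's budget.

    While budget remains: accurate count on the authorized subset plus a
    noisy count on the unauthorized subset. Once the budget is exhausted:
    fall back to access control and count authorized rows only.
    """
    if budget.try_spend(epsilon):
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        return authorized_count + unauthorized_count + noise
    return float(authorized_count)

budget = PrivacyBudget(total_epsilon=1.0)
rng = random.Random()
answers = [answer_count(3, 10, 0.5, budget, rng) for _ in range(3)]
```

With a total budget of 1.0 and ε = 0.5 per query, the first two answers are noisy estimates of 13 and the third is exactly 3.0, the count a user would get under access control alone.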

Page 29: Database Access Control  &  Privacy: Is There A Common Ground?

Conclusions and Future Work

- Noisy view based architecture to incorporate privacy-preserving query answering with access control in a DBMS
  - Based on differential privacy
  - Needs minimal changes to engine
  - Guarantee: Differential privacy relative to authorizations
  - Baggage of differential privacy
    - Non-deterministic
    - Per-query privacy parameter
    - Overall privacy budget
- Open Issues
  - Larger class of noisy views (can we support arbitrary SQL?)
  - Benchmark the privacy-utility tradeoff for complex data analysis, e.g. TPC-H, TPC-DS
  - Query Optimization
  - Integrating Access Control with other privacy models