De-Identification Tuesday 10 May 5 - 6.30 pm Level 5 Theatrette, 121 Exhibition Street Melbourne Privacy and Data Protection Week 9-13 May 2016

Apr 15, 2018

Page 1

De-Identification Tuesday 10 May 5 - 6.30 pm Level 5 Theatrette, 121 Exhibition Street Melbourne

Privacy and Data Protection Week 9-13 May 2016

Page 2

Commissioner for Privacy and Data Protection

UNCLASSIFIED

Presenters (name, role and agency)

Siu Ming Tan, Chief Methodologist and General Manager of the Methodology Division at the Australian Bureau of Statistics (ABS)

Dr. Stephen Hardy, Group Leader for Data Platform Engineering at Data61 in CSIRO

Fiona Dowsley, Chief Statistician of the Victorian Crime Statistics Agency, and Acting Director of Strategic Planning at the Department of Justice & Regulation

Greg Gough, Manager of the DataVic Access Policy, Department of Treasury and Finance

Page 3

Siu Ming Tan

Page 4

Managing Data Confidentiality and Access at the ABS

10 May 2016

Page 5

What shall I cover?

Some Terminology

Legislative requirement on maintaining confidentiality

Data Utility & Disclosure Risk

The Five Safes Framework

Page 6

Some Terminology

Privacy: Requirement to respect the private information of individuals

Confidentiality: Requirement that information, whether private or not, be stored, kept or released in such a manner that identification of who the information refers to is not possible

Anonymisation: Process to remove the direct identifiers from information (e.g. name, address, ABN).

Un-identifiable info: Information treated in such a way that re-identification is not possible

Page 7

The Census and Statistics Act, 1905

•  Every ABS officer to sign an undertaking of fidelity and secrecy (section 7)

•  Statistical information not to be disseminated in a manner likely to enable the identification of a particular person or organisation (subsection 12(2))

•  De-identification is not sufficient to meet legislative requirements

•  Release must not be likely to lead to re-identification

Page 8

Data Utility versus Disclosure Risk (I)

Data Utility

•  Ability to use the data to draw valid conclusions

Disclosure Risk

•  Spontaneous recognition

•  Matching risk

•  Higher risk for unit record than aggregated data

Protections

•  Perturbation

•  Cell suppression

•  Collapsing of categories

•  Sampling

•  Record masking

•  Substitution of values
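Two of the protections listed above, cell suppression and collapsing of categories, can be sketched on a small frequency table. This is only an illustration under stated assumptions: the threshold of 3, the `None` marker, the function names and the counts are all made up here and are not ABS practice. Real suppression usually also needs secondary suppression so that hidden cells cannot be recovered from totals.

```python
from collections import defaultdict


def collapse_categories(table, mapping):
    """Sum the counts of categories that map to the same broader category."""
    collapsed = defaultdict(int)
    for category, count in table.items():
        collapsed[mapping.get(category, category)] += count
    return dict(collapsed)


def suppress_small_cells(table, threshold=3, marker=None):
    """Replace counts from 1 to threshold - 1 with a marker (primary suppression only)."""
    return {
        category: (marker if 0 < count < threshold else count)
        for category, count in table.items()
    }


# Made-up counts by age group, for illustration only.
counts = {"15-19": 1, "20-24": 2, "25-29": 40, "30-34": 35}

suppressed = suppress_small_cells(counts)
collapsed = collapse_categories(counts, {"15-19": "15-24", "20-24": "15-24"})
```

Collapsing keeps a usable, if coarser, count where suppression would leave a gap, which is one small instance of the utility trade-off.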

Page 9

Data Utility versus Disclosure Risk (II)

•  Disclosure risk is reduced by applying more protections, but data utility is reduced as well

•  Data utility is maximised if no protection is applied, but disclosure risk is significantly increased

•  Where to draw the balance?

•  Need to think beyond just applying data protections.

Page 10

The Five Safes Framework

Safe people: Can the person be trusted to use the data appropriately?

Safe project: Is the specific use of the data appropriate?

Safe setting: How does the mode of access limit the risk of disclosure?

Safe data: How much protection is to be applied to the data?

Safe output: How much control is applied to ensure the output is non-disclosive?

A multidimensional approach to disclosure risk assessment

Page 11

Dr. Stephen Hardy

Page 12

www.csiro.au

Utility vs Privacy

Why de-identification is difficult

Dr Stephen Hardy, August 2015

Page 13

Netflix re-identification

Utility vs Privacy | Stephen Hardy

100,000,000 movie ratings

480,000 Netflix subscribers

Anonymised: Id – movie – rating – date

2005, 10% sample

“Robust De-anonymization of Large Sparse Datasets”, Narayanan and Shmatikov (2008)

Identified: Name – movie – rating – date

Page 14

Rating Uniqueness

8 ratings (2 may be wrong) and a date within 2 weeks uniquely identify 99% of the people in the Netflix database


Page 15

Mobility data

“Unique in the Crowd: The privacy bounds of human mobility”, de Montjoye, Hidalgo, Verleysen, & Blondel. (2013).


Page 16

Uniqueness of 1.5 million users

4 locations & times uniquely characterize 95% of the people in a 1.5m person mobility database
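The kind of uniqueness measurement quoted above can be sketched as follows. This is a toy illustration and not the method of the cited study: the function name `fraction_unique`, the `(cell, hour)` point format and the three made-up traces are all assumptions. For each person it picks `k` of their own points at random and checks whether anyone else's trace also contains all of them.

```python
import random


def fraction_unique(traces, k, seed=0):
    """Share of people whose k randomly chosen points match no other trace."""
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for points in traces.values():
        if len(points) < k:
            continue  # not enough points to sample k of them
        eligible += 1
        known = set(rng.sample(sorted(points), k))
        # Count every trace (including the person's own) containing all known points.
        matches = sum(1 for other in traces.values() if known <= other)
        if matches == 1:
            unique += 1
    return unique / eligible if eligible else 0.0


# Made-up traces: sets of (antenna cell, hour of day) points.
traces = {
    "a": {("cell_1", 8), ("cell_2", 9), ("cell_3", 18)},
    "b": {("cell_1", 8), ("cell_2", 9), ("cell_3", 18)},
    "c": {("cell_7", 8), ("cell_8", 12), ("cell_9", 18)},
}

share = fraction_unique(traces, k=2)
```

Here "a" and "b" share every point and so can never be singled out, while "c" is unique from any two points, so one person in three is identifiable in this toy data.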


Page 17

Utility vs Privacy

The more data that is linked together, the more unique it becomes

The more data that is linked together, the more useful it becomes

But… Because…

Page 18

Current Approaches to Anonymisation

1. Masking

2. Generalisation + grouping

•  Loses valuable information.

•  Can still be re-identified in some cases.

Original record:

First Name: John
Last Name: Smith
Email: [email protected]
Address1: 1 Smith St
Address2: Sydney, 2000
Last Travel Destination: Spain
Travel Date: January 2015

Masked record:

First Name: John
Last Name: Smith
Email:
Address1:
Address2:
Last Travel Destination: Spain
Travel Date: January 2015
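The two approaches named on this slide, masking and generalisation with grouping, can be sketched as below. It is a minimal illustration under assumptions: the field names follow the slide's example record, but the records themselves, the choice of masked fields, the choice of quasi-identifiers and all function names are hypothetical. `smallest_group` reports the size of the smallest group of records sharing the same quasi-identifier values (the k in k-anonymity).

```python
from collections import Counter

# Assumption: these are the fields blanked in the slide's masked record.
MASKED_FIELDS = ("Email", "Address1", "Address2")


def mask(record, fields=MASKED_FIELDS):
    """Blank out the listed fields and keep everything else."""
    return {key: ("" if key in fields else value) for key, value in record.items()}


def generalise_date(record):
    """Generalise a 'Month Year' travel date to the year only."""
    out = dict(record)
    out["TravelDate"] = record["TravelDate"].split()[-1]
    return out


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Made-up records, for illustration only.
records = [
    {"Email": "a@example.com", "Address1": "1 Example St", "Address2": "Sydney, 2000",
     "LastTravelDestination": "Spain", "TravelDate": "January 2015"},
    {"Email": "b@example.com", "Address1": "2 Example St", "Address2": "Sydney, 2000",
     "LastTravelDestination": "Spain", "TravelDate": "March 2015"},
    {"Email": "c@example.com", "Address1": "3 Example St", "Address2": "Perth, 6000",
     "LastTravelDestination": "Spain", "TravelDate": "March 2015"},
]

quasi = ("LastTravelDestination", "TravelDate")
masked = [mask(r) for r in records]
k_before = smallest_group(masked, quasi)
k_after = smallest_group([generalise_date(r) for r in masked], quasi)
```

After masking alone, the January traveller is still a group of one; generalising the date to the year puts all three records in one group, at the cost of losing the month. A traveller with a rare destination would still stand alone, which matches the caveat that re-identification remains possible in some cases.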


Page 19

Differential Privacy

[Diagram: tuned random noise is added to the original data to produce noisy data; the same tuned random noise is added to the data with any one person removed to produce noisy data; the two noisy results are indistinguishable!]
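One common way to realise the "tuned random noise" idea is the Laplace mechanism applied to a counting query, sketched below. The slide does not say which mechanism it has in mind, so this is an assumption, as are the function names, the value of `epsilon` and the made-up data. Adding or removing one person changes a count by at most 1, so noise with scale 1/epsilon is used.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two exponential variables."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    """Count matching records, then add Laplace noise with scale 1 / epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def is_spain(destination):
    return destination == "Spain"


# Made-up data: one travel destination per person.
destinations = ["Spain", "Peru", "Spain", "Japan", "Spain"]
rng = random.Random(0)

with_everyone = noisy_count(destinations, is_spain, epsilon=0.5, rng=rng)
without_first_person = noisy_count(destinations[1:], is_spain, epsilon=0.5, rng=rng)
```

A smaller `epsilon` means more noise, so the two answers become harder to tell apart, but each answer is also less accurate.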


Page 20

Anonalytix Privacy Technology

Creates synthetic data, with privacy level guarantees

High data granularity (unit records) for specific analyses

Anonalytix: Privacy-Safe Data Release | Roksana Boreli

Page 21

Confidential Computing

[Diagram: two sources of encrypted data feed an encrypted analysis, which produces decrypted answers]
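The flow in this diagram (encrypted data in, analysis on ciphertexts, answer decrypted at the end) can be illustrated with an additively homomorphic scheme. The slide does not name a scheme, so the toy Paillier cryptosystem below is chosen here purely for illustration. The tiny primes are an assumption and are completely insecure, and the function names are hypothetical. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # modular inverse of lam, valid for g = n + 1
    return (n, g), (lam, mu)


def encrypt(public_key, message, rng):
    n, g = public_key
    n_squared = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(g, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(public_key, private_key, ciphertext):
    n, _ = public_key
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


public_key, private_key = keygen()
rng = random.Random(1)

# Two parties each encrypt a made-up value; the analyst only sees ciphertexts.
c1 = encrypt(public_key, 120, rng)
c2 = encrypt(public_key, 45, rng)
encrypted_total = add_encrypted(public_key, c1, c2)
total = decrypt(public_key, private_key, encrypted_total)
```

Only the holder of the private key learns the total, and nobody in the middle sees the individual values. Sums must stay below `n` for the result to be correct.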


Page 22

Tradeoffs

[Chart: utility plotted against privacy for raw data, masking, k-anonymity, differential privacy and encrypted computation]

Page 23

Fiona Dowsley

Page 24

Greg Gough

Page 25

Data for Victorians

Greg Gough Manager, DataVic Access Policy

Page 26

Benefits of open data

•  Increases productivity and improves personal and business decision making.

•  Improves research outcomes.

•  Improves the efficiency and effectiveness of government.

Page 27

Economic Value

The Australian economy will grow by an extra $16 billion a year if government agencies make most of their data freely available to the public.

•  Stimulates economic activity and drives innovation and new services.

Page 28

Open data value

Burke Road level crossing

Page 29

DataVic Access Policy

•  The default obligation under the Policy is for agencies to make de‑identified datasets available.

•  If a dataset contains personally identifiable information, and cannot be de‑identified, it is not suitable for release under the Policy.

Page 30

Open by design

•  When developing or procuring a database or dataset, consideration should be given in the design phase to enabling public access to the data that is suitable for release under the Policy.

Just another way of looking at ‘Privacy by design’

Page 31

Examples of de-identified data use in Australia

Page 32

DataVic example of de-identified data use

TripRisk by Geoplex

Page 33

Queensland Police example

Page 34

Queensland Police example

Page 35

OS examples of (de)-identified data use

Expectations?

Page 36

US Police example

Page 37

NYC Taxi data

Page 38

Taxi data

Page 39

Further information

•  Websites: www.data.vic.gov.au www.dtf.vic.gov.au

•  Email: [email protected]

•  Twitter: @data_vic

•  Phone: (03) 9651 1880

Page 40

© State of Victoria 2016 You are free to re-use this work under a Creative Commons Attribution 4.0 licence, provided you credit the State of Victoria (Department of Treasury and Finance) as author, indicate if changes were made and comply with the other licence terms. The licence does not apply to any branding, including Government logos. Copyright queries may be directed to [email protected]

Page 41

Thank you