SATYAM AGARWALA
DEVELOPER
DATA ANONYMIZATION
Why do we need data?
What is data anonymization?
Why anonymize data?
How do we anonymize data?
https://github.com/sunitparekh/data-anonymization
Sunit Parekh, Satyam Agarwala
BLACKLIST
You choose which attributes to anonymize
first name | last name | address         | zipcode | handphone | birth date
Satyam     | Agarwala  | 87B Amoy Street | 069906  | 8765 4321 | 01/01/1945
WHITELIST
You choose which attributes NOT to anonymize
first name | last name | address           | zipcode | handphone | birth date
Satyam     | Woodward  | 10 Downing Street | 123456  | 8765 4321 | 01/01/1945
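
The difference between the two strategies can be sketched in a few lines of Python. This is an illustration only, not the DSL of the data-anonymization library; the function names, the `mask` strategy, and the three-field record are assumptions made for the example.

```python
from typing import Callable, Dict, Set

Record = Dict[str, str]
Strategy = Callable[[str], str]


def mask(value: str) -> str:
    """Hypothetical strategy: replace every letter or digit with 'x'."""
    return "".join("x" if ch.isalnum() else ch for ch in value)


def anonymize_blacklist(record: Record, strategies: Dict[str, Strategy]) -> Record:
    """Anonymize only the attributes named in `strategies`; copy the rest as-is."""
    return {
        name: strategies[name](value) if name in strategies else value
        for name, value in record.items()
    }


def anonymize_whitelist(record: Record, keep: Set[str], default: Strategy = mask) -> Record:
    """Copy only the attributes named in `keep`; anonymize everything else."""
    return {
        name: value if name in keep else default(value)
        for name, value in record.items()
    }


record = {"first name": "Satyam", "last name": "Agarwala", "zipcode": "069906"}

# Blacklist: only "last name" is listed, so only it changes.
blacklisted = anonymize_blacklist(record, {"last name": mask})

# Whitelist: only "first name" is listed, so everything else changes.
whitelisted = anonymize_whitelist(record, keep={"first name"})
```

One consequence of this design: under a whitelist, an attribute added to the schema later is anonymized unless someone explicitly keeps it, whereas under a blacklist it is copied unchanged until someone lists it.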
Show me!
Script (DSL, strategies, parallelization)
ORM (RDBMS, NoSQL)
source DB → destination DB
SUMMARY
GOTCHAS
FK CONSTRAINTS: Disable foreign key checks globally before running the script.
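
A minimal sketch of why this matters, using Python's built-in `sqlite3` module with an in-memory database. The `customer` and `orders` tables are hypothetical, and the sketch assumes the failure mode is a child table being copied before its parent; other databases use a different switch than SQLite's `PRAGMA foreign_keys`.

```python
import sqlite3

# Autocommit mode: SQLite ignores PRAGMA foreign_keys inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)


def copy_child_before_parent(db: sqlite3.Connection) -> None:
    """Simulate a copy that writes the child table before its parent."""
    db.execute("INSERT INTO orders (id, customer_id) VALUES (1, 42)")
    db.execute("INSERT INTO customer (id, name) VALUES (42, 'anonymized')")


# With checks enabled, the out-of-order copy is rejected.
try:
    copy_child_before_parent(conn)
    failed_with_checks = False
except sqlite3.IntegrityError:
    failed_with_checks = True

# With checks disabled for the duration of the copy, it goes through.
conn.execute("PRAGMA foreign_keys = OFF")
copy_child_before_parent(conn)
conn.execute("PRAGMA foreign_keys = ON")

# Verify afterwards that no dangling references were left behind.
violations = conn.execute("PRAGMA foreign_key_check").fetchall()
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Re-enabling the checks and running an integrity check after the copy is a cheap way to confirm that the anonymized destination is still consistent.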
UNIQUE CONSTRAINTS: Whitelist attributes that must be unique, or give them a sequential (non-random) strategy.
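
A minimal sketch of the difference, assuming a hypothetical `email` column with a unique constraint. Both helper functions are invented for illustration and are not strategies shipped with any particular tool.

```python
import random


def sequential_email(row_number: int) -> str:
    """Sequential strategy: one distinct value per row number."""
    return f"user{row_number}@example.com"


def random_email(rng: random.Random, pool_size: int = 3) -> str:
    """Random strategy drawing from a small pool; values can repeat."""
    return f"user{rng.randrange(pool_size)}@example.com"


# Four hypothetical source rows whose email column must stay unique.
rows = ["a@corp.test", "b@corp.test", "c@corp.test", "d@corp.test"]

sequential = [sequential_email(n) for n, _ in enumerate(rows, start=1)]

rng = random.Random(0)
randomized = [random_email(rng) for _ in rows]

# Sequential values never collide; four draws from a pool of three must.
sequential_is_unique = len(set(sequential)) == len(sequential)
randomized_is_unique = len(set(randomized)) == len(randomized)
```

The random pool here is deliberately tiny so the collision is guaranteed; with a larger pool collisions become rarer but remain possible, which is enough to break an insert against a unique index.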
Are there other ways to anonymize data?
FORMAL APPROACH
k-anonymity
l-diversity
t-closeness
δ-presence
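
As a rough illustration of the first two measures: a table is k-anonymous if every combination of quasi-identifier values is shared by at least k records, and (in the simplest "distinct" variant) l-diverse if every such group contains at least l different sensitive values. The sketch below measures both for a small made-up table; the column names and data are assumptions, and t-closeness and δ-presence are not covered.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence

Row = Dict[str, str]


def k_anonymity(rows: List[Row], quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values(), default=0)


def distinct_l_diversity(
    rows: List[Row], quasi_identifiers: Sequence[str], sensitive: str
) -> int:
    """Smallest number of distinct sensitive values found in any group."""
    values = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        values[key].add(row[sensitive])
    return min((len(v) for v in values.values()), default=0)


# Made-up, already generalized records (zipcode truncated, birth date bucketed).
table = [
    {"zipcode": "0699**", "birth": "1940s", "diagnosis": "flu"},
    {"zipcode": "0699**", "birth": "1940s", "diagnosis": "asthma"},
    {"zipcode": "1234**", "birth": "1980s", "diagnosis": "flu"},
    {"zipcode": "1234**", "birth": "1980s", "diagnosis": "flu"},
]

k = k_anonymity(table, ["zipcode", "birth"])
l = distinct_l_diversity(table, ["zipcode", "birth"], "diagnosis")
```

Here every group has two rows, so the table is 2-anonymous, but the second group contains a single diagnosis, so anyone known to be in that group has their diagnosis revealed. That gap is what l-diversity is meant to expose.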
ALTERNATIVE TOOLS
ARX https://github.com/arx-deidentifier/arx
THANK YOU!