Data Warehousing, Data Mining, Privacy
Data Warehousing Data Mining Privacy. Reading Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer. 2007. Extended RBAC-design and implementation.

Jan 03, 2016

Bridget Lucas
Transcript
Page 1

Data Warehousing
Data Mining
Privacy

Page 2

Reading

Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer. 2007. Extended RBAC-design and implementation for a secure data warehouse. Int. J. Bus. Intell. Data Min. 2, 4 (December 2007), 367-382. https://www.utdallas.edu/~bxt043000/Publications/Technical-Reports/UTDCS-35-07.pdf

Sweeney L, Abu A, and Winn J. Identifying Participants in the Personal Genome Project by Name. Harvard University. Data Privacy Lab. White Paper 1021-1. April 24, 2013. http://dataprivacylab.org/projects/pgp/1021-1.pdf

Farkas CSCE 824 - Spring 2015

Page 3

Data Warehousing

Repository of data providing organized and cleaned enterprise-wide data (obtained from a variety of sources) in a standardized format
  – Data mart (single subject area)
  – Enterprise data warehouse (integrated data marts)
  – Metadata

Page 4

OLAP Analysis

Aggregation functions
Factual data access
Complex criteria
Visualization
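
As a rough illustration of an aggregation function applied to factual data under a selection criterion, the sketch below rolls up a tiny fact table by one dimension. The `sales` rows, their field names, and the `roll_up` helper are hypothetical and are not taken from the lecture.

```python
from collections import defaultdict

# Hypothetical fact table: each row is one sale (region, year, amount).
sales = [
    {"region": "East", "year": 2014, "amount": 120.0},
    {"region": "East", "year": 2015, "amount": 80.0},
    {"region": "West", "year": 2014, "amount": 200.0},
    {"region": "West", "year": 2015, "amount": 50.0},
]


def roll_up(rows, dimension, measure, predicate=lambda row: True):
    """Sum `measure` per value of `dimension` over rows that satisfy `predicate`."""
    totals = defaultdict(float)
    for row in rows:
        if predicate(row):
            totals[row[dimension]] += row[measure]
    return dict(totals)


# Aggregate by region over all years, then by region for 2015 only.
by_region = roll_up(sales, "region", "amount")
by_region_2015 = roll_up(sales, "region", "amount", lambda r: r["year"] == 2015)
```

The `predicate` argument stands in for the "complex criteria" bullet: any row-level condition can be passed in without changing the aggregation itself.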

Page 5

Warehouse Evaluation

Enterprise-wide support
Consistency and integration across diverse domains
Security support
Support for operational users
Flexible access for decision makers

Page 6

Data Integration

Data access
Data federation
Change capture
Need ETL (extraction, transformation, load)
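
The three ETL steps can be sketched as below. This is a minimal illustration only: the two source systems (`crm_rows`, `billing_rows`), their field names, and the date formats are assumptions made for the example, and the "warehouse" is just a Python list.

```python
from datetime import date, datetime

# Hypothetical source extracts: two systems that record customers differently.
crm_rows = [{"Name": " Alice ", "Joined": "2014-03-01"}]
billing_rows = [{"customer": "BOB", "start": "01/04/2015"}]  # day/month/year


def extract():
    """Extraction: pull raw rows from each source, tagged with their origin."""
    return [("crm", r) for r in crm_rows] + [("billing", r) for r in billing_rows]


def transform(tagged_rows):
    """Transformation: clean names and convert dates to one standard format."""
    out = []
    for source, row in tagged_rows:
        if source == "crm":
            name = row["Name"]
            joined = date.fromisoformat(row["Joined"])
        else:
            name = row["customer"]
            joined = datetime.strptime(row["start"], "%d/%m/%Y").date()
        out.append(
            {"name": name.strip().title(), "joined": joined.isoformat(), "source": source}
        )
    return out


def load(rows, warehouse):
    """Load: append the standardized rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
```

Keeping the `source` field on every loaded row is one simple way to retain provenance for later quality checks.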

Page 7

Data Warehouse Users

Internal users
  – Employees
  – Managerial
External users
  – Reporting and auditing
  – Research

Page 8

Data Mining

Databases to be mined
Knowledge to be mined
Techniques used
Applications supported

Page 9

Data Mining Task

DM: mostly automated
Prediction Tasks
  – Use some variables to predict unknown or future values of other variables
Description Tasks
  – Find human-interpretable patterns that describe the data

Page 10

Common Tasks

Classification [Predictive]
Clustering [Descriptive]
Association Rule Mining [Descriptive]
Regression [Predictive]
Deviation Detection [Predictive]

Page 11

Security for Data Warehousing

Establish the organization's security policies and procedures
Implement logical access control
Restrict physical access
Establish internal control and auditing

Page 12

Data Warehousing Issues: Integrity

Poor quality data: inaccurate, incomplete, missing meta-data
Loss of traditional consistency, e.g., keys
Source data quality vs. derived data quality
  – Trust in the result of analysis?

Page 13

Big Data Security and Privacy

Amount of data being considered
Privacy-preserving analytics
Granular Access Control
  – Flat, two-dimensional tables
Transaction logs and auditing
Real-time monitoring

Page 14

Big Data Integrity

Data Accuracy
Source provenance
End-point filtering and validation

Page 15

Access Control

Layered defense:
  – Access to processes that extract operational data
  – Access to data and process that transforms operational data
  – Access to data and meta-data in the warehouse
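
One way to picture the layered defense is a separate permission check at each layer, so that access to one layer never implies access to another. The sketch below assumes a simple role-to-layer mapping; the role names, layer names, and helper functions are hypothetical and do not reproduce the extended RBAC design from the assigned reading.

```python
# Hypothetical role-to-layer grants; names are illustrative only.
PERMISSIONS = {
    "etl_operator": {"extract", "transform"},
    "warehouse_admin": {"extract", "transform", "warehouse_data", "warehouse_metadata"},
    "analyst": {"warehouse_data"},
}


def is_allowed(role, layer):
    """Return True only if the role has been granted access to the given layer."""
    return layer in PERMISSIONS.get(role, set())


def require(role, layer):
    """Raise PermissionError unless the role may access the layer."""
    if not is_allowed(role, layer):
        raise PermissionError(f"{role} may not access {layer}")


# An analyst can read warehouse data but cannot touch the extraction layer.
analyst_can_read = is_allowed("analyst", "warehouse_data")
analyst_can_extract = is_allowed("analyst", "extract")
```

Unknown roles fall through to an empty grant set, so the default is to deny.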

Page 16

Access Control Issues

Mapping from local to warehouse policies
How to handle “new” data
Scalability
Identity Management

Page 17

Inference Problem

Data Mining: discover “new knowledge”; how to evaluate security risks?
Example security risks:
  – Prediction of sensitive information
  – Misuse of information
Assurance of “discovery”

Page 18

Privacy and Sensitivity

Large volume of private (personal) data
Need:
  – Proper acquisition, maintenance, usage, and retention policy
  – Integrity verification
  – Control of analysis methods (aggregation may reveal sensitive data)
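
To see how aggregation may reveal sensitive data, consider a differencing sketch: two aggregate queries that each look harmless can be subtracted to recover one person's value. The salary table, the names, and the minimum-group-size rule below are made up for illustration.

```python
# Hypothetical salary table; names and values are made up.
salaries = {"ann": 50_000, "bob": 60_000, "cara": 70_000, "dan": 90_000}


def total_salary(names):
    """An aggregate query: sum of salaries over the selected people."""
    return sum(salaries[n] for n in names)


def guarded_total(names, min_group_size=3):
    """Same query, but refuse to answer for groups smaller than min_group_size."""
    if len(names) < min_group_size:
        raise ValueError("query set too small")
    return total_salary(names)


everyone = ["ann", "bob", "cara", "dan"]
all_but_dan = ["ann", "bob", "cara"]

# Differencing: two permitted aggregates reveal one person's exact salary.
inferred_dan = guarded_total(everyone) - guarded_total(all_but_dan)
```

Both queries pass the size check, yet their difference equals `salaries["dan"]`, which is why restricting query-set size alone does not control what the analysis discloses.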

Page 19

Privacy

What is the difference between confidentiality and privacy?
Identity, location, activity, etc.
Anonymity vs. accountability

Page 20

Legislation

Privacy Act of 1974, U.S. Department of Justice (http://www.usdoj.gov/oip/04_7_1.html)
Family Educational Rights and Privacy Act (FERPA), U.S. Department of Education (http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html)
Health Insurance Portability and Accountability Act of 1996 (HIPAA) (http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act)
Telecommunications Consumer Privacy Act (http://www.answers.com/topic/electronic-communications-privacy-act)

Page 21

Online Social Network

Social Relationship
  – Communication context changes social relationships
  – Social relationships maintained through different media grow at different rates and to different depths
  – No clear consensus on which medium is the best

Page 22

Internet and Social Relationships

Internet
  – Bridges distance at a low cost
  – New participants tend to “like” each other more
  – Less stressful than face-to-face meeting
  – People focus on communicating their “selves” (except a few malicious users)

Page 23

Social Network

Description of the social structure between actors
Connections: various levels of social familiarities, e.g., from casual acquaintance to close familial bonds
Support online interaction and content sharing

Page 24

Social Network Analysis

The mapping and measuring of relationships and flows between people, groups, organizations, computers or other information processing entities
Behavioral Profiling
Note: Social Network Signatures
  – User names may change, family and friends are more difficult to change
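
The signature idea can be illustrated by comparing friend sets: even under a new user name, an account whose contacts overlap heavily with a known profile stands out. The sketch below uses Jaccard similarity as the overlap measure, which is a choice made for this example and not a method named in the slides; all handles and friend lists are invented.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B| (0.0 if both empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical data: friends of a known profile and of two new accounts.
known_profile_friends = {"kim", "lee", "max", "noa"}
candidates = {
    "new_handle_1": {"kim", "lee", "max", "zed"},
    "new_handle_2": {"amy", "ben"},
}

# Score each new account by how much its friend set overlaps the known one.
scores = {handle: jaccard(known_profile_friends, friends)
          for handle, friends in candidates.items()}
best_match = max(scores, key=scores.get)
```

Here `new_handle_1` shares three of five distinct contacts with the known profile, so it is the likelier re-registration despite the changed name.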

Page 25

Interesting Read:

M. Chew, D. Balfanz, B. Laurie, (Under)mining Privacy in Social Networks, http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4468

Page 26

Next

Web application insecurity: risk to databases