Privacy, Confidentiality and Data Security (PCDS) in HSR: Best Practices. Alan M. Zaslavsky, Department of Health Care Policy, Harvard Medical School
Mar 27, 2015

Privacy, Confidentiality and Data Security (PCDS) in HSR: Best Practices

Alan M. Zaslavsky
Department of Health Care Policy
Harvard Medical School

Privacy, Confidentiality and Data Security (PCDS)

• Importance and sensitivity of PCDS

• Basic concepts of disclosure risk
  – Deidentification and reidentification
  – Disclosure control

• Institutional and regulatory frameworks
  – Common Rule, HIPAA, Data use agreements

• File organization, data flow and computer security

• This presentation is offered in our department at least annually
  – Required attendance by all programmers, students, fellows, and project managers with data responsibilities
  – Presented to faculty at meetings
  – Shortened version for lower-level staff
  – Tracking of attendance by personnel manager
  – Sanction is loss of computer account

• Seek to fully involve project management in PCDS issues

Definitions

• Privacy: the right of an individual to keep information about herself or himself from others.

• Confidentiality: safeguarding, by a recipient, of information about another individual

• Disclosure: release (direct or indirect) of information about an identifiable individual

Definitions (continued)

• Data security: protections on data to prevent unauthorized access or destruction

• Informed consent: a person's agreement to allow personal data to be provided for research and statistical purposes

• Research: study producing generalizable knowledge
  – excludes internal operations, quality assurance

Importance of PCDS

Nexus for balance between

• benefits of information to society

• possible harms of information use to individuals

in conducting the research enterprise.

One person’s “invasion of privacy” is another’s “essential use of information.”

Inherent conflicts

• Law enforcement / legal process

• General access to research data
  – Freedom of Information Act (FOIA)

• Commercial use / beneficial products & services?

• Prevention of harm

• Need to save data for verification, revision

Costs of violations of PCDS

• Damage to subjects
  – Material
  – Psychological/social

• Damage to the research enterprise

• Exposure to legal/administrative sanctions for researchers and data providers and their institutions

Direct and indirect identifiers

Key: a variable or combination of variables whose value makes a record unique in both the target and the population data.

Direct identifier: Information that is uniquely associated with a person.

Indirect identifier: Data that, in combination, are uniquely associated with a person; also, information that facilitates such associations.

Direct Identifiers (keys)

• Name
• Telephone number
• Street / e-mail address
• Unique features (SSN, Medicare ID, Health plan, Medical record #, Certificate/License, voice prints, fingerprints, photos)

Re-identification by Matching

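De-identification
  Original target file:   Name, a b c d e f g h i j k l
  Anonymized target file: a b c d e f g h i j k l

Re-identification
  Anonymized target file: a b c d e f g h i j k l
  Population file:        a b c d e f m n o p, Name
  Re-identification key:  the variables the two files share (a b c d e f)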
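The matching step can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the presentation: the field names (`birthdate`, `gender`, `zip`, `diagnosis`), the placeholder people, and the function `reidentify` are all assumptions made up for the example. A record counts as re-identified only when its key values match exactly one person in the population file.

```python
def reidentify(anonymized, population, key):
    """Attach names to anonymized records whose key values match exactly one person."""
    # Index the population file by the values of the shared key variables.
    index = {}
    for person in population:
        index.setdefault(tuple(person[k] for k in key), []).append(person["name"])

    matches = []
    for record in anonymized:
        names = index.get(tuple(record[k] for k in key), [])
        # Only a unique match re-identifies the record with certainty.
        if len(names) == 1:
            matches.append({**record, "name": names[0]})
    return matches


# Hypothetical data: the anonymized file has no names, the population file does.
key = ("birthdate", "gender", "zip")
anonymized = [
    {"birthdate": "1950-03-14", "gender": "F", "zip": "02138", "diagnosis": "asthma"},
    {"birthdate": "1962-07-01", "gender": "M", "zip": "02139", "diagnosis": "diabetes"},
]
population = [
    {"name": "Person A", "birthdate": "1950-03-14", "gender": "F", "zip": "02138"},
    {"name": "Person B", "birthdate": "1962-07-01", "gender": "M", "zip": "02139"},
    {"name": "Person C", "birthdate": "1962-07-01", "gender": "M", "zip": "02139"},
]
linked = reidentify(anonymized, population, key)
print(linked)
```

In this toy case only the first record is linked to a name; the second matches two people, so the match is ambiguous.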

Data in Combination

Variables that are not identifying by themselves might be identifying in combination

• Month, day and year of birth
• Gender
• Zip code

Example of reidentification using three variables

Variables              % unique in Maine state voter registration list
Birthdate alone        12
Birthdate + gender     29
Birthdate + Zip (5)    69
Birthdate + Zip (9)    97

Sweeney, 1997
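A "% unique" figure of this kind can be computed for any file by counting how many records have a combination of values that no other record shares. The sketch below uses four invented voter records and hypothetical field names (`birthdate`, `gender`, `zip5`); it does not reproduce the numbers in the table.

```python
from collections import Counter


def percent_unique(records, variables):
    """Percentage of records whose values on `variables` occur only once."""
    if not records:
        return 0.0
    counts = Counter(tuple(r[v] for v in variables) for r in records)
    unique = sum(1 for r in records if counts[tuple(r[v] for v in variables)] == 1)
    return 100.0 * unique / len(records)


# Invented records, for illustration only.
voters = [
    {"birthdate": "1950-03-14", "gender": "F", "zip5": "04101"},
    {"birthdate": "1950-03-14", "gender": "M", "zip5": "04101"},
    {"birthdate": "1962-07-01", "gender": "M", "zip5": "04101"},
    {"birthdate": "1962-07-01", "gender": "M", "zip5": "04102"},
]

for variables in (["birthdate"],
                  ["birthdate", "gender"],
                  ["birthdate", "gender", "zip5"]):
    print(variables, percent_unique(voters, variables))
```

As in the table, uniqueness rises as variables are added to the key.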

Population (External) Data Bases

• Voter Registration Lists

• Research files

• State & Federal Files
  – Survey files with added administrative data

• Information Vendor Files

• The unknown: what might an “intruder” know about some or all members of your population?

Identifiable population groups (entire data set highly identifiable)

• Rare diseases

• Sample drawn from a particular area

Unique/unusual cases: rare values

• 110-year-old woman

• Man who weighs 350 pounds

• Income > $100 million

• Verbatim text containing identifying details

Unique/unusual cases: rare combinations of values

• 16-year-old widow
• 20-year-old Ph.D.
• Asian race in rural Midwest
• Female/Asian executive
• 60-year-old male married to 30-year-old female
• Cause of death = prostate cancer for 30-year-old male

Micro Data Protection 1

• Remove direct identifiers
• Restrict geographical detail
• Code to remove detail
  – larger categories, top/bottom coding
• Remove, code or edit verbatim comments
• Case suppression
• Variable suppression
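Several of these steps are simple record-level transformations. The following sketch shows one possible form, under loudly stated assumptions: the field names, the age cap of 90, the three-digit ZIP prefix and the income bands are all illustrative choices, not values given in the presentation.

```python
def protect_record(record, age_cap=90, zip_digits=3):
    """Return a coarsened copy of one microdata record (illustrative only)."""
    direct_identifiers = {"name", "phone", "ssn"}  # hypothetical field names
    out = {k: v for k, v in record.items() if k not in direct_identifiers}

    # Top-coding: ages above the cap are reported as the cap.
    out["age"] = min(record["age"], age_cap)

    # Restrict geographical detail: keep only the leading ZIP digits.
    out["zip"] = record["zip"][:zip_digits]

    # Larger categories: replace exact income with a band (assumed cut points).
    income = record["income"]
    if income < 25_000:
        out["income"] = "<25k"
    elif income < 50_000:
        out["income"] = "25-50k"
    else:
        out["income"] = "50k+"

    # Variable suppression: drop verbatim comments entirely.
    out.pop("comments", None)
    return out


raw = {
    "name": "Person A", "phone": "555-0100", "ssn": "000-00-0000",
    "age": 110, "zip": "02138", "income": 120_000_000,
    "comments": "Lives next to the town hall.",
}
protected = protect_record(raw)
print(protected)
```

The 110-year-old with a very large income from the earlier slides becomes an unremarkable record in the top age and income categories.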

Micro Data Protection 2

• Special handling (e.g. coding) of data from external sources (esp. area data)

• Statistical modification (“noise”)

• Sample/subsample

• Eliminate link between persons and establishments
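Two of these techniques, statistical noise and subsampling, can be sketched with the Python standard library. The function names, the Gaussian noise model and the parameters below are assumptions chosen for illustration; real disclosure-control noise is usually designed so that key statistics are preserved.

```python
import random


def add_noise(values, sd, seed=None):
    """Add independent Gaussian noise with standard deviation `sd` to each value."""
    rng = random.Random(seed)
    return [v + rng.gauss(0, sd) for v in values]


def subsample(records, fraction, seed=None):
    """Release only a random fraction of the records."""
    rng = random.Random(seed)
    k = round(len(records) * fraction)
    return rng.sample(records, k)


incomes = [31_000, 47_500, 52_000, 88_000]
noisy = add_noise(incomes, sd=1_000, seed=0)
released = subsample(list(range(10)), fraction=0.3, seed=1)
print(noisy)
print(released)
```

Subsampling helps because an intruder can no longer be sure that a given person is in the released file at all.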

Tabular data

• Information on individuals deduced from unique cases in tables

• Reidentification usually related to small groups, small cell counts

• Rounding, cell suppression, complementary suppression might be required

Disclosure of individual information from a table

                     Cancer type
Income ($’000)   Colon   Lung   Kidney   Breast
<10                 60     80        0       24
10-25               25     36        0       36
25-50               19     12        2       17
>50                 22     14        0       35
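Both problems in this table can be found mechanically: a small nonzero cell, and a column whose cases all fall in one row (here, anyone known to have kidney cancer is revealed to be in the 25-50 income group). The sketch below is one way to check for them; the threshold of 5 and the function names are assumptions, not rules from the presentation.

```python
table = {
    "<10":   {"Colon": 60, "Lung": 80, "Kidney": 0, "Breast": 24},
    "10-25": {"Colon": 25, "Lung": 36, "Kidney": 0, "Breast": 36},
    "25-50": {"Colon": 19, "Lung": 12, "Kidney": 2, "Breast": 17},
    ">50":   {"Colon": 22, "Lung": 14, "Kidney": 0, "Breast": 35},
}


def small_cells(table, threshold=5):
    """(row, column) pairs whose nonzero count is below the threshold."""
    return [(row, col)
            for row, cells in table.items()
            for col, n in cells.items()
            if 0 < n < threshold]


def single_row_columns(table):
    """Columns whose cases all fall in one row, so the column reveals the row."""
    found = {}
    columns = next(iter(table.values())).keys()
    for col in columns:
        rows = [row for row, cells in table.items() if cells[col] > 0]
        if len(rows) == 1:
            found[col] = rows[0]
    return found


print(small_cells(table))
print(single_row_columns(table))
```

Suppressing the flagged cell alone is not enough if row and column totals are published, which is why complementary suppression may be needed.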

Technical issues

• Highly technical issues in both microdata and tabular nondisclosure
  – Intersection of stats, math, computer science

• Software for detecting disclosure risk
  – RTI, ARGUS, etc.

• Nontechnical variables
  – Resources and intentions of “intruder”

Disclosure control in released data

• Affects us as producers and consumers of data

• Masking
  – Affects analyses if performed on data we receive
  – Complex to implement on our releases

• Limited access data centers

Restricted access data centers

• Alternative to fully-deidentified public-use microdata files

• Data are held at restricted center
  – Limited set of researchers submit analyses through intermediaries
  – Output reviewed for nondisclosure

• Only feasible for organizations with substantial, persistent resources
  – e.g. NCHS, Census

Institutional and regulatory frameworks for PCDS

• Common Rule / IRB

• HIPAA

• Data Use Agreements

• State regulations

Common Rule

• Governs protection of research subjects in all Federally-funded research
  – IRB evaluates adherence by researcher
  – Institutional sanctions for violations
  – Many institutions extend to all research

• Objective: protection of subject from harm
  – In HSR, often there is no intervention
  – Typically, commitment to minimal risk of disclosure

Common Rule (continued)

• Informed consent
  – generally required in primary data collection
  – appropriate information about use of data
  – might be waived where impractical to obtain (e.g. intrusive), if risks minimal & rights not injured

• Exemption from (full) review
  – No intervention that could harm subject
  – Secondary data with no identifiable data
  – Requires determination by IRB (but less tedious)

Implications for researchers

• Commitments are made
  – To subjects: consent language
  – To IRB: safeguards promised in IRB application
  – To funding agencies: in grant application

• May involve
  – Protection of data while used
  – Limits on duration of use

HIPAA: Health Insurance Portability and Accountability Act

• Specific rules for electronic transmission of health data
  – Primarily for efficiency but includes Privacy Rule

• Obligations imposed on health care providers
  – Includes direct providers, health plans and insurers
  – Research data distinguished from health plan / provider operational functions

• Researchers must respect these obligations

Who is Covered by HIPAA?

• A health care provider who transmits health information in electronic transactions
  – Example: a physician or hospital that electronically bills for services

• A health plan

• A health care clearinghouse

HIPAA implications for research

• Practical implications of HIPAA
  – What data providers will be looking for
  – Need to work around restrictions on content
  – More elaborate paths for data control

• HIPAA provisions for releasing data for research
  – fully deidentified
  – limited use dataset
  – waiver

Option 1: De-identified Health Information

• Completely de-identified information (18 elements removed) and no knowledge that remaining information can identify the individual.

OR

• Statistically “de-identified” information where a qualified statistician determines that there is a “very small risk” that the information could be used to identify the individual and documents the methods and analysis.

Removal of These Identifiers Makes Information De-identified

– Names
– Geographic info (including city and ZIP)
– Elements of dates (except year)
– Telephone #s
– Fax #s
– E-mail address
– Social Security #
– Medical record, prescription #s
– Health plan beneficiary #s
– Account #s
– Certificate/license #s
– VIN and serial #s, license plate #s
– Device identifiers, serial #s
– Web URLs
– IP address #s
– Biometric identifiers (finger prints)
– Full face, comparable photo images
– Unique identifying #s

If the covered entity has actual knowledge that remaining information can be used to identify the individual, the information is considered individually identifiable, and therefore, generally is PHI.
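For a flat record, removing listed identifiers amounts to dropping fields and reducing dates to the year. The sketch below is a simplified illustration only: the field names are hypothetical stand-ins for a few of the 18 elements, and dropping fields does not by itself satisfy the rule, which also requires no actual knowledge that the remaining information can identify the individual.

```python
from datetime import date

# Hypothetical field names standing in for some of the listed identifiers.
IDENTIFIER_FIELDS = {
    "name", "city", "zip", "phone", "fax", "email", "ssn",
    "medical_record_no", "health_plan_id", "account_no",
}
# Hypothetical date fields; only the year is kept.
DATE_FIELDS = {"birth_date", "admission_date"}


def strip_identifiers(record):
    """Drop identifier fields and reduce dates to the year (illustrative only)."""
    out = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            continue
        if field in DATE_FIELDS:
            out[field.replace("_date", "_year")] = value.year
        else:
            out[field] = value
    return out


patient = {
    "name": "Person A",
    "zip": "02138",
    "ssn": "000-00-0000",
    "birth_date": date(1950, 3, 14),
    "diagnosis": "asthma",
}
deidentified = strip_identifiers(patient)
print(deidentified)
```

Free-text fields are not handled here; verbatim comments can carry any of these identifiers and need separate review.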

Option 2: Limited Data Set with Data Use Agreement

• The Privacy Rule permits limited types of identifiers to be released for research with health information (referred to as a Limited Data Set).

• Limited Data Sets can only be used and released in accordance with a Data Use Agreement between the covered entity and the recipient.

Limited Data Set w/ Data Use Agreement

• The Limited Data Set CAN contain
  – Elements of dates
  – City and ZIP
  – Other unique identifiers, characteristics and codes not previously listed as direct identifiers (previous slide)

• CANNOT contain other direct identifiers (among the 18)

Option 3: Waiver of Authorization

May use or disclose personal information for research if IRB or Privacy Board determines that:
  – research involves no more than minimal risk
  – research does not adversely affect the “rights and welfare” of subjects
  – the research could not be done without a waiver

Data Use Agreements (DUA)

• Between data provider and data user

• Restrictions:
  – access by specific personnel
  – use for a specific reason
  – defined duration of retention

• Implements commitments made by data provider

State regulations

• Variable from state to state

• Some are relatively restrictive
  – requires negotiation with data provider

Iron-clad protection?

• Certificate of Confidentiality
  – Issued by DHHS
  – Protects data against legal process
  – Typically for sensitive topics, e.g. illicit drugs

• O, Canada!

Data security in complex projects

• Multisite projects: special needs

• Careful mapping of data flow and access

• Minimal identifying information at each stage

• Particular care in technical aspects of security

Example of a data flow plan (with security provisions)

File management for PCDS

• General practices of good management
  – Practices necessary to maintain project continuity

• Well-structured directory organization and naming

• Include documentation with files
• Separate project data from personal directories
• Separate datasets from programs
• Separate raw data from analytic datasets
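One way to make these separations routine is to create every project from the same skeleton. The layout below (`data/raw`, `data/analytic`, `programs`, `docs`) and the function `create_project` are assumptions for illustration; the presentation does not prescribe particular directory names.

```python
import tempfile
from pathlib import Path

# Assumed layout: raw data, analytic datasets, programs and documentation
# each get their own directory under a shared project root.
LAYOUT = ["data/raw", "data/analytic", "programs", "docs"]


def create_project(root, name):
    """Create a project skeleton outside any personal directory."""
    project = Path(root) / name
    for sub in LAYOUT:
        (project / sub).mkdir(parents=True, exist_ok=True)
    readme = project / "docs" / "README.txt"
    if not readme.exists():
        readme.write_text("Describe data sources, DUA terms and file contents here.\n")
    return project


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    project = create_project(tmp, "demo_study")
    created = sorted(p.relative_to(project).as_posix()
                     for p in project.rglob("*") if p.is_dir())
    print(created)
```

A fixed layout also makes project-specific backups and permission settings easier to apply, since raw identifiable data always sits in a known place.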

• We typically follow this presentation with a 15-minute tutorial on good practices for data and file management

Backups

• Conflict of privacy/confidentiality (restrict) and data security (maintain)

• Basic backup schedule (undeletable)
  – All Unix files: 4-month retention
  – PC files: 2-month retention

• Project-specific backup: by request
  – Only possible if material is properly organized
  – Permanent media, physical security

• The backup policy described here was adopted after several months of faculty discussion
  – Computer system managers wanted longer retention
  – Faculty concerned about unexpected discovery of material intended to be deleted
  – Conflicts of DUA requirements with rules regarding retention of data for verification, revision of manuscripts, etc.

General computer security

• Proper use of computer accounts, only by authorized individuals

• Secure connections for outside access
  – Remote users
  – Home or “on road” access via Internet
  – Applications can be “tunneled” securely

• Good practices with passwords

• Maintain file permissions to restrict access to authorized users
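On the Unix side, file permissions can be checked and tightened from a script. The sketch below assumes a POSIX file system (Windows access control works differently and is not covered); the function names are made up for the example, and the policy shown, owner-only access, is just one possible choice.

```python
import os
import stat
import tempfile


def is_exposed(path):
    """True if group or other users have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_to_owner(path):
    """Remove all group and other permissions, keeping the owner's."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & stat.S_IRWXU)


# Demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
os.chmod(demo_path, 0o644)            # readable by everyone
exposed_before = is_exposed(demo_path)
restrict_to_owner(demo_path)
exposed_after = is_exposed(demo_path)
final_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
os.remove(demo_path)
print(exposed_before, exposed_after, oct(final_mode))
```

Where several project members need access, a shared project group with group permissions is the usual alternative to owner-only files.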

• We follow this up with a training on mechanics of computer security
  – Permissions, file organization, etc.

• More or less fine-grained tools for protection of various files

• IT staff included in training
  – Responsible for implementing security and data retention policies for various project datasets

• Teach methods for both Unix and Windows sides of our system

Conclusions

• Know your data

• Be prepared to accommodate restrictions required by data providers

• Maintain general security

• Seek guidance for tough situations!