The analyses upon which this publication is based were performed under Contract Number HHSM-500-2009-00046C sponsored by the Center for Medicare and Medicaid Services, Department of Health and Human Services. Current issues and challenges in sharing biomedical human subjects data OASIS 2014 Lucila Ohno-Machado, MD, PhD Division of Biomedical Informatics University of California San Diego Oasis 2014

Dec 30, 2015


Megan Conley
Page 1

The analyses upon which this publication is based were performed under Contract Number HHSM-500-2009-00046C sponsored by the Center for Medicare and Medicaid Services, Department of Health and Human Services.

Current issues and challenges in sharing biomedical human subjects data

OASIS 2014

Lucila Ohno-Machado, MD, PhD
Division of Biomedical Informatics
University of California San Diego

Page 2

Personalized Healthcare

What is the influence of genetics, environment?

Which therapies work best for individual patients?

Page 3

Person-Centered Outcomes Research

• Genome
  – Sequencing data
• Phenotype
  – Personal monitoring
    • Blood pressure, glucose
  – Personal health records
  – Behavior monitoring
    • Adherence to medication, exercise
• Environment
  – Air sensors, food quality
  – Location

Source: DOE

Page 4

Where does knowledge come from?

• Controlled studies with strict eligibility criteria
• Does this apply to me?

Hopefully, but we need a lot of data to answer this question:
• We need to build infrastructure to access large data repositories
  – Lower the barriers to share data
• We need to share tools to analyze the data
  – Algorithms and computational facilities

Page 5

Big Data, Medium Data, and Small Data

• Data integration across biological scales
• Data annotation and harmonization
• Data ‘anonymization’ and privacy preservation

Page 6

Data for Personalized Medicine

Prevention, Diagnosis and Therapy
– Genetic predisposition
– Biomarkers
– Pharmacogenomics
– Health records
– Sensors

Handling Protected Health Information
– Secure Electronic Environment
  • Electronic Health Records
  • Genetic Data

Page 7

Sharing Data

• Sharing data today
  – Data sharing plans required
    • Little incentive to actually share
  – One model: users download data
  – Yes/No decision on sharing
• Data use agreements across institutions
  – Pairwise, limited and complicated
  – Specific to a particular study
  – Resources for sharing are limited
  – Security/privacy constraints are hard for small institutions to follow

Page 8

National Centers for Biomedical Computing and iDASH

Page 9

Mission

“A national center for biomedical computing that develops new algorithms, open-source tools, computational infrastructure, and services that will enable biomedical and behavioral researchers nationwide to integrate Data for Analysis, ‘anonymization,’ and Sharing”

Page 10

Vision

• Share access to data and computation
  – Allow healthcare providers to focus on care, biomedical researchers to focus on research
  – Provide software, platform, and infrastructure
  – Protect privacy
  – Share
    • Data
    • Workflows
    • Computation
    • Security
    • Policies

Page 11

Models for Data Sharing

• Cloud Storage: data exported for computation elsewhere
  – Users download data from the cloud
• Cloud Compute and Virtualization: computation goes to the data
  – Users analyze data in the cloud
  – Users download virtual machines

Page 12

Three Different Models for Data Sharing

1. Users download data
2. Users compute in a central facility
3. Users install software that operates on their data and transmits results of operations (e.g., queries, analyses)

Page 13

Model 1: Users download data

• “De-identification” may be necessary
• Encrypted transmission
• Data Use Agreement Central
  Lawyers from the University of California helped write:
  – Data Contributor Agreement
    • Who can have access for what purpose
  – Data User Agreement
    • Terms of use
• iDASH serves as ‘agent’ for the data
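To make the “de-identification” step concrete, here is a minimal sketch of what preparing a record for download might involve. It is not the iDASH procedure and is not sufficient for HIPAA compliance on its own. The field names (name, mrn, birth_date, glucose_mg_dl), the set of direct identifiers, and the key are all hypothetical, chosen only for illustration.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical field names treated as direct identifiers and dropped outright.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(record: Dict[str, str], secret_key: bytes) -> Dict[str, str]:
    """Return a copy of the record with direct identifiers removed,
    the record number replaced by a keyed hash, and the birth date
    coarsened to a year."""
    out: Dict[str, str] = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if field == "mrn":
            # A keyed hash (HMAC) gives a stable pseudonym that cannot be
            # recomputed by someone who does not hold the key.
            digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
            out["pseudo_id"] = digest.hexdigest()[:16]
        elif field == "birth_date":
            out["birth_year"] = value[:4]  # assumes ISO dates, YYYY-MM-DD
        else:
            out[field] = value
    return out


example = pseudonymize(
    {
        "name": "Jane Doe",
        "mrn": "12345",
        "birth_date": "1970-05-17",
        "glucose_mg_dl": "98",
    },
    secret_key=b"demo-key-not-for-production",
)
```

In this sketch the key stays with the data contributor, so the same patient maps to the same pseudonym across exports while recipients cannot reverse the mapping.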

Page 14

Model 2: Users compute in central facility

• Securing the privacy of human subjects data including biometrics such as genomes

• There are known security issues with commercial clouds (business associate liability agreement mitigates some risks)

• A protected cloud compute environment is capable of operating on genomes and clinical data

• We have built this cloud environment in iDASH

Page 15

Infrastructure Security for Human Subjects Data

• HIPAA (Health Insurance Portability and Accountability Act) compliant computing environment
• Segmentation (Zones) of projects & functionality
• Physical and environmental protection of compute hardware
• Access control with Two Factor Authentication
• Secure (encrypted tunnel) system access and upload capability
• Centralized logging, intrusion detection
• Proxies and filters
• Hardened (secured) system configurations

Page 16

Model 3: Computation goes to the data

• Some health systems cannot host data outside their facilities (e.g., VA)

• Software can be sent to those facilities in order to build an overall model (e.g., regression)
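One way to picture “computation goes to the data” for a regression: each facility runs the same small routine on its own records and returns only summary statistics, and a coordinator combines them into one fitted model. The sketch below does this for a simple one-predictor linear regression. It is an illustration under stated assumptions, not the method used by iDASH; the names SiteSummary, summarize_site and fit_pooled_line and the toy data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class SiteSummary:
    """Sufficient statistics for y = intercept + slope * x at one site."""
    n: int
    sum_x: float
    sum_y: float
    sum_xx: float
    sum_xy: float


def summarize_site(records: Iterable[Tuple[float, float]]) -> SiteSummary:
    """Runs inside a facility; only the totals leave, never the rows."""
    n, sx, sy, sxx, sxy = 0, 0.0, 0.0, 0.0, 0.0
    for x, y in records:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return SiteSummary(n, sx, sy, sxx, sxy)


def fit_pooled_line(summaries: List[SiteSummary]) -> Tuple[float, float]:
    """Runs at the coordinator; returns (intercept, slope) for the pooled data."""
    n = sum(s.n for s in summaries)
    sx = sum(s.sum_x for s in summaries)
    sy = sum(s.sum_y for s in summaries)
    sxx = sum(s.sum_xx for s in summaries)
    sxy = sum(s.sum_xy for s in summaries)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x has no variance across sites; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data: two sites whose rows all lie on y = 2x + 1.
site_a = [(1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]
intercept, slope = fit_pooled_line(
    [summarize_site(site_a), summarize_site(site_b)]
)
```

The pooled fit equals what a single analysis over all rows would give, because ordinary least squares depends on the data only through these sums. Summaries from very small sites can still reveal information about individuals, so a real deployment would need additional safeguards.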

Page 17

University of California Research eXchange UC-ReX

1. UC Davis
2. UC Irvine
3. UC Los Angeles
4. UC San Diego
5. UC San Francisco

Funded by the UC Office of the President to the NIH-funded CTSAs

• Integration of Clinical Data Warehouses from 5 University of California Medical Centers and affiliated institutions (>10 million patients)
  – Aggregate and individual-level patient data will be accessible according to data use agreements and IRB approval
  – Distributed models to adjust for confounders
• Objectives
  – Monitor patient safety
  – Improve outcomes
  – Promote research
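As a rough illustration of aggregate access across several clinical data warehouses, the sketch below has each site count its own matching patients and report only that count, which a coordinator then sums. This is not the UC-ReX implementation. The site names, patient fields, and the small-cell suppression threshold (min_cell) are assumptions added for the example, and the sites are simulated in one process.

```python
from typing import Any, Callable, Dict, List, Optional

Patient = Dict[str, Any]


def local_count(patients: List[Patient],
                predicate: Callable[[Patient], bool]) -> int:
    """Runs at one site: how many local patients match the query."""
    return sum(1 for p in patients if predicate(p))


def pooled_count(site_data: Dict[str, List[Patient]],
                 predicate: Callable[[Patient], bool],
                 min_cell: int = 10) -> Optional[int]:
    """Coordinator: sum the per-site counts and suppress small totals.

    Returns None when the total is below min_cell (an assumed policy,
    not one stated in the slides).
    """
    total = sum(local_count(pts, predicate) for pts in site_data.values())
    return total if total >= min_cell else None


# Synthetic patients at two hypothetical sites.
sites = {
    "site_a": [{"age": 70, "diabetes": True}] * 6
              + [{"age": 40, "diabetes": True}] * 2,
    "site_b": [{"age": 66, "diabetes": True}] * 7
              + [{"age": 80, "diabetes": False}],
}


def older_diabetic(p: Patient) -> bool:
    return bool(p["diabetes"]) and p["age"] >= 65


result = pooled_count(sites, older_diabetic)
suppressed = pooled_count(sites, older_diabetic, min_cell=20)
```

In a real network each call to local_count would execute behind the contributing site's firewall, and only the integer would cross institutional boundaries.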

Page 18

Acknowledgements

• Slides contributed by the iDASH team

• Division of Biomedical Informatics

• Funding by NIH, AHRQ, PCORI, UCOP, UCSD