AWS Government, Education, and Nonprofit Symposium, Washington, DC | June 25-26, 2015
Perspectives from the NIH Associate Director for Data Science (ADDS) Office
Vivien Bonazzi, Ph.D.
Senior Advisor for Data Science Technologies & Innovation
NIH Office of the Associate Director for Data Science (ADDS)
BIOMEDICAL
NIH Data
NIH Addresses Big Data
• In response to the incredible growth of large biomedical (digital) datasets, the Director of NIH established a special Data and Informatics Working Group (DIWG).
• Volume, Velocity, Variety, Veracity
US Government Memo: Increasing Access to the Results of Federally Funded Scientific Research
In Feb 2013 the US OSTP issued a memo calling for all Federal Agencies to make digital assets from federally funded research available. Each agency’s public access plan shall:
• Maximize access, by the general public and without charge, to digitally formatted scientific data created with Federal funds while:
– i) protecting confidentiality and personal privacy,
– ii) recognizing proprietary interests, business confidential information, and intellectual property rights and avoiding significant negative impact on intellectual property rights, innovation, and U.S. competitiveness, and
– iii) preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden.
• Provide for the assessment of long-term needs for the preservation of scientific data and outline options for developing and sustaining repositories for scientific data in digital formats.
The Future of Open Data
• The nature of the scientific enterprise is evolving.
• Must transform into a digital enterprise (as have other industries: music, financial, advertising)
• To enable biomedical research as a digital enterprise through which new discoveries are made and knowledge generated by maximizing community engagement and productivity.
ADDS Mission Statement
To use data science to foster an open digital ecosystem that will accelerate efficient, cost-effective biomedical research to enhance health, lengthen life, and reduce illness and disability.
ADDS Strategy
• Discovery and Innovation: Enabling major scientific discovery and innovation through the BD2K Initiative
• Workforce Development: Strengthen the ability of a diverse biomedical workforce to develop and benefit from data science
• Policy and Process: Contribute to policies & processes involving data that further the NIH mission
• Leadership: Further visibility of NIH leadership in data science by the public, DHHS, USG at large, and international funders
• Sustainability: To foster a sustainable, efficient, and productive data science ecosystem: The Commons
The Commons: enabling the digital enterprise
What is The Commons?
• Treats products of research (data, methods, papers, etc.) as digital objects
• These digital objects exist in a shared virtual space
• Digital objects conform to FAIR principles:
– Findable
– Accessible (and usable)
– Interoperable
– Reusable
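One way to picture a digital object that conforms to FAIR principles is as a record whose fields can be checked mechanically. The sketch below is illustrative only: the `DigitalObject` class, its field names, and the mapping from fields to principles are assumptions made for this example, not part of any NIH or Commons specification.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record for one research product (data, method, paper)."""
    identifier: str = ""      # persistent ID such as a DOI (assumed: Findable)
    access_url: str = ""      # where the object can be retrieved (assumed: Accessible)
    media_type: str = ""      # documented format (assumed: Interoperable)
    license: str = ""         # terms of reuse (assumed: Reusable)
    metadata: dict = field(default_factory=dict)  # descriptive metadata (assumed: Findable)


# Illustrative mapping from each FAIR principle to the fields this sketch checks.
FAIR_FIELDS = {
    "Findable": ("identifier", "metadata"),
    "Accessible": ("access_url",),
    "Interoperable": ("media_type",),
    "Reusable": ("license",),
}


def fair_gaps(obj: DigitalObject) -> dict:
    """Return {principle: [empty field names]} for principles with missing fields."""
    gaps = {}
    for principle, names in FAIR_FIELDS.items():
        missing = [name for name in names if not getattr(obj, name)]
        if missing:
            gaps[principle] = missing
    return gaps


dataset = DigitalObject(
    identifier="doi:10.0000/example",         # placeholder, not a real DOI
    access_url="https://example.org/data/1",  # placeholder URL
    metadata={"title": "Example dataset"},
)
print(fair_gaps(dataset))
# prints {'Interoperable': ['media_type'], 'Reusable': ['license']}
```

A real compliance model would likely check far more than field presence (identifier resolution, metadata vocabularies, access protocols); the point here is only that each principle can be tied to concrete, testable properties.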
What is The Commons?
• A shared virtual space where scientists can find, deposit, manage, share, and reuse data, software, metadata, and workflows
• An environment to find and catalyze the use of shared digital research objects
The Commons: Components
• Computing environment
– cloud and/or HPC
– supports access, utilization, sharing and storage of digital objects
• Methods for interoperability
– enable connectivity, shareability and interoperability between digital objects
– APIs, containers (Docker, etc.)
• Digital object compliance model
– describes the properties of digital objects that enable them to be discoverable and
The shared responsibility model
[Diagram labels: "Audited"; "Customer + Partner"; "+ ="]
Responsibilities listed on the slide: Applications; Auth & acct management; Authorization policies; Proper service configuration; Network configuration; Security groups; OS firewalls; Operating systems
• Re-focus your security professionals on a subset of the problem
• Partners can further reduce that burden
• Take advantage of high levels of uniformity and automation
Genomics Data Security
Store and analyze restricted-access genomics on AWS
bit.ly/aws-dbgap
NIH security best practices
• Physical security
– Data center access and remote administrator access
• Electronic security
– User account security (for example, passwords)
– Use of access control lists (ACLs)
– Secure networking
– Encryption of data in transit and at rest
– OS and software patching
• Data access security
– Authorization of access to data
– Tracking copies; cleaning up after use
[Diagram: AWS service stack]
• Enterprise Applications: Virtual Desktops; Collaboration and Sharing
• Platform Services
– Databases: Caching, Relational, NoSQL
– Analytics: Hadoop, Real-time, Data Workflows, Data Warehouse
– App Services: Queuing, Orchestration, App Streaming, Transcoding, Email, Search
– Deployment & Management: Containers, DevOps Tools, Resource Templates, Usage Tracking, Monitoring and Logs
– Mobile Services: Identity, Sync, Mobile Analytics, Notifications
• Foundation Services: Compute (VMs, Auto Scaling and Load Balancing); Storage (Object, Block, and Archive); Security & Access Control; Networking
• Infrastructure: Regions; Availability Zones; CDN and Points of Presence
Amazon Virtual Private Cloud (Amazon VPC)
Create secure network configurations for working with sensitive data
[Diagram: AWS region – VPC network isolation. VPC 10.0.0.0/16 across AZ A and AZ B, with subnet 10.0.1.0/24 (DMZ) holding EC2 instance 10.0.1.11 and subnet 10.0.2.0/24 (Private) holding EC2 instance 10.0.2.12. Other labels: public address (23.20.103.11), Internet, Internet GW Service, Virtual Gateway]
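The address plan on this slide can be sanity-checked with Python's standard `ipaddress` module. The sketch below only verifies the arithmetic of the CIDR ranges shown (subnets inside the VPC, no overlap, instances in the expected subnet); the `check_plan` helper is a name invented for this example, and nothing here calls AWS.

```python
import ipaddress

# Address plan as read off the slide's diagram.
VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "DMZ": ipaddress.ip_network("10.0.1.0/24"),
    "Private": ipaddress.ip_network("10.0.2.0/24"),
}
INSTANCES = {
    "10.0.1.11": "DMZ",
    "10.0.2.12": "Private",
}


def check_plan(vpc, subnets, instances):
    """Return a list of problems found; an empty list means the plan is consistent."""
    problems = []
    nets = list(subnets.items())
    # Every subnet must sit inside the VPC's address range.
    for name, net in nets:
        if not net.subnet_of(vpc):
            problems.append(f"{name} subnet {net} is outside VPC {vpc}")
    # No two subnets may share addresses.
    for i, (name_a, net_a) in enumerate(nets):
        for name_b, net_b in nets[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{name_a} overlaps {name_b}")
    # Each instance address must fall in the subnet it is assigned to.
    for addr, expected in instances.items():
        if ipaddress.ip_address(addr) not in subnets[expected]:
            problems.append(f"{addr} is not in the {expected} subnet")
    return problems


print(check_plan(VPC, SUBNETS, INSTANCES))  # prints []
# The address shown in parentheses on the slide lies outside the private VPC range.
print(ipaddress.ip_address("23.20.103.11") in VPC)  # prints False
```

Isolation itself comes from route tables, security groups, and gateways, which this check does not model; it only catches addressing mistakes before they reach a network configuration.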
Encrypt your data prior to sending to AWS
[Diagram: encrypted data from your applications in your data center and your applications in Amazon EC2, sent to AWS services: Amazon S3, Amazon Glacier, Amazon Redshift, Amazon Elastic Block Store]
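Encryption: a brief primer
[Diagram: plaintext PHI is encrypted in hardware/software with a symmetric data key, giving encrypted PHI (encrypted data in storage). Key hierarchy: the symmetric data key is encrypted under a master key, giving an encrypted data key. Two elements of the diagram are marked "?"]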
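The key hierarchy in this primer (a data key encrypts the data, a master key encrypts the data key) is commonly called envelope encryption. Below is a minimal sketch of that flow using only the Python standard library. Important assumption: the `toy_encrypt` and `toy_decrypt` helpers are a stand-in cipher built from HMAC-SHA256 so the example runs without third-party packages; they are not AES, provide no integrity protection, and must not be used for real PHI. All function names are invented for this example.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY cipher core: XOR data with an HMAC-SHA256 keystream. Stand-in for AES only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Prefix a fresh random 16-byte nonce to the XORed body."""
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce and reverse the XOR."""
    return _keystream_xor(key, blob[:16], blob[16:])


def envelope_encrypt(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt data under a fresh data key, then encrypt that data key under the master key."""
    data_key = secrets.token_bytes(32)
    return {
        "encrypted_data": toy_encrypt(data_key, plaintext),
        "encrypted_data_key": toy_encrypt(master_key, data_key),
    }


def envelope_decrypt(master_key: bytes, envelope: dict) -> bytes:
    """Recover the data key with the master key, then recover the data."""
    data_key = toy_decrypt(master_key, envelope["encrypted_data_key"])
    return toy_decrypt(data_key, envelope["encrypted_data"])


master_key = secrets.token_bytes(32)
record = b"sample-id=0001;genotype=placeholder"  # made-up placeholder, not real PHI
envelope = envelope_encrypt(master_key, record)
print(envelope_decrypt(master_key, envelope) == record)  # prints True
```

Only the encrypted data key is stored next to the encrypted data, so the master key never has to touch the bulk data and can be kept in a separate, more tightly controlled system.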
Encryption of AWS storage services
Amazon EBS
Amazon S3
• HTTPS
• AES-256 server-side encryption
• AWS or customer-provided or customer-managed keys
• Each object gets its own key
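For Amazon S3, the three key options in the bullets above are usually selected per request through HTTP headers. The sketch below builds those headers with the standard library only; it does not send anything to S3. Reading "AWS keys", "customer-provided keys", and "customer-managed keys" as S3-managed keys, SSE-C, and AWS KMS keys respectively is an interpretation of the slide, and the function names and the KMS key ID are placeholders for this example.

```python
import base64
import hashlib
import secrets


def sse_s3_headers() -> dict:
    """Header asking S3 to encrypt the object with S3-managed keys."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id: str) -> dict:
    """Headers asking S3 to encrypt the object under a customer-managed AWS KMS key."""
    return {
        "x-amz-server-side-encryption": "aws:kms",
        "x-amz-server-side-encryption-aws-kms-key-id": kms_key_id,
    }


def sse_c_headers(key: bytes) -> dict:
    """Headers for server-side encryption with a customer-provided 256-bit key."""
    if len(key) != 32:
        raise ValueError("a customer-provided key must be 256 bits (32 bytes)")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself travels in the request, base64-encoded, so HTTPS is required.
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # Base64 of the key's MD5 digest lets S3 detect a corrupted key in transit.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


customer_key = secrets.token_bytes(32)
for name in sse_c_headers(customer_key):
    print(name)
print(sse_kms_headers("example-key-id"))  # "example-key-id" is a placeholder
```

With a customer-provided key, the same key headers must be supplied again to read the object back, so losing the key means losing the data.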