NCI Informatics 2014
Warren Kibbe
The views expressed are my own and not a reflection of DHHS, NIH or NCI policy
Some history
• Back to the dawn of time… my first BRIITE
And some very emotional imagery
BRIITE 2001
• November 16th and 17th
Fast forward to the end of 2011
• BSA informatics working group assembled in 2011
• BSA IWG report of 2011
• Ken steps down under enormous pressure and criticism in December 2011
• George Komatsoulis appointed acting director
• CBIIT is pounded by waves of uncertainty
CBIIT, 2013
• 71 Federal staff
• Serving 6500 NCI staff across 18 buildings
• 5 petabytes of NCI data
• 2.5 petabytes of TCGA data
• 2- 5 MW new data centers
• Just completed a rollout of Unified Communications to 5000 NCI staff
– 1.5 FTEs, now on loan to NIH CIT to deliver UC to 45000 desktops
DHHS Requirements
• FISMA Moderate
• Complete move to IPv6 by Oct 2014
• Data center consolidation
• Two-factor authentication
• Only government furnished equipment (‘GFE’) may connect to the network from outside (limits on VPN)
• Compensating controls…
• Tiered network, appropriate traffic monitoring and scanning
NCI General strategic objectives
• Reduce cancer risk – public health
• Improve cancer outcomes – better treatment and survivorship
• Educate providers and population
• Provide informative data and powerful examples
NCI CBIIT Guiding Principles
• Supporting the mission of the NCI
• Lowering barriers for the cancer community
• Promote the importance of informatics in solving problems in public health, healthcare, precision oncology, and basic research
• Build communities around problems
• Aggregate and disseminate knowledge
Using computing technology to reduce the incidence, suffering and mortality due to cancer
Highlights from the November National Cancer Forum Policy Summit
My outline
• Disruptive technologies
• Getting social
• What is big data?
• Open access to data
Disruptive Technologies
• Printing
• Steam power
• Transportation
• Electricity
• Antibiotics
• Semiconductors & VLSI design
• http
• High throughput biology
Systems view - end of reductionism?
Disruptive Technologies
• Printing
• Steam power
• Transportation
• Electricity
• Antibiotics
• Semiconductors & VLSI design
• http
• High throughput biology
• Ubiquitous computing
Everyone is a data provider
Data immersion
World:
6.6B active mobile contracts
1.9B smart phone contracts
1.1B land lines
World population 7.1B
US:
345M active mobile contracts
287M smart phone contracts
US population 313M
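To put the figures above on a common scale, here is a minimal sketch that converts them to contracts per person. The numbers are the ones quoted on the slide (in millions); the dictionary layout and the helper name `per_capita` are assumptions made for illustration only.

```python
# Figures as quoted on the slide, expressed in millions.
WORLD = {"population": 7100, "mobile": 6600, "smartphone": 1900, "landline": 1100}
US = {"population": 313, "mobile": 345, "smartphone": 287}


def per_capita(stats):
    """Return contracts per person for every key except the population itself."""
    population = stats["population"]
    return {kind: count / population
            for kind, count in stats.items()
            if kind != "population"}


world_rates = per_capita(WORLD)
us_rates = per_capita(US)

for region, rates in (("World", world_rates), ("US", us_rates)):
    for kind, rate in rates.items():
        print(f"{region} {kind}: {rate:.2f} contracts per person")
```

On these figures the US has more active mobile contracts than people, which is the sense in which everyone is a data provider.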
Getting Social
• Measuring behavior across a population
• Understanding behavior – can we provide better risk estimates for individuals?
• Social media is a big data opportunity – what are the ethics of big data?
• Synergize with the energy and immediacy of patient advocates
• Patients want more data sharing – how can we facilitate that appropriately?
This changes trial design. Until now, statistics has focused on how to design an appropriate sample so that the sample can be generalized to the population. What happens when we measure the ENTIRE population?
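As a hedged illustration of the sampling question above, take the textbook case of simple random sampling without replacement. This design is an assumption; the slide does not name one. The symbols are introduced here: $N$ is the population size, $n$ the sample size, $S^2$ the population variance, and $\bar{y}$ the sample mean.

```latex
\operatorname{Var}(\bar{y}) = \frac{S^2}{n}\left(1 - \frac{n}{N}\right)
```

When $n = N$ the factor in parentheses is zero, so the sampling error of the mean vanishes. What remains is measurement error and bias, which this formula does not capture.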
The future
• Elastic computing ‘clouds’
• Social networks
• Big Data analytics
• Precision medicine
• Measuring health
• Practicing protective medicine
Learning systems that enable learning from every cancer patient
Semantic and synoptic data
Intervening before health is compromised
Open Data Access
• We need to provide data access to people outside of biomedicine who have the skills and training to mine and analyze data
• More access will mean more innovation
Precision Oncology
• The era of precision medicine and precision oncology is predicated on the integration of research, care, and molecular medicine and the availability of data for modeling, risk analysis, and optimal care
How do we re-engineer translational research policies to enable a true learning healthcare system?
Consent
• In a learning healthcare system, we ‘learn’ from every patient who comes in for treatment. What is consent in this model? What is research?
• What role is there for standardized consent?
• Are there ways to reimagine translational research without consent? Would that help us?
CBIIT's mission – the long form
• CBIIT will help the cancer community coordinate, aggregate, disseminate, and promote cancer awareness, public health data, cancer risk reduction, novel treatments, quality of life and comparative effectiveness data, and basic and translational research outcomes
CBIIT strategic activities
• Promote social media as a mechanism for communication, education, and improving lifestyle choices
• Work productively with patient advocates
• Understand risk factors leading to cancer
• Support cancer models and modeling, e.g. cancer initiation and progression
• Promote precision oncology
• Promote learning healthcare systems
Informatics strategic objectives
• Lower barriers to data access, analysis and modeling
• Promote agility, flexibility, data liquidity
• Promote Open Access, Open Data, Open Source, Open Science
• Promote semantic interoperability, standards, CDEs and Case Report Forms
Informatics strategic objectives
• Promote mobile and BYOD for patient reported outcomes, education, surveillance, eligibility
• Use informatics to improve and lower barriers to clinical trials accrual
• Use informatics to blur the distinction between care and research – support clinical standards in research
• Identify and disseminate innovations and practices that make research more efficient and effective
Supporting Precision Oncology
• Help bring together imaging, molecular, pathology, labs, and clinical data in a highly structured and machine-readable way to enable detailed characterization and action for individual patients
Learning Healthcare Systems
• Enable the data flowing from precision medicine to form learning healthcare systems, where we better characterize, model and predict the response, outcomes and quality of life for every cancer patient
Public Health
• As a community we already know how to prevent 50% of the current cancer burden worldwide. Making more effective use of social media, mHealth approaches, and virtual communities should enable us to impact vaccination rates (HPV, EBV, mono, hepatitis) and promote healthy lifestyles, including diet, exercise, and smoking cessation.
Public Health
• These three factors – infectious disease, smoking, and poor nutrition and exercise – contribute to at least 50% of our current cancer burden. And the cost from loss of quality of life and from pain and suffering is incalculable.
Lowering barriers for the community
• Improve our patient-focused materials dissemination technology. What is our Social Media strategy? Partner with education, communication, and healthcare organizations writ broadly.
Opportunities in prevention
• How do we work together as a community to make our prevention, communication and education researchers more effective, and translate this to effect global change? We need to partner with social media and technology-savvy next-generation behavioral psychologists!
Lowering barriers for the community
• Simplify the creation and distribution of CDE-based forms. Use existing medical terminologies (SNOMED, ICD, LOINC, RxNorm) whenever possible. Link every concept to UMLS as soon as feasible.
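A minimal sketch of what a form built from terminology-linked data elements could look like, and how one might find elements not yet linked to UMLS. This is not the caDSR or EVS data model: the class names, field names, the helper `missing_umls`, and every code value are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConceptLink:
    terminology: str  # e.g. "SNOMED", "ICD", "LOINC", "RxNorm", "UMLS"
    code: str         # placeholder strings only; not real terminology codes


@dataclass
class DataElement:
    name: str
    question: str
    links: List[ConceptLink] = field(default_factory=list)

    def has_link(self, terminology: str) -> bool:
        """True if this element carries at least one code from the terminology."""
        return any(link.terminology == terminology for link in self.links)


def missing_umls(form: List[DataElement]) -> List[str]:
    """Names of the elements on a form that have no UMLS link yet."""
    return [element.name for element in form if not element.has_link("UMLS")]


# A hypothetical two-question form; all codes are placeholders.
form = [
    DataElement("tumor_site", "Primary tumor site?",
                [ConceptLink("SNOMED", "PLACEHOLDER-1"),
                 ConceptLink("UMLS", "PLACEHOLDER-2")]),
    DataElement("hemoglobin", "Hemoglobin result?",
                [ConceptLink("LOINC", "PLACEHOLDER-3")]),
]

print(missing_umls(form))
```

Keeping the terminology links on the data element rather than on the form means the same element can be reused across forms without re-annotating it.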
Lowering barriers for the community
• Simplify access to EVS, CDEs, NCI Thesaurus (knowledge dissemination too!)
– Ideally with NLM, CDISC, FDA, ONC, PCORI as partners
• Creative and appropriate security – we all will need to live in a FISMA moderate world
• Simplify data access – move toward a ‘library card’ model?
Collaborate with patients
• It is still a very rare event for patients or even patient advocates to be involved during the planning or implementation of any cancer informatics project
• We need to do better if we are going to meet the needs of our patients
• When the requests are impossible for us to meet with our existing processes and workflow, it is time to re-design and re-implement!
Precision Oncology
• As I mentioned with EVS and CDEs, we need to incorporate clinical standards into research wherever and whenever appropriate
• Our ability to semantically reason and make inference over diverse data types is critical to realizing the goals of Precision Oncology
• NLP, ontologies, checklists, CDEs embedded in forms will let us move to next gen data capture
Enabling Analytics
• If we have captured and annotated our data using reasonable, well-defined semantics, this will enable data mining and discovery
Molecular Medicine
• While this goes hand in hand with Precision Medicine, it requires a focus on automated, well-annotated data flows and multi-stage analysis/analytics. For instance, for next gen sequencing, there is primary stage data, secondary stage data, and tertiary stage data. These steps enable useful outputs, like BAM files, from each machine run. Imaging (functional MRI, high def optical, PET, CAT, etc.) has a similar (but more mature) data evaluation process.
Molecular Medicine
• Incorporating molecular results into clinical decision support is the end game. To make good decisions, we need to be constantly sampling and re-evaluating the latest outcomes. This dynamic model presents many problems – how do we do this with a high level of integrity and reliability while maintaining agility?
NCI activities
Just a few…
• EVS, NCI Thesaurus, NCI Metathesaurus
• CDEs, Case Report Forms
• RAS Initiative – hub at NCI Frederick
NCI activities
Just a few…
• NCI Cloud Pilot
– How technically can we bring community computation to large (2.5 petabyte) data sets?
– What is the sustainability model?
• TCGA re-imagined – Genomics Data Commons
– Many technologies used, many different QA and analysis pipelines
– Standardization and re-analysis of existing data
NCI Activities
Just a few…
• MATCH trial
– Initial findings from IMPACT
– Couples molecular findings with a decision tree for treatment
• Cooperative Groups & GBC
– Navigator
• FDA Clinical Trials Repository
– Janus
– Collaboration with the NCI
CBIIT NCIP activities
• Focus on clinical trials (MATCH, CTRP, CTR)
• Focus on translation
• Focus on imaging
• Focus on molecules
• Moving all projects to true open source
• Semantic Infrastructure: EVS, NCI Thesaurus, Metathesaurus, CDEs, CRFs
• HubZero as a collaborative space…
Questions?