Page 1
Webinar Session 5
Metabolomics and Beyond: Challenges and Strategies for Next-Gen Omic Analyses
Dr. Dmitry Grapov
Data Scientist,
CDS- Creative Data Solutions and
Genome Data Analytics,
Monsanto, USA
[email protected]
Please note that the webinars are presently free, courtesy of the Metabolomics Society, and will be uploaded to the society's website. Please feel free to contact us with any questions or suggestions via [email protected]
Page 2
Metabolomics and Beyond: Challenges and Strategies for Next-Gen Omic Analyses
Dmitry Grapov, PhD
CDS - Creative Data Solutions
Page 3
Background
Born: Minsk, Belarus, 1981
University of Utah, Salt Lake City, UT (2000-2007)
• B.S. Biology
• B.S. Chemistry
University of California, Davis, CA (2007-2012)
• Ph.D. Analytical Chemistry with Emphasis in Biotechnology
• Post doc, Oliver Fiehn Lab
Interests:
• Omics, integromics, microbials and big biological data
• Multivariate data analysis and visualization, machine learning and software design
WCMC
• Principal Statistician at the NIH West Coast Metabolomics Center (WCMC)
Data Scientist
• CDS - Creative Data Solutions
• Genome Analytics, Monsanto, St. Louis, MO
Page 4
Experience: Omic data analysis and visualization
Grapov et al., Circ. Cardiovasc. Genet. 2014
Network Analysis
Multivariate Modeling
Grapov et al., PLoS ONE (2014) doi:10.1371/journal.pone.0084260
J. Proteome Res., 2015, 14 (1), pp 557–566 DOI: 10.1021/pr500782g
Biomarker validation
• Metabolomics can offer real-time insight into treatment efficacy and drive personalized medicine decisions
Page 5
Metabolomics: study of small molecules
Page 6
Metabolome: a proxy for phenotype
Page 7
Challenges for Next-gen Omic Analyses
• Large and complex studies
• Integration of multiple biochemical domains
• Interpretation of experimental results within a biological context
Page 8
Large longitudinal studies may be required to identify small phenotypic and environmental effects
http://teddy.epi.usf.edu/TEDDY/
TEDDY: The Environmental Determinants of Type 1 Diabetes in the Young
Multi-omic longitudinal study involving > 15,000 samples acquired over 3 yrs
Analytical batch effects can hide smaller biological effects
[Figure: signal plotted against time; axis: Time]
Page 9
Data normalization strategies should be considered during experimental design
Analyte specific data quality overview
normalizations can be used to remove analytical variance
[Figure: Raw Data vs. Normalized Data; axes: log mean, %RSD; annotations: low precision, high precision]
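The quality overview above relies on per-analyte %RSD and log mean. The following is a minimal sketch of how those two quantities could be computed from replicate measurements. The analyte names and intensities are hypothetical, and the use of log base 10 is an assumption, since the slide only says "log mean".

```python
import math
import statistics


def percent_rsd(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * statistics.stdev(values) / mean


def quality_overview(replicates):
    """Return {analyte: (log10 mean, %RSD)} for replicate measurements."""
    return {
        name: (math.log10(statistics.fmean(vals)), percent_rsd(vals))
        for name, vals in replicates.items()
    }


# Hypothetical replicate intensities for two analytes (illustration only).
replicates = {
    "analyte_A": [100.0, 102.0, 98.0, 101.0],  # tight replicates: high precision
    "analyte_B": [10.0, 18.0, 6.0, 14.0],      # scattered replicates: low precision
}

for name, (log_mean, rsd) in quality_overview(replicates).items():
    print(f"{name}: log10 mean = {log_mean:.2f}, %RSD = {rsd:.1f}")
```

A lower %RSD indicates higher precision, so comparing this value before and after normalization gives a simple numerical check of whether analytical variance was reduced.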
Page 10
Data normalization may require a combination of approaches
Internal standard (ISTD) based normalization
Retention time of normalized compounds
Number of analytes optimally normalized by each ISTD (qcISTD)
qcISTD: analytical replicate optimize QC selection
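One plausible reading of replicate-guided ISTD selection is sketched below: for each analyte, choose the internal standard whose ratio gives the lowest %RSD across analytical replicates (QCs), and keep the raw signal if no ISTD improves on it. This is an illustrative assumption, not a description of the presenter's implementation; the function names and data are hypothetical.

```python
import statistics


def percent_rsd(values):
    """Relative standard deviation in percent."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def select_istd(analyte_qc, istd_qc):
    """Pick the ISTD that minimizes %RSD of the analyte/ISTD ratio in QCs.

    Returns (istd_name, rsd). istd_name is None when the raw signal
    already has the lowest %RSD, i.e. no ISTD helps.
    """
    best_name, best_rsd = None, percent_rsd(analyte_qc)
    for name, istd_values in istd_qc.items():
        ratios = [a / s for a, s in zip(analyte_qc, istd_values)]
        rsd = percent_rsd(ratios)
        if rsd < best_rsd:
            best_name, best_rsd = name, rsd
    return best_name, best_rsd


def normalize_by_istd(analyte_values, istd_values):
    """Divide each analyte measurement by the matching ISTD measurement."""
    return [a / s for a, s in zip(analyte_values, istd_values)]


# Hypothetical QC replicate data (illustration only).
analyte_qc = [100.0, 120.0, 80.0, 110.0]
istd_qc = {
    "ISTD_1": [10.0, 12.0, 8.0, 11.0],   # tracks the analyte's drift
    "ISTD_2": [10.0, 10.0, 10.0, 10.0],  # constant, so it cannot correct drift
}

name, rsd = select_istd(analyte_qc, istd_qc)
print(f"selected: {name}, QC %RSD after normalization: {rsd:.2f}")
```

Counting how many analytes select each ISTD gives the kind of summary the slide refers to as the number of analytes optimally normalized by each ISTD.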
Page 11
Data normalization may require a combination of approaches
Internal standard (ISTD) based normalization may not fully remove analytical batch effects
Analytical replicate-based normalizations can be used to estimate and remove analytical variance
[Figure: Raw Data vs. Normalized Data for samples and QCs, with LOESS fit]
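A minimal sketch of a QC-based LOESS drift correction follows, assuming a single analyte measured in injection order with interleaved QC replicates. The trend is fitted on QCs only, then every measurement is divided by the trend and rescaled to the QC median. The local linear smoother here is a simplified stand-in written for illustration (tricube weights, no robustness iterations); all names, parameters, and data are hypothetical.

```python
import statistics


def tricube(u):
    """Tricube kernel weight for a scaled distance u >= 0."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_predict(x, y, x0, frac=0.75):
    """Locally weighted linear fit of y on x, evaluated at x0."""
    n = len(x)
    k = max(2, min(n, int(round(frac * n))))
    # Bandwidth: slightly beyond the k-th nearest point so it keeps some weight.
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[k - 1] * 1.0001 + 1e-12
    w = [tricube(abs(xi - x0) / h) for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def qc_loess_normalize(order, values, is_qc, frac=0.75):
    """Remove the QC-estimated trend; uses no knowledge of the samples."""
    qc_x = [o for o, q in zip(order, is_qc) if q]
    qc_y = [v for v, q in zip(values, is_qc) if q]
    target = statistics.median(qc_y)
    return [
        v * target / loess_predict(qc_x, qc_y, o, frac)
        for o, v in zip(order, values)
    ]


# Hypothetical run: a QC every 5th injection, with a 2% per-injection drift.
order = list(range(21))
is_qc = [o % 5 == 0 for o in order]
values = [(100.0 if q else 50.0) * (1 + 0.02 * o) for o, q in zip(order, is_qc)]

normalized = qc_loess_normalize(order, values, is_qc)
print("raw QC spread:", max(v for v, q in zip(values, is_qc) if q)
      - min(v for v, q in zip(values, is_qc) if q))
print("normalized QC spread:", max(v for v, q in zip(normalized, is_qc) if q)
      - min(v for v, q in zip(normalized, is_qc) if q))
```

In practice the span (`frac` here) controls how closely the curve follows the QCs: too small a span can overfit the QCs, which is one way a normalization becomes overtrained.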
Page 12
Quality Control (QC) based normalization
Optimal method should use no sample knowledge
Across-batch performance
Within-batch performance
14,526 measurements of 443 variables acquired over 2 years
Comparison of normalization methods
Raw (RSD ~75)
Normalized (RSD ~25)
Page 13
Normalizations need to be numerically and visually validated
Good
Bad: QCs don’t match samples
Bad: overtrained
Challenge: getting appropriate QCs and implementation of normalizations
Page 14
Identification of systems of changes requires integration of multiple analytical platforms
Am J Clin Nutr. 2015 Aug;102(2):433-43. doi: 10.3945/ajcn.114.103804. Epub 2015 Jul 8.
Page 15
Modern metabolomic analyses often require combinations of multiple measurement platforms
American Journal of Physiology - Endocrinology and Metabolism 2015 Vol. no. , DOI: 10.1152/ajpendo.00019.2015
Page 16
PMID:24204828
2009
~10% variance explained
Many diseases, including aging, have dominant metabolic components (e.g. metabolic syndrome)
Genotype + metabolome >40% variance explained
Type 2 Diabetes
Need for Integromics
Page 17
Omic data integration strategies
Biomarker Insights 2015:Suppl. 4 1-6 DOI: 10.4137/BMI.S29511
Empirical correlation
Network based
Biochemical pathway
Page 19
Metabolomic network analysis
Page 20
MetaMapR: Metabolomic network calculation
http://dgrapov.github.io/MetaMapR/
Page 21
MetaMapR: Metabolomic network calculation
• Biochemical reactions
• Structural similarity
• Mass spectral similarity
• Empirical relationships
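Of the four edge types listed above, empirical relationships are the simplest to illustrate. The sketch below builds a network edge list by thresholding pairwise Pearson correlations between metabolite profiles. It is a generic illustration, not MetaMapR's actual implementation; the metabolite values and the 0.8 threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(data, threshold=0.8):
    """Return (node_a, node_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical metabolite abundances across five samples (illustration only).
data = {
    "glucose": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lactate": [2.0, 4.0, 6.0, 8.0, 10.5],
    "alanine": [5.0, 1.0, 4.0, 2.0, 3.0],
}

for a, b, r in correlation_edges(data):
    print(f"{a} -- {b} (r = {r:.3f})")
```

In a real analysis the threshold would normally be paired with a significance test and multiple-testing correction, since strong correlations arise by chance when many variable pairs are compared.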
Page 22
MetaMapR: Network visualization
Page 23
Omic network analysis
http://kwanjeeraw.github.io/grinn/
Page 24
Network Mapping
Mappings + Network = Mapped Network
Grapov D., American Society of Mass Spectrometry Conference (2013, 2014)
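The idea of combining mappings with a network can be sketched as attaching statistical results to node display attributes. The encoding below (color for direction of a significant change, size for magnitude of fold change) is an assumed convention chosen for illustration, and all names and values are hypothetical.

```python
import math


def map_network(nodes, stats, alpha=0.05):
    """Attach display attributes to nodes based on statistical results.

    Assumed encoding (illustration only):
      color: red = significant increase, blue = significant decrease,
             gray = not significant or no statistics available
      size:  1 + |log2 fold change|
    """
    mapped = {}
    for node in nodes:
        result = stats.get(node)
        if result is None:
            mapped[node] = {"color": "gray", "size": 1.0}
            continue
        if result["p_value"] >= alpha:
            color = "gray"
        elif result["fold_change"] > 1.0:
            color = "red"
        else:
            color = "blue"
        size = 1.0 + abs(math.log2(result["fold_change"]))
        mapped[node] = {"color": color, "size": size}
    return mapped


# Hypothetical network nodes and statistical results.
nodes = ["glucose", "lactate", "alanine"]
stats = {
    "glucose": {"fold_change": 2.0, "p_value": 0.01},
    "lactate": {"fold_change": 0.5, "p_value": 0.20},
}

for node, attrs in map_network(nodes, stats).items():
    print(node, attrs)
```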
Page 25
DeviumWeb: Data analysis and visualization
https://github.com/dgrapov/DeviumWeb
Page 26
DeviumWeb: Interactive visualization
Page 27
DeviumWeb: Statistical Analysis
Page 28
DeviumWeb: Cluster Analysis
Page 29
DeviumWeb: Exploratory Analysis
Page 30
DeviumWeb: Predictive Modeling
Page 31
DeviumWeb: Pathway analysis
Page 32
Thank you:
Metabolomics Society
Dr. Biswapriya Misra
Collaborators
Dr. Johannes Fahrmann
Dr. Kwanjeera Wanichthanarak
Dr. Oliver Fiehn
Dr. Suzanne Miyamoto
David Liesenfeld
Page 33
[email protected]
More information: https://imdevsoftware.wordpress.com/
Software: https://github.com/dgrapov
Hire me: [email protected]