Bayesian Networks as framework for data integration Jun Zhu, Ph. D. Department of Genomics and Genetic Sciences Icahn Institute of Genomics and Multiscale Biology Icahn Medical School at Mount Sinai New York, NY @IcahnInstitute UCLA workshop, July, 2013---Jun Zhu, Ph. D.
Bayesian Networks as framework for data integration
Jun Zhu, Ph. D.
Department of Genomics and Genetic Sciences
Icahn Institute of Genomics and Multiscale Biology
Icahn Medical School at Mount Sinai
New York, NY
@IcahnInstitute UCLA workshop, July, 2013---Jun Zhu, Ph. D.
What are Bayesian networks?
Association vs Causality
From Stephen Friend
A simple biological question: are there causal/reactive relationships?
A Bayesian network approach:
Best model
A Bayesian network approach:
[Figure: the best models over nodes A, B, and C, shown next to their Markov equivalent models.]
A Bayesian network ≠ a causal structure
Markov Equivalent models
[Figure: three Markov equivalent models over nodes A, B, and C.]
B ⊥ C | A
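Why these models cannot be told apart from data alone can be checked mechanically: two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures. The sketch below is a minimal illustration of that test. It assumes the three equivalent models are the fork B ← A → C and the two chains through A; the drawings themselves are not recoverable from the transcript, and the function names are hypothetical.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG: each edge as an unordered pair."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Colliders x -> z <- y in which x and y are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for x, y in combinations(sorted(pars), 2):
            if frozenset((x, y)) not in skel:
                found.add((x, child, y))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures (both DAGs over the same nodes)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


# Assumed structures; each edge is (parent, child).
fork = [("A", "B"), ("A", "C")]        # B <- A -> C
chain_b = [("B", "A"), ("A", "C")]     # B -> A -> C
chain_c = [("C", "A"), ("A", "B")]     # C -> A -> B
collider = [("B", "A"), ("C", "A")]    # B -> A <- C

print(markov_equivalent(fork, chain_b))     # True
print(markov_equivalent(fork, chain_c))     # True
print(markov_equivalent(fork, collider))    # False
```

The fork and both chains all encode B ⊥ C | A, so no score based on the data alone can separate them. The collider implies a different independence pattern and falls outside the equivalence class.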
Bayesian network: how to break Markov equivalence?
Animal model: mouse F2 intercrosses
[Figure: breeding scheme. Diabetes-resistant and diabetes-susceptible F0 strains are crossed (X) to produce F1, then F2.]
General data flow: genetic crosses
[Figure: data flow for a genetic cross. Tissues profiled: liver, brain, muscle, white adipose. Steps shown: genotyping, constructing the genetic map, scanning QTLs, clinical traits, molecular profiling, network reconstruction.]
Causal inference: genetics
Perturbations with a causal anchor
--Natural variation in a segregating population provides the same type of causal anchor
Central Dogma of Biology
• Variation in DNA leads to variation in mRNA
• Variation in mRNA leads to variation in protein, which in turn can lead to disease
[Figure: DNA supporting Gene X, drawn with two alleles (AACAGTT and AACGGTT). One allele is labeled "High expression, alt. splicing, codon change, etc."; the other is labeled "Low expression, no alt. splicing, no codon change, etc."]
Schadt et al. Nature Genetics (2005)
A Bayesian network approach:
Best models Markov Equivalent models
Structure priors based on causality
▶ Estimate confidence of causality
– Bootstrap the samples 200 times
– Compute the fractions of causal, reactive, and independent calls
▶ The pair is independent
▶ The pair is causal/reactive
Zhu et al., PLoS CompBio, 2007
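The bootstrap step can be sketched as follows. This is a simplified illustration, not the test used in the cited papers: it assumes a single locus `locus` and two traits `a` and `b`, scores three linear-Gaussian toy models (causal L → A → B, reactive L → B → A, independent A ← L → B) by maximized log-likelihood, and reports how often each model wins across resamples. All function names and the simulated data are hypothetical.

```python
import numpy as np


def gaussian_loglik(y, x):
    """Maximized log-likelihood of the linear-Gaussian model y ~ 1 + x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / len(y)
    return -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1.0)


def call_relationship(locus, a, b):
    """Pick the best of three toy models (assumed, simplified test).

    Each model has the same number of parameters, so comparing raw
    log-likelihoods is equivalent to comparing BIC.
    """
    scores = {
        # L -> A -> B
        "causal": gaussian_loglik(a, locus) + gaussian_loglik(b, a),
        # L -> B -> A
        "reactive": gaussian_loglik(b, locus) + gaussian_loglik(a, b),
        # A <- L -> B
        "independent": gaussian_loglik(a, locus) + gaussian_loglik(b, locus),
    }
    return max(scores, key=scores.get)


def bootstrap_call_fractions(locus, a, b, n_boot=200, seed=0):
    """Fractions of causal, reactive, and independent calls over resamples."""
    rng = np.random.default_rng(seed)
    locus, a, b = (np.asarray(v, dtype=float) for v in (locus, a, b))
    n = len(locus)
    counts = {"causal": 0, "reactive": 0, "independent": 0}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample individuals with replacement
        counts[call_relationship(locus[idx], a[idx], b[idx])] += 1
    return {name: count / n_boot for name, count in counts.items()}


# Simulated F2-like data in which the truth is L -> A -> B (illustrative only).
rng = np.random.default_rng(42)
genotype = rng.choice([0, 1, 2], size=500, p=[0.25, 0.5, 0.25]).astype(float)
trait_a = 2.0 * genotype + rng.normal(size=500)
trait_b = trait_a + rng.normal(size=500)
print(bootstrap_call_fractions(genotype, trait_a, trait_b))
```

Fractions like these could then serve as a confidence measure for a pair, for example to favor the direction A → B in a structure prior when the causal fraction is high. How the fractions map onto a prior is a modeling choice that the slide does not specify.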
Bayesian network: integrating genetic data
• Genetic data give the Bayesian network a sense of causality
• How much improvement is achieved by integrating genetic data?
Bayesian Network: a simulation study
Zhu et al., PLoS CompBio, 2007
Bayesian network: genetic information is critical when sample size is small
The largest improvement in recall occurs at smaller sample sizes
Zhu et al., PLoS CompBio, 2007
Bayesian network: integrating genetic data
[Figure: genetic loci L1, L2, ..., Ln-1, Ln and genes G1, G2, ..., Gn-1, Gn, with a locus Lj and gene Gj highlighted. Edge labels: cis-regulation, trans-regulation, transcriptional regulation.]
Bayesian network: why do samples matter?
[Figure: recall and precision for weak signals and strong signals, each at 300 samples and 900 samples.]
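Recall and precision for a reconstructed network can be computed by comparing its edges with those of the true (simulated) network. The sketch below is one minimal way to do this, assuming edges are directed (parent, child) pairs and that a reversed edge counts as an error. The function name and the toy edge lists are hypothetical and are not taken from the simulation study.

```python
def edge_precision_recall(predicted_edges, true_edges):
    """Precision and recall of predicted directed edges against the truth."""
    predicted = set(predicted_edges)
    truth = set(true_edges)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy example: four true edges, three predicted, one of them reversed.
true_net = [("L1", "G1"), ("G1", "G2"), ("G2", "G3"), ("G1", "G4")]
predicted_net = [("L1", "G1"), ("G1", "G2"), ("G3", "G2")]

precision, recall = edge_precision_recall(predicted_net, true_net)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.67, recall=0.50
```

Scoring undirected edges instead would credit the reversed edge G3 → G2, so the choice between directed and undirected comparison should be stated whenever such numbers are reported.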
Bayesian network: integrating genetics
Experimental Hsd11b1 signature: mice treated with an Hsd1 inhibitor
Predicted Hsd1 signatures based on BxD data:
• Correlation to Hsd1: 10% of the predicted signature overlaps with the experimental one
• BN without genetics: 20% of the predicted signature overlaps with the experimental one
• BN with genetics: 52% of the predicted signature overlaps with the experimental one
• The activity of Leu3p is positively regulated by alpha-isopropylmalate (IPM), the product of the first step in leucine biosynthesis
Sze JY, et al. (1992) In vitro transcriptional activation by a metabolic intermediate: activation by Leu3 depends on alpha-isopropylmalate. Science 258(5085):1143-5
• The degree of activation by Leu3p depends on Leu3p concentration, and LEU3 gene expression has been shown to be regulated by general amino acid control, which is mediated by the GCN4 transcription factor
Zhou K, et al. (1987) Structure of yeast regulatory gene LEU3 and evidence that LEU3 itself is under general amino acid control. Nucleic Acids Res 15(13):5261-73