1
Mapping Transcription Mechanisms from Multimodal Genomic Data
Hsun-Hsien Chang, Michael McGeachie, and Marco F. Ramoni
Children’s Hospital Informatics Program
Harvard-MIT Division of Health Sciences and Technology
Harvard Medical School
March 10, 2010
2
Information Flow in Multimodal Genomic Data
• Genetic Variants
  – 100k – 1000k SNPs
  – 250k copy number
3
Expression Quantitative Trait Loci (eQTLs)
• The connection from variant to expression is an information channel
  – A DNA locus that modulates the expression level of a gene is an eQTL
• Cis- (trans-) eQTLs are genetic variants located close to (far away from) the genes they modulate.
• Identifying cis-eQTLs is easier
  – Focusing on cis-eQTLs reduces the search space
  – What about trans-eQTLs?
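The cis/trans distinction can be sketched as a simple distance rule. This is only an illustration, not the authors' procedure: the slides do not say how "close" is defined, so the 1 Mb window, the function name `classify_eqtl`, and the example coordinates are all assumptions.

```python
# Hypothetical cutoff: the slides do not state how "close" is defined.
CIS_WINDOW_BP = 1_000_000


def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end,
                  window=CIS_WINDOW_BP):
    """Return 'cis' if the SNP lies within `window` bp of the gene, else 'trans'."""
    if snp_chrom != gene_chrom:
        return "trans"
    if gene_start - window <= snp_pos <= gene_end + window:
        return "cis"
    return "trans"


# Made-up coordinates, for illustration only.
print(classify_eqtl("21", 1_500_000, "21", 2_000_000, 2_050_000))   # cis
print(classify_eqtl("21", 30_000_000, "21", 2_000_000, 2_050_000))  # trans
```

Restricting the search to cis pairs shrinks the candidate set to SNPs inside the window around each gene, which is why the cis-only search space is so much smaller.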
4
Clinical Study on Pediatric Leukemia
• Cancer involves genetic modification (variants) and cellular malfunction (gene expression).
• Identifying eQTLs helps explain molecular mechanisms in cancer and provides biological insight.
• Clinical study of acute lymphoblastic leukemia (ALL)
  – The most common malignancy in children, accounting for nearly one third of all pediatric cancers.
  – A few cases are associated with inherited genetic syndromes (e.g., Down syndrome, Bloom syndrome, Fanconi anemia), but the cause remains unknown.
• Data
  – 29 patients.
  – Genotyped 100,000 SNPs (Affymetrix Human Mapping 100K).
  – Profiled 50,000 gene expression levels (Affymetrix HG-U133 Plus 2.0).
5
Challenges in Finding eQTLs
• Compare the distribution of each variant to the levels of each expression measurement
  – Computational
    • Testing all pairs of variants vs. expressions is costly
    • Expression levels are usually discretized (Pensa et al., BioKDD, 2004)
  – Multiple testing considerations
• Understanding
  – Too many associations to test via laboratory science
    • Computational methods of biological discovery
    • Want to summarize the main informational (biological) pathways
• Answer: Use transcriptional information
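To make the cost of the all-pairs comparison concrete, the arithmetic can be worked out for the ALL dataset sizes given earlier (100,000 SNPs and 50,000 expression measurements). The Bonferroni correction and the 0.05 significance level are assumptions chosen for illustration; the slides do not say which multiple-testing procedure was considered.

```python
n_snps = 100_000        # SNPs genotyped in the ALL study
n_transcripts = 50_000  # expression measurements profiled in the ALL study

# One association test per (SNP, transcript) pair.
n_tests = n_snps * n_transcripts

# Bonferroni is one common correction; assumed here for illustration only.
alpha = 0.05
per_test_threshold = alpha / n_tests

print(f"pairwise tests: {n_tests:,}")  # pairwise tests: 5,000,000,000
print(f"Bonferroni per-test threshold: {per_test_threshold:.1e}")
```

With only 29 patients, a per-test threshold on the order of 1e-11 is extremely hard to reach, which illustrates the multiple-testing concern on this slide.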
6
Transcriptional Information Channel
[Figure: SNP X → Transcription Channel → transcript Y]
• SNPs (X) are modeled as binomial variables.
• Expressions (Y) are modeled as log-normal variables.
• Information theory measures entropy, H(X).
• Mutual information (MI) quantifies the information flow from X to Y.
• Higher MI is achieved by larger σ² and smaller σ_k², i.e., when expression level Y is more likely modulated by SNP X.
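The MI formula from this slide did not survive extraction, so the sketch below uses a Gaussian approximation that is only consistent with the statement about σ² and σ_k²; it is not necessarily the authors' estimator. It assumes σ² is the overall variance of log-expression, σ_k² is the variance within genotype class k, and the marginal of log-expression is treated as Gaussian, which gives I(X;Y) ≈ ½ Σ_k p_k log(σ² / σ_k²) in nats. The function name `gaussian_mi` and the simulated data are hypothetical.

```python
import math
import random
import statistics


def gaussian_mi(genotypes, log_expr):
    """Approximate I(X;Y) in nats as 0.5 * sum_k p_k * log(var_total / var_k)."""
    n = len(log_expr)
    total_var = statistics.pvariance(log_expr)
    if total_var <= 0.0:
        return 0.0  # constant expression carries no information
    groups = {}
    for g, y in zip(genotypes, log_expr):
        groups.setdefault(g, []).append(y)
    mi = 0.0
    for ys in groups.values():
        p_k = len(ys) / n
        # Floor the class variance to avoid log of infinity for tiny classes.
        var_k = max(statistics.pvariance(ys), 1e-12)
        mi += 0.5 * p_k * math.log(total_var / var_k)
    return mi


# Simulated data (assumed): binary genotype, log-expression on a Gaussian scale.
random.seed(0)
x = [random.randint(0, 1) for _ in range(200)]
y_mod = [random.gauss(2.0 * g, 0.5) for g in x]  # mean shifts with genotype
y_ind = [random.gauss(1.0, 0.5) for _ in x]      # no dependence on genotype

mi_mod = gaussian_mi(x, y_mod)
mi_ind = gaussian_mi(x, y_ind)
print(mi_mod > mi_ind)
```

When the genotype shifts the mean of Y, the overall variance grows relative to the within-class variances and the estimate rises. When Y ignores the genotype, the two variances nearly coincide and the estimate stays close to zero.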
7
• Transcript Y is modulated by SNP X:
• Transcript Y is independent of SNP X:
8
Transcriptional Information Map
[Figure: map linking SNPs (X1, ..., X9) to transcripts (Y1, ..., Y9)]
9
ALL Transcriptional Information Map of Chr21
10
Cluster Genes and SNPs into Networks
[Figure: map of SNPs (X1, ..., X9) and transcripts (Y1, ..., Y9) grouped into clusters]
11
Cluster Genes and SNPs into Networks
[Figure: a cluster containing SNPs X1, X3, X4, X8 and transcripts Y1, Y2, Y9]
• We can further infer the optimal modulation patterns using Bayesian networks.
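One way to picture the grouping step is to treat the map as a bipartite graph and take its connected components as clusters. The slides do not state the actual grouping rule, so connected components are an assumption here, as are the function name `cluster_map` and the edge list (the links in the original figure were not preserved).

```python
from collections import defaultdict


def cluster_map(edges):
    """Group SNPs and transcripts into connected components of the map."""
    adjacency = defaultdict(set)
    for snp, transcript in edges:
        adjacency[snp].add(transcript)
        adjacency[transcript].add(snp)

    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical SNP-transcript links, for illustration only.
edges = [("X1", "Y1"), ("X1", "Y2"), ("X3", "Y2"),
         ("X5", "Y5"), ("X4", "Y9"), ("X4", "Y1")]
print(cluster_map(edges))
# [['X1', 'X3', 'X4', 'Y1', 'Y2', 'Y9'], ['X5', 'Y5']]
```

Each resulting cluster can then be handed to a separate network-inference step, which keeps the per-cluster problem small.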
12
Bayesian Networks
• Bayesian networks are directed acyclic graphs:
  – Nodes correspond to random variables.
  – Directed arcs encode conditional probabilities of the target nodes given the source nodes.
  – p(X) depends on (A, B)
  – p(Z | X, Y) is independent of (A, B)
[Figure: example network with nodes A, B, X, Y, Z]
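The two statements about X and Z can be checked numerically on a toy network. The arcs A → X, B → X, X → Z, Y → Z are an assumed reading of the figure, which was not preserved; the probabilities are random placeholders and the helper names are hypothetical. Under that reading the joint factorizes as p(A) p(B) p(Y) p(X | A, B) p(Z | X, Y).

```python
import itertools
import random

random.seed(1)


def random_cpt(n_parents):
    """Return P(node=1 | parents) for every binary parent configuration."""
    return {cfg: random.uniform(0.1, 0.9)
            for cfg in itertools.product([0, 1], repeat=n_parents)}


# Assumed arcs (figure not preserved): A -> X, B -> X, X -> Z, Y -> Z.
p_a, p_b, p_y = 0.3, 0.6, 0.5  # placeholder marginals for the root nodes
cpt_x = random_cpt(2)          # P(X=1 | A, B)
cpt_z = random_cpt(2)          # P(Z=1 | X, Y)


def bern(p_one, value):
    """Probability of a binary `value` when P(value=1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(a, b, x, y, z):
    """p(A,B,X,Y,Z) = p(A) p(B) p(Y) p(X|A,B) p(Z|X,Y)."""
    return (bern(p_a, a) * bern(p_b, b) * bern(p_y, y)
            * bern(cpt_x[(a, b)], x) * bern(cpt_z[(x, y)], z))


def p_z_given(a, b, x, y):
    """P(Z=1 | A=a, B=b, X=x, Y=y), computed from the joint."""
    num = joint(a, b, x, y, 1)
    return num / (num + joint(a, b, x, y, 0))


# Once X and Y are fixed, changing A or B leaves the distribution of Z unchanged.
for x, y in itertools.product([0, 1], repeat=2):
    values = [p_z_given(a, b, x, y)
              for a, b in itertools.product([0, 1], repeat=2)]
    assert all(abs(v - cpt_z[(x, y)]) < 1e-12 for v in values)
```

The same factorization is what makes per-cluster inference tractable: each node needs a table over its own parents only, not over every variable in the cluster.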
13
Infer Bayesian Networks in Individual Clusters
[Figure: cluster network including transcripts Y1, Y2, Y9]
• Step 1: Use the TIM as the initial network.
• Step 2: Bayesian network learning infers SNP-SNP connections.
14
A Bayesian Network Inferred from Chr21 TIM
15
Information Theoretic Network Analysis
• Find hubs, motifs, guilds, etc.
  – Abstract edges
  – Global patterns -> local patterns
  – Reveal emergent properties
  – Information theoretic approach using data compression
• Alterovitz G, and Ramoni MF, “Discovering biological guilds through topological abstraction,” AMIA Annu Symp Proc, pp. 1-5, 2006.
16
Identified Fundamental Components
Reference: Alterovitz and Ramoni, AMIA Annu Symp Proc, pp. 1-5, 2006.
17
Identification of Cis- and Trans-eQTLs
• RIPK4, 21q22.3
  – Related to Down syndrome
  – RIPK4 has 5 (trans) SNPs in q11.2 (shown in blue in the figure) affecting its expression.
[Figure: TIM with RIPK4 labeled]
Harvard Medical School
Identification of Cis and Trans eQTL• CYYR1, 21q21.1
– Recently discovered. – Encodes a cysteine and
tyrosine-rich protein.– Recent study found a
correlation with neuroendocrine tumors.
– TIM shows CYYR1 modulated by SNPs across the q arm of chromosome 21.
– DSCAM related to Down’s syndrome
– DSCAM-CYYR1 interaction leads to ALL?
DSCAM
19
Complete TIM Algorithm
[Figure: flowchart of the TIM pipeline]
• Input: genetic variants and transcripts
• Compute transcriptional information
• Group linked SNPs and transcripts (Cluster 1, ..., Cluster N)
• Infer network in individual clusters (Cluster 1, ..., Cluster N)
• Network topology analysis and summary
20
Transcriptional Information Maps
• Make large multimodal genetic datasets amenable to transcriptional analysis
• Identify
  – Modulation patterns between genetic variants and transcripts.
  – Cis- and trans-eQTLs.
• Analysis of pediatric ALL helps identify biological hypotheses regarding a connection to Down syndrome
21
Questions?
Thanks to Prof. Marco F. Ramoni, Dr. Hsun-Hsien Chang, Dr. Gil Alterovitz, Children’s Hospital Informatics Program, Brigham and Women’s Hospital