IDENTIFYING CAUSAL GENES AND DYSREGULATED PATHWAYS IN COMPLEX DISEASES Nov. 6 th , 2010 YOO-AH KIM NIH / NLM / NCBI

Page 1:

IDENTIFYING CAUSAL GENES AND DYSREGULATED PATHWAYS IN COMPLEX DISEASES

Nov. 6th, 2010

YOO-AH KIM, NIH / NLM / NCBI

Page 2:

Complex Diseases

Associated with the effects of multiple genes
• As opposed to single-gene diseases

The combination of genomic alterations may vary strongly among different patients

Dysregulating the same components, thus often leading to the same disease phenotype

Difficult to study and treat
• Cancer, heart disease, diabetes, etc.

Page 3:

Copy Number Variations

Two copies of each gene are generally assumed to be present in a genome

Genomic regions may be deleted or duplicated, causing copy number variations (CNVs)

Some CNVs are associated with susceptibility or resistance to diseases such as cancer

[Figure: copy number variations in 158 glioblastoma patients]

Page 4:

Identifying Genomic Causes in Complex Diseases

Identify genotypic causes in individual patients as well as dysregulated pathways

Systems biology approach
• Genome-wide search
• Graph theoretic algorithms
  • Circuit flow
  • Set cover

158 Glioblastoma multiforme patients

Page 5:

Glioblastoma multiforme (GBM)

the most common and most aggressive type of primary brain tumor in humans

Page 6:

Expression as Quantitative Trait

Genotype: copy number variations

Phenotype: gene expression

Page 7:

eQTL (expression Quantitative Trait Loci) Analysis

While we assume that the genetic variation is the cause and the expression change is the effect, we don't know the molecular pathways behind the relationship

[Figure labels: putative target gene; putative causal gene/loci]

Page 8:

Method Outline

A. Target gene selection
• Gene expression

B. eQTL
• Find association between expression and copy number

C. Circuit flow algorithm
• Molecular interactions
• Candidate causal genes

D. Causal gene selection
• Weighted multiset cover

[Figure: method overview, panels A to D. Panel labels: cases by target genes (g1, g2, g3, ..., gm); cases by tag loci (s1, s2, s3, s4, ..., sn); cases by causal genes; circuit (+ -) between targetGene gm and tagSNP sn through causal genes, over TF-DNA, phosphorylation-event, and protein-protein interactions]

Page 9:

Target Gene Selection

Select a representative set of disease genes
• Filter differentially expressed genes for each case
• Multi-set cover

[Figure: gene expression matrix with rows Gene 1, Gene 2, Gene 3, ... and columns grouped into controls and disease cases]
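
The per-case filter could be sketched as follows. The slides do not say how differential expression is called, so the z-score against controls and the cutoff of 2.0 are assumptions made only for illustration, and `differentially_expressed` is a hypothetical helper name.

```python
from statistics import mean, stdev

def differentially_expressed(controls, cases, z_cutoff=2.0):
    """For each disease case, return the set of genes whose expression
    deviates from the control mean by more than z_cutoff control standard
    deviations (an assumed criterion, not one stated in the slides)."""
    stats = {g: (mean(vals), stdev(vals)) for g, vals in controls.items()}
    n_cases = len(next(iter(cases.values())))
    per_case = []
    for i in range(n_cases):
        hits = set()
        for g, (mu, sd) in stats.items():
            if sd > 0 and abs(cases[g][i] - mu) / sd > z_cutoff:
                hits.add(g)
        per_case.append(hits)
    return per_case

# Toy data: gene -> expression values across controls / across cases.
controls = {"gene1": [1.0, 1.2, 0.8, 1.1], "gene2": [5.0, 5.1, 4.9, 5.0]}
cases = {"gene1": [1.0, 3.0], "gene2": [5.0, 7.0]}
per_case = differentially_expressed(controls, cases)
```

The resulting per-case gene sets are the kind of input a multi-set cover step would then reduce to a representative set.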

Page 10:

eQTL

Associations between the expression of target genes and copy number variations of genomic loci
• Linear regression
• For every pair of tag loci and target genes

[Figure: two matrices, cases by target genes and cases by tag loci]

Page 11:

Finding Candidate Causal Genes

[Figure: genotypic variations and target genes]

Page 12:

Finding Candidate Causal Genes

[Figure: genotypic variations, candidate genes C1 to C5, and target genes; how they connect is still open (?)]

Page 13:

Finding Candidate Causal Genes

[Figure: interaction network linking genotypic variations, candidate genes C1 to C5, and gene D among the target genes]

Interaction network:
• protein-protein interactions
• phosphorylation events
• transcription factor interactions

Page 14:

Finding Candidate Causal Genes

[Figure: current flow (+ -) through the interaction network linking genotypic variations, candidate genes C1 to C5, and gene D, with one edge (u, v) highlighted]

Resistance of edge (u, v) is set to be inversely proportional to (|corr(expr(u), expr(D))| + |corr(expr(v), expr(D))|) / 2
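
The resistance rule can be sketched as below. It assumes Pearson correlation, a proportionality constant of 1, and a small `eps` floor to avoid division by zero; none of these are specified on the slide, and `edge_resistance` is a hypothetical name.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def edge_resistance(expr_u, expr_v, expr_d, eps=1e-6):
    """Resistance of edge (u, v) with respect to gene D: the reciprocal of
    the mean absolute correlation of u and v with D (constant assumed 1)."""
    score = (abs(pearson(expr_u, expr_d)) + abs(pearson(expr_v, expr_d))) / 2
    return 1.0 / max(score, eps)

# Toy profiles across four cases.
expr_d = [1.0, 2.0, 3.0, 4.0]
expr_u = [2.0, 4.0, 6.0, 8.0]    # perfectly correlated with D
expr_v = [4.0, 3.0, 2.0, 1.0]    # perfectly anti-correlated with D
expr_w = [1.0, -1.0, -1.0, 1.0]  # uncorrelated with D
r_uv = edge_resistance(expr_u, expr_v, expr_d)
r_uw = edge_resistance(expr_u, expr_w, expr_d)
```

Edges whose endpoints track the expression of D get low resistance, so current prefers paths through genes that behave like D.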

Page 15:

Finding Candidate Causal Genes

[Figure: current flow (+ -) through the interaction network linking genotypic variations, candidate genes C1 to C5, and gene D]

Compute the amount of current entering each causal gene by solving a system of linear equations
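
One way to set up that linear system is Kirchhoff's current law with one grounded node, sketched here in plain Python. The single source, single sink, unit current, and the names `solve_linear` and `node_inflow` are assumptions for illustration; the slides do not show the actual formulation.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def node_inflow(resistance, source, sink, total_current=1.0):
    """resistance: {(u, v): ohms} for undirected edges of a connected graph.
    Injects total_current at source, grounds sink, and returns the current
    entering each node."""
    nodes = sorted({n for edge in resistance for n in edge})
    free = [n for n in nodes if n != sink]  # sink potential is fixed at 0
    idx = {n: i for i, n in enumerate(free)}
    a = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    b[idx[source]] = total_current
    for (u, v), r in resistance.items():
        g = 1.0 / r  # conductance
        for p, q in ((u, v), (v, u)):
            if p in idx:
                a[idx[p]][idx[p]] += g
                if q in idx:
                    a[idx[p]][idx[q]] -= g
    volts = dict(zip(free, solve_linear(a, b)))
    volts[sink] = 0.0
    inflow = {n: 0.0 for n in nodes}
    for (u, v), r in resistance.items():
        i_uv = (volts[u] - volts[v]) / r  # positive means u -> v
        if i_uv > 0:
            inflow[v] += i_uv
        else:
            inflow[u] -= i_uv
    return inflow

# Toy network: two parallel paths from D to S, through C1 and through C2.
resistance = {("D", "C1"): 1.0, ("C1", "S"): 1.0,
              ("D", "C2"): 2.0, ("C2", "S"): 2.0}
inflow = node_inflow(resistance, source="D", sink="S")
```

In the toy network the path through C1 has half the resistance of the path through C2, so C1 receives two thirds of the current and C2 one third.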


Page 17:

Final Causal Gene Selection

A putative causal gene explains a disease case if
• its corresponding tag locus has a copy number alteration
• its affected target genes (i.e., genes sending a significant amount of current to the causal gene) are differentially expressed in the disease case

[Figure: matrix of cases by causal genes]
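
The two conditions can be read as a simple set computation, sketched below. The data layout and the `min_targets` parameter (how many affected target genes must be differentially expressed) are assumptions; the slide does not give a count, and `explained_cases` is a hypothetical name.

```python
def explained_cases(gene, altered_cases, affected_targets, diff_expressed,
                    min_targets=1):
    """Cases explained by `gene`: its tag locus is copy-number altered in
    the case, and at least min_targets of its affected target genes are
    differentially expressed there (min_targets is an assumed parameter)."""
    explained = set()
    targets = affected_targets.get(gene, set())
    for case in altered_cases.get(gene, set()):
        if len(targets & diff_expressed.get(case, set())) >= min_targets:
            explained.add(case)
    return explained

# Toy inputs with placeholder gene and case names.
altered_cases = {"C1": {"case1", "case2"}, "C2": {"case2", "case3"}}
affected_targets = {"C1": {"g1", "g2"}, "C2": {"g3"}}
diff_expressed = {"case1": {"g1"}, "case2": {"g3"}, "case3": {"g2", "g3"}}
covers = {g: explained_cases(g, altered_cases, affected_targets, diff_expressed)
          for g in altered_cases}
```

Here C1 explains only case1, because in case2 its locus is altered but none of its target genes are differentially expressed.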


Page 19:

Final Causal Gene Selection

[Figure: matrix of cases by causal genes with the same two criteria as on Page 17, now labeled WEIGHT]

Page 20:

Final Causal Gene Selection

Find a smallest set of genes covering (almost) all cases at least k' times: a minimum weighted multi-set cover
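
Minimum weighted multi-set cover is usually approached with a greedy heuristic, sketched below. This is a standard approximation swapped in for illustration, not necessarily the solver used in the talk; the per-gene `cost`, the `max_uncovered` allowance for "almost all", and the function name are assumptions.

```python
def greedy_multiset_cover(covers, cost, k=1, max_uncovered=0):
    """covers: gene -> set of cases it explains; cost: gene -> positive weight.
    Greedily pick the gene with the most still-needed coverage per unit
    cost until every case is covered k times, allowing up to max_uncovered
    cases to stay under-covered."""
    need = {c: k for cases in covers.values() for c in cases}
    remaining = dict(covers)
    chosen = []
    while remaining and sum(1 for n in need.values() if n > 0) > max_uncovered:
        def gain(g):
            return sum(1 for c in remaining[g] if need[c] > 0) / cost[g]
        best = max(sorted(remaining), key=gain)  # ties broken by gene name
        if gain(best) == 0:
            break  # no remaining gene helps; some cases cannot reach k
        for c in remaining.pop(best):
            if need[c] > 0:
                need[c] -= 1
        chosen.append(best)
    return chosen

# Toy instance with placeholder genes and integer case ids.
covers = {"C1": {1, 2, 3}, "C2": {1, 2}, "C3": {3, 4}, "C4": {4}}
cost = {"C1": 1.0, "C2": 1.0, "C3": 1.0, "C4": 1.0}
chosen = greedy_multiset_cover(covers, cost, k=1)
```

Raising `k` forces each case to be explained by several genes, which is why the cover is a multi-set cover rather than a plain set cover.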

Page 21:

Dysregulated Pathways

Causal path between a target gene and a causal gene: a maximum current path

[Figure: network with candidate genes C1 to C5 and gene D]
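
The slide does not define "maximum current path" precisely. One plausible reading, used in this sketch as an assumption, is to start at one endpoint and repeatedly follow the outgoing edge that carries the most current. `max_current_path` and the directed `edge_current` layout are hypothetical.

```python
def max_current_path(edge_current, source, sink):
    """edge_current: {(u, v): amps}, with current flowing from u to v.
    Greedily follows the outgoing edge with the largest current from
    source until sink is reached; returns None at a dead end."""
    outgoing = {}
    for (u, v), amps in edge_current.items():
        if amps > 0:
            outgoing.setdefault(u, []).append((amps, v))
    path, node, seen = [source], source, {source}
    while node != sink:
        candidates = [(a, v) for a, v in outgoing.get(node, []) if v not in seen]
        if not candidates:
            return None
        _, node = max(candidates)
        path.append(node)
        seen.add(node)
    return path

# Toy currents on two parallel paths from D to S.
edge_current = {("D", "C1"): 2 / 3, ("C1", "S"): 2 / 3,
                ("D", "C2"): 1 / 3, ("C2", "S"): 1 / 3}
path = max_current_path(edge_current, "D", "S")
```

Other readings, such as the path that maximizes the smallest current along it, would need a different search.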

Page 22:

Selected Causal Genes

Step                    Number of genes   Overlap with GBM genes
Step B: eQTL            16056             0.56 (75)
Step C: Circuit flow    701               0.045 (10)
Step D: Set cover       128               4.7 × 10^-4 (6)

Page 23:

Results

128 causal genes from set cover (STEP D)

701 candidate causal genes from circuit flow algorithm (STEP C)

Page 24:

Causal Genes

BSOSC Review, November 2008

Pathway                 P-value   Genes
Glioma                  0.008     PRKCA, EGFR, AKT1, CDKN2A, CAMK2G, TP53, RB1, PTEN
Cell cycle              0.028     MCM7, CDKN2A, CDC2, TP53, ORC5L, RB1, ATR, BUB3, CUL1
p53 signaling pathway   0.030     CDKN2A, CDC2, TP53, ATR, FAS, THBS1, PTEN
Proteasome              0.026     PSMA1, PSMC6, PSMB1, PSMC3, PSMA5, PSMA4

Functional analysis using DAVID

The selected causal gene set includes many known cancer-implicated genes

Page 25:

PTEN as causal gene

[Figure: network around PTEN as causal gene. Legend: fold change (- 0 +); TF-DNA; protein-protein; kinase; TF; causal genes]

Page 26:

EGFR as causal and target gene

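[Figure: network with EGFR appearing both as causal gene (Causal EGFR) and as target gene (Target EGFR). Legend: fold change (- 0 +); kinase; TF; causal genes; TF-DNA; protein-protein; phosphorylation]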

Page 27:

Conclusion

A novel computational method to simultaneously identify causal genes and dysregulated pathways
• Circuit flow algorithm
• Multi-set cover

Augmenting eQTL evidence with interaction information resulted in a very powerful approach to uncover potential causal genes as well as intermediate nodes on molecular pathways

Our method can be applied to any disease system where genetic variations play a fundamental causal role

Page 28:

Acknowledgements

Teresa M. Przytycka, Stefan Wuchty

Other group members: Dong Yeon Cho, Yang Huang, Damian Wojtowicz, Jie Zheng

Page 31:

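EGFR as causal and target gene: causal paths

[Figure: causal paths with EGFR appearing both as causal gene (causal EGFR) and as target gene (target EGFR). Legend: fold change (- 0 +); kinase; TF; causal genes; TF-DNA; protein-protein; phosphorylation]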

Page 32:

PTEN as causal gene: causal paths

[Figure: causal paths around PTEN as causal gene. Legend: fold change (- 0 +); TF-DNA; protein-protein; kinase; TF; causal genes]

Page 33:

Our Method

Integrate several types of data
• Gene expression
• Copy number variations
• Molecular interactions

Page 34:

Methods and Results

Method
• Model the expression change of disease genes as a function of genomic alterations
• Translate the propagation of information from a potential causal gene to a disease gene as the flow of electric current through a network of molecular interactions
• Multi-set cover: select the most prominent genes

Validated our approach by testing the enrichment of selected causal genes with known GBM/glioma-related genes

[Figure: circuit (+ -) between disease gene gm and tagSNP sn through causal genes]