Predicting Cancer Phenotypes from Somatic Genomic Alterations via Genomic Impact Transformer
Yifeng Tao¹, Chunhui Cai², William W. Cohen¹,*, Xinghua Lu²,³,*
¹ School of Computer Science, Carnegie Mellon University
² Department of Biomedical Informatics, University of Pittsburgh
³ Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh

Predicting Cancer Phenotypes from Somatic Genomic ...yifengt/paper/tao2020git-slides-v1.0.pdf · Predicting Cancer Phenotypes from Somatic Genomic Alterations via Genomic Impact Transformer

Jul 19, 2020


Tumor origin and progression

• Cancers are mainly caused by somatic genomic alterations (SGAs)
  • Driver SGAs (~10s/tumor): promote tumor progression
  • Passenger SGAs (~100s/tumor): neutral mutations
• How to distinguish drivers from passengers?

S Nik-Zainal et al. 2017


Cancer drivers

• How to distinguish drivers from passengers?
  • Frequency: recurrent mutations are more likely to be drivers
  • Conserved domain: protein function significantly disturbed

• All unsupervised. But drivers are defined as mutations that promote tumor development…

B Vogelstein et al. 2013; ND Dees et al. 2012; MS Lawrence et al. 2013

B Reva et al. 2011; B Niu et al. 2016


Cancer drivers

• Identify driver SGAs with supervision of downstream phenotypes
  • Change of RNA expression
  • Differentially expressed genes (DEGs)

• Candidate models
  • Bayesian model (C Cai et al. 2019)
  • Lasso/Elastic net (R Tibshirani 1994)
  • Multi-layer perceptrons (MLPs) (F Rosenblatt 1958)

• Can a model do both prediction and driver detection?

Model (?) that predicts DEGs accurately & identifies driver SGAs


Self-attention mechanism

• Can a model do both prediction and driver detection?

• Attention mechanism
  • Initially in CV (K Xu et al. 2015) / NLP (A Vaswani et al. 2017)
  • Better interpretability
  • Improves performance

• Self-attention mechanism (Z Yang et al. 2016)

• Contextual deep learning framework: weights determined by all the input mutations

Model with self-attention that predicts DEGs accurately & identifies driver SGAs

Attention weights: $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6$, with $\sum_i \alpha_i = 1$
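A softmax is one common way to obtain attention weights that are positive and sum to 1. The sketch below is only an illustration under that assumption: the raw scores are made-up numbers, and the gene names are used purely as labels. It shows the "contextual" property: the same raw score for TP53 yields a different weight depending on which other alterations are present in the tumor.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw attention scores for the SGAs in two tumors.
tumor_a = {"TP53": 2.0, "PIK3CA": 1.0, "TTN": -1.0}
tumor_b = {"TP53": 2.0, "KRAS": 3.0, "APC": 2.5, "TTN": -1.0}

for tumor in (tumor_a, tumor_b):
    weights = softmax(list(tumor.values()))
    for gene, weight in zip(tumor, weights):
        print(f"{gene}: {weight:.3f}")
    print("sum =", round(sum(weights), 6))
```

TP53 receives a larger share of the attention in the first tumor than in the second, even though its raw score is identical, because the normalization runs over all input mutations.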


Genomic impact transformer (GIT)

• Transformer: encoder-decoder architecture
  • Encoder: self-attention mechanism
  • Decoder: MLP
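The decoder half can be pictured as a small feed-forward network that maps a tumor embedding to one probability per candidate DEG. The sketch below is a minimal illustration, not the released model: the single hidden layer, the ReLU and sigmoid activations, the layer sizes, and the random weights are all assumptions made for the example.

```python
import math
import random

def mlp_decoder(tumor_embedding, w_hidden, w_out):
    """Map a tumor embedding to one probability per candidate DEG."""
    # Hidden layer with ReLU activation (an assumed choice).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, tumor_embedding)))
              for row in w_hidden]
    # Output layer: one logit per gene, squashed into (0, 1) by a sigmoid.
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

random.seed(0)
embed_dim, hidden_dim, n_degs = 4, 8, 3  # hypothetical sizes
tumor_embedding = [random.gauss(0, 1) for _ in range(embed_dim)]
w_hidden = [[random.gauss(0, 0.5) for _ in range(embed_dim)]
            for _ in range(hidden_dim)]
w_out = [[random.gauss(0, 0.5) for _ in range(hidden_dim)]
         for _ in range(n_degs)]

probs = mlp_decoder(tumor_embedding, w_hidden, w_out)
predicted_degs = [p > 0.5 for p in probs]  # threshold each gene independently
print(probs, predicted_degs)
```

Each output is treated as an independent yes/no call, which matches the framing of DEG prediction as a multi-label task.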


Encoder: Multi-head self-attention

• Tumor embedding is the weighted sum of gene embeddings:

• Weights determined by input gene embeddings:
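As a rough illustration of an attention-weighted sum, the sketch below assumes a common additive-attention form: each head scores a gene as `theta · tanh(W0 · e_g)`, a softmax over the genes of the tumor turns scores into weights, and the per-head weights are summed per gene. The names `w0` and `thetas`, the dimensions, and the random values are hypothetical; the exact parameterization used in GIT should be checked against the paper or the released code.

```python
import math
import random

def softmax(scores):
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def tumor_embedding(gene_embeddings, w0, thetas):
    """Return (embedding, alpha): the weighted sum and the per-gene weights.

    gene_embeddings: m vectors of length d, one per altered gene
    w0:              hidden x d projection (assumed shared across heads)
    thetas:          h head vectors of length hidden
    """
    projected = [[math.tanh(x) for x in matvec(w0, e)] for e in gene_embeddings]
    m, d = len(gene_embeddings), len(gene_embeddings[0])
    alpha = [0.0] * m
    for theta in thetas:
        scores = [sum(t * p for t, p in zip(theta, proj)) for proj in projected]
        for g, weight in enumerate(softmax(scores)):
            alpha[g] += weight  # sum the per-head weights for each gene
    embedding = [sum(alpha[g] * gene_embeddings[g][k] for g in range(m))
                 for k in range(d)]
    return embedding, alpha

random.seed(0)
d, hidden, heads, m = 4, 3, 2, 5  # hypothetical sizes
genes = [[random.gauss(0, 1) for _ in range(d)] for _ in range(m)]
w0 = [[random.gauss(0, 1) for _ in range(d)] for _ in range(hidden)]
thetas = [[random.gauss(0, 1) for _ in range(hidden)] for _ in range(heads)]

emb, alpha = tumor_embedding(genes, w0, thetas)
print(emb)
print(alpha)
```

With this summing convention the weights add up to the number of heads rather than to 1; dividing by `heads` would give a normalized variant. Genes with the largest `alpha` are the ones the model relies on most, which is the sense in which attention can flag candidate drivers.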


Pre-training gene embedding: Gene2Vec

• Co-occurrence pattern (e.g., mutually exclusive alterations)

[Figure: gene g and context gene c; genes grouped into Pathway 1, Pathway 2, and Pathway 3]

MD Leiserson et al. 2015; T Mikolov et al. 2013
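One way to picture the pre-training input is as word2vec-style (gene, context) pairs, where the "sentence" is the set of genes altered in one tumor. The sketch below builds such pairs from a made-up three-tumor cohort; the cohort, the pairing scheme, and the helper names are assumptions for illustration and not the Gene2Vec implementation.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy cohort: each tumor is the set of genes altered in it.
tumors = [
    {"TP53", "PIK3CA"},
    {"TP53", "PTEN"},
    {"KRAS", "APC", "TP53"},
]

def cooccurrence_pairs(tumors):
    """Yield every ordered (gene, context) pair of distinct genes per tumor."""
    for sgas in tumors:
        yield from permutations(sorted(sgas), 2)

def contexts(gene, pair_counts):
    """Collect the context genes observed together with `gene`."""
    return {ctx for (g, ctx) in pair_counts if g == gene}

pair_counts = Counter(cooccurrence_pairs(tumors))
print(pair_counts[("TP53", "PIK3CA")])   # co-occur in one toy tumor
print(pair_counts[("PIK3CA", "PTEN")])   # never co-occur in the toy cohort
print(contexts("PIK3CA", pair_counts), contexts("PTEN", pair_counts))
```

In this toy data PIK3CA and PTEN are mutually exclusive, yet they share the same context gene, so a skip-gram style objective would tend to place them close together.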


Improved performance in predicting DEGs

• Predicting DEGs from SGAs
  • Conventional models
  • Ablation studies

[Figure: F1 score (axis ticks 51 to 63) and accuracy (axis ticks 73 to 79) of the compared models]
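For reference, the two reported metrics can be computed from binary DEG calls as in the sketch below. The labels are made up, and pooling all calls into a single list is an assumption; the slides do not say how scores were aggregated across genes and tumors.

```python
def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for binary labels (1 = DEG, 0 = not DEG)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1

# Hypothetical ground truth and predictions for eight gene-tumor pairs.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

F1 ignores true negatives, so it is the more informative of the two when most genes are not differentially expressed.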


Candidate drivers via attention mechanism


Gene embedding space

• Functionally similar genes are close in gene embedding space
  • Qualitatively and quantitatively (i.e., GO enrichment, NN accuracy)


Tumor embedding: Survival analysis

• Tumor embeddings reveal distinct survival profiles


Tumor embedding: Drug response

• Tumor embeddings are predictive of drug response


Conclusions and future work

• Biologically inspired neural network framework
  • Identifying cancer drivers with supervision of DEGs
  • Accurate prediction of DEGs from mutations

• Side products
  • Gene embedding: informative of gene functions
  • Tumor embedding: transferable to other phenotype prediction tasks

• Code and pretrained gene embedding: https://github.com/yifengtao/genome-transformer

• Future work
  • Fine-grained embedding representation at the codon level
  • Tumor evolutionary features, e.g., hypermutability, intra-tumor heterogeneity


Acknowledgments

• Dr. Xinghua Lu
• Dr. William W. Cohen
• Dr. Chunhui Cai
• Michael Q. Ding
• Yifan Xue


Quantitative measurement of gene embeddings

• Functionally similar genes → closer in embedding space
  • GO enrichment:
  • NN accuracy:

[Figure: NN accuracy (%) for random pairs, Gene2Vec, and Gene2Vec+GIT; axis ticks 4 to 11]
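A nearest-neighbor (NN) accuracy of this kind can be sketched as follows, assuming it means the fraction of genes whose closest neighbor in embedding space shares at least one functional annotation. That definition, the cosine similarity measure, the gene names, and the annotation labels are all assumptions made for this example.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nn_accuracy(embeddings, annotations):
    """Fraction of genes whose nearest neighbor shares an annotation."""
    hits = 0
    for gene, vec in embeddings.items():
        others = [g for g in embeddings if g != gene]
        nearest = max(others, key=lambda g: cosine(vec, embeddings[g]))
        if annotations[gene] & annotations[nearest]:
            hits += 1
    return hits / len(embeddings)

# Hypothetical 2-D embeddings and placeholder annotation sets.
embeddings = {
    "geneA": [1.0, 0.1],
    "geneB": [0.9, 0.2],
    "geneC": [0.1, 1.0],
    "geneD": [0.2, 0.9],
}
annotations = {
    "geneA": {"term_1"},
    "geneB": {"term_1"},
    "geneC": {"term_2"},
    "geneD": {"term_1"},
}
print(nn_accuracy(embeddings, annotations))
```

Here geneA and geneB are mutual nearest neighbors with a shared term, while geneC and geneD are close in space but share no term, so half of the genes count as hits.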


Tumor embedding space


Gene2Vec algorithm


Gene2Vec: Co-occurrence patterns

• Co-occurrence does not necessarily mean similar embeddings
  • Ex 1: two cats sit there .
  • Ex 2: two cats stand there .
  • Ex 3: two dogs sit there .

[Figure: words grouped by role. Pathway 1 (number): one, two, several. Pathway 2 (noun): cat, dog. Pathway 3 (verb): stand, sit, lie.]

MD Leiserson et al. 2015; T Mikolov et al. 2013
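The three example sentences can be checked directly. The sketch below collects the neighbors of each word within a window of one token (the window size and helper names are assumptions) and compares them with plain co-occurrence in a sentence.

```python
sentences = [
    "two cats sit there .",
    "two cats stand there .",
    "two dogs sit there .",
]

def contexts(word, sentences, window=1):
    """Collect tokens appearing within `window` positions of `word`."""
    found = set()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == word:
                lo = max(0, i - window)
                found.update(tokens[lo:i] + tokens[i + 1:i + 1 + window])
    return found

def cooccur(a, b, sentences):
    """True if both words appear in at least one common sentence."""
    return any(a in s.split() and b in s.split() for s in sentences)

print(cooccur("cats", "dogs", sentences))
print(contexts("cats", sentences) & contexts("dogs", sentences))
print(cooccur("two", "cats", sentences))
print(contexts("two", sentences) & contexts("cats", sentences))
```

"cats" and "dogs" never appear in the same sentence but share the contexts "two" and "sit", so a skip-gram model would embed them close together. "two" and "cats" co-occur, yet with this window they share no context words, which is the point of the slide: similar embeddings come from shared contexts, not from co-occurrence itself.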