Lecture 11. Topics in Omic Studies (Cancer Genomics, Transcriptomics and Epigenomics)
The Chinese University of Hong Kong
CSCI5050 Bioinformatics and Computational Biology
Jan 18, 2016
CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Fall 2015 2
Lecture outline
1. Special considerations in cancer omics
Last update: 13-Nov-2015
SPECIAL CONSIDERATIONS IN CANCER OMICS
Part 1
Some considerations
• Large number of mutations
  – Structural variations
  – Driver vs. passenger mutations
  – Tumor heterogeneity
• Mixture of tumor and non-tumor cells
• Emphasis on somatic changes
  – Choice of control samples
• Presence of cancer sub-types
• Search for druggable targets
Large number of mutations
• Causes:
  – Carcinogens (polycyclic aromatic hydrocarbons (PAH) in cigarette smoke, UV, etc.)
  – Defects in DNA repair
  – Disrupted apoptosis pathway
Image credit: Brown and Attardi, Nature Reviews Cancer 5(3):231-237, (2005)
Structural variations
• In many types of omic studies, structural variations (SVs) are considered rare.
• In cancer omic studies, the detection of SVs is considered an indispensable step.
Driver vs. passenger mutations
• Driver mutation: Causal mutation in oncogenesis
  – Growth advantage
  – Positively selected
• Passenger mutation: Does not contribute to cancer development
Image credit: Stratton et al., Nature 458(7239):719-724, (2009)
Detection of driver mutations
• Mutations that affect known cancer genes
• Unexpectedly high frequency of recurrence
  – Same mutation in different cells in the same sample
    • Detected by allele ratio
    • Implication of early event and positive selection
  – Mutations that affect the same genes/pathways in different samples
  – Statistical significance needs to be carefully evaluated according to the non-uniform background [discussion paper]
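To see why the background matters, the following minimal sketch scores the recurrence of mutations in one gene across samples with a one-sided binomial test. It is an illustration under simplifying assumptions, not the method of the discussion paper: every number (sample count, gene length, mutation rates) is hypothetical, and each sample is treated as independent with a single per-base background rate for the gene.

```python
import math

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def prob_gene_mutated(per_base_rate, gene_length):
    """Probability that one sample carries at least one background mutation in the gene."""
    return 1 - (1 - per_base_rate) ** gene_length

# Hypothetical observation: the gene is mutated in 5 of 100 tumor samples.
n_samples = 100
k_mutated = 5
gene_length = 1500  # hypothetical number of coding bases

# Hypothetical background rates (mutations per base per sample).
uniform_rate = 1e-6  # one genome-wide average applied to every gene
local_rate = 1e-5    # gene-specific rate for a gene in a highly mutable region

p_value_uniform = binomial_tail(k_mutated, n_samples, prob_gene_mutated(uniform_rate, gene_length))
p_value_local = binomial_tail(k_mutated, n_samples, prob_gene_mutated(local_rate, gene_length))

print(f"p-value with uniform background:       {p_value_uniform:.3g}")
print(f"p-value with gene-specific background: {p_value_local:.3g}")
```

The same observation looks far more significant under the uniform background than under the gene-specific one, which is how a uniform model can report passenger-rich genes as drivers.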
Tumor heterogeneity
• Due to the high mutation rate, different tumor cells in a tumor can have different genomes
  – And potentially transcriptomes and epigenomes
• Standard sequencing of a tumor sample results in data that reflect the mixed population of cells rather than individual cells
  – Sequencing different parts of a tumor
  – Single-cell sequencing
    • Potential biases caused by whole-genome amplification
Tumor heterogeneity
• Single-cell sequencing from different sectors of a breast cancer sample:
Image credit: Navin et al., Nature 472(7341):90-94, (2011)
Mixture of tumor and non-tumor cells
• Presence of infiltrating stromal and immune cells
  – Micro-dissection
  – Estimation of tumor content
  – Computational removal of “contaminating” data from non-tumor cells
Consequence of non-tumor cells
• Suppose the sample contains c% of non-tumor cells
  – If G=AA in normal cells and there are no sequencing errors, expect
    • (100-c)% reads supporting alternative allele if G=aa in tumor cells
    • (100-c)/2% reads supporting alternative allele if G=Aa in tumor cells
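The two expectations above can be written as one small function, and inverted to get a rough estimate of the contamination rate from an observed allele fraction. This is a sketch under the slide's assumptions (non-tumor genotype AA, no sequencing errors) plus one added assumption: tumor and non-tumor cells are both diploid at the site, so each cell contributes the same number of reads. The function names are illustrative only.

```python
# Fraction of alternative alleles carried by a tumor cell of each genotype.
TUMOR_ALT_FRACTION = {"AA": 0.0, "Aa": 0.5, "aa": 1.0}

def expected_alt_fraction(c_percent, tumor_genotype):
    """Expected fraction of reads supporting the alternative allele.

    c_percent is the percentage of non-tumor cells (genotype AA) in the sample.
    """
    return (100 - c_percent) / 100 * TUMOR_ALT_FRACTION[tumor_genotype]

def estimate_contamination(alt_fraction, tumor_genotype):
    """Invert the relation above: contamination rate (in %) implied by an observed fraction."""
    tumor_alt = TUMOR_ALT_FRACTION[tumor_genotype]
    if tumor_alt == 0:
        raise ValueError("contamination cannot be estimated when the tumor genotype is AA")
    return 100 * (1 - alt_fraction / tumor_alt)

for c in (0, 30, 60):
    print(c, expected_alt_fraction(c, "aa"), expected_alt_fraction(c, "Aa"))
```

Note that a homozygous tumor mutation at c=50 gives the same expected fraction as a heterozygous mutation at c=0, so the allele fraction alone cannot separate the two cases.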
Consequence of non-tumor cells
• Assuming all bases have a phred score of 30 (base error=0.001) and 60x coverage:
[Plot: Tumor genotype=aa, non-tumor genotype=AA. x-axis: contamination rate, c (0 to 70); y-axis: likelihood on a log scale (1E-209 to 0.1); curves: Pr(D|G=aa), Pr(D|G=Aa), Pr(D|G=AA)]
Consequence of non-tumor cells
• Assuming all bases have a phred score of 30 (base error=0.001) and 60x coverage:
[Plot: Tumor genotype=Aa, non-tumor genotype=AA. x-axis: contamination rate, c (0 to 70); y-axis: likelihood on a log scale (1E-178 to 0.01); curves: Pr(D|G=Aa), Pr(D|G=AA), Pr(D|G=aa)]
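Likelihood curves of this kind can be approximated with a simple per-read model. The sketch below is one plausible way to compute them and is not necessarily the exact model behind the plots. It assumes that a miscalled base turns into one specific other base with probability error/3, that the reads are independent, that the read counts equal their expected values, and that both cell populations are diploid. The function names are illustrative.

```python
import math

GENOTYPES = ("AA", "Aa", "aa")

def expected_read_counts(coverage, c_percent, tumor_genotype):
    """Expected (alt, ref) read counts when c% of the cells are non-tumor with genotype AA."""
    tumor_alt_fraction = {"AA": 0.0, "Aa": 0.5, "aa": 1.0}[tumor_genotype]
    n_alt = round(coverage * (100 - c_percent) / 100 * tumor_alt_fraction)
    return n_alt, coverage - n_alt

def log10_genotype_likelihoods(n_alt, n_ref, base_error=0.001):
    """log10 Pr(D|G) for each diploid genotype G, treating reads as independent."""
    miscall = base_error / 3           # assumption: error to one specific other base
    correct = 1 - base_error
    het = 0.5 * correct + 0.5 * miscall
    p_alt = {"AA": miscall, "Aa": het, "aa": correct}  # Pr(read shows a | G)
    p_ref = {"AA": correct, "Aa": het, "aa": miscall}  # Pr(read shows A | G)
    return {g: n_alt * math.log10(p_alt[g]) + n_ref * math.log10(p_ref[g])
            for g in GENOTYPES}

coverage = 60
for tumor_genotype in ("aa", "Aa"):
    print(f"Tumor genotype={tumor_genotype}, non-tumor genotype=AA")
    for c in range(0, 71, 10):
        n_alt, n_ref = expected_read_counts(coverage, c, tumor_genotype)
        lk = log10_genotype_likelihoods(n_alt, n_ref)
        best = max(lk, key=lk.get)
        values = ", ".join(f"log10 Pr(D|G={g})={lk[g]:.1f}" for g in GENOTYPES)
        print(f"  c={c:2d}  alt={n_alt:2d}  ref={n_ref:2d}  {values}  best={best}")
```

Under this model, a tumor that is truly aa is called Aa once contamination is high enough, because the mixed sample looks heterozygous to a caller that expects a single genotype.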
Emphasis on somatic changes
• Comparing tumor and non-tumor samples
• Choice of non-tumor control:
  – Normal tissue
    • How to obtain? Transplant?
  – Tumor-adjacent tissue from the same patient
    • Can it be considered normal?
  – Blood from the same patient
    • Useful given the different tissue types?
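At its simplest, comparing a tumor with its control means keeping the variants seen in the tumor but not in the matched non-tumor sample. The sketch below shows only this set-difference idea on made-up variant records in the form (chromosome, position, reference, alternative). Real somatic callers typically use statistical models of both samples rather than a plain subtraction, and the result depends on the choice of control discussed above.

```python
def somatic_candidates(tumor_variants, control_variants):
    """Variants present in the tumor sample but absent from the matched control."""
    return sorted(set(tumor_variants) - set(control_variants))

# Hypothetical variant calls: (chromosome, position, reference allele, alternative allele).
tumor = [
    ("chr1", 1000, "A", "G"),   # also in the control, so treated as germline
    ("chr2", 500, "C", "T"),
    ("chr7", 42, "G", "A"),
]
control = [
    ("chr1", 1000, "A", "G"),
]

for variant in somatic_candidates(tumor, control):
    print(variant)
```

If the control is tumor-adjacent tissue that already carries early tumor mutations, those mutations are subtracted as if they were germline, which is one reason the choice of control matters.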
Special analysis pipelines
Figure credit: Saunders et al., Bioinformatics 28(14):1811-1817, (2012); Wang et al., Genome Medicine 5(10):91, (2013)
Cancer sub-types
• Patients diagnosed with the same type of cancer could have very different prognoses and drug responses
• Cancer sub-types can be identified by molecular signatures
Cancer sub-types
• Sub-types of gastric cancer:
Figure credit: The Cancer Genome Atlas Research Network, Nature 513(7517):202-209, (2014)
Druggable targets
• Cancer omics aims not only at understanding the molecular mechanisms, but also at identifying druggable targets
• Druggable targets: Proteins of mutated/aberrantly activated genes with known inhibitors
  – Other members of the same families (e.g., kinases)
Identification of druggable targets
• Computational docking
• siRNA screens
• CRISPR-Cas9 knock-out
Summary
• Factors of abnormal allele ratios
  – Copy number variation
  – Tumor heterogeneity
  – Contamination of non-tumor cells
• Some major research directions
  – Identification of driver (somatic) mutations
  – Discovery of cancer sub-types
  – Search for druggable targets
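The three factors of abnormal allele ratios can be combined in one commonly used approximation of the expected alternative allele fraction. The sketch below is illustrative and was not shown in the lecture. It assumes that non-tumor cells are diploid and do not carry the mutation, that all tumor cells share the same total copy number at the site, and that reads are sampled in proportion to DNA content. The parameter names are hypothetical.

```python
def expected_alt_fraction(purity, tumor_copies, mutated_copies,
                          clonal_fraction=1.0, normal_copies=2):
    """Expected fraction of reads supporting a somatic alternative allele.

    purity          -- fraction of tumor cells in the sample (1 - contamination)
    tumor_copies    -- total copy number at the site in tumor cells (copy number variation)
    mutated_copies  -- number of those copies that carry the mutation
    clonal_fraction -- fraction of tumor cells carrying the mutation (heterogeneity)
    normal_copies   -- copy number in non-tumor cells, none of which carry the mutation
    """
    mutant = purity * clonal_fraction * mutated_copies
    total = purity * tumor_copies + (1 - purity) * normal_copies
    return mutant / total

# Contamination only: 30% non-tumor cells, heterozygous mutation in a diploid tumor.
print(expected_alt_fraction(purity=0.7, tumor_copies=2, mutated_copies=1))
# Copy number variation: pure tumor, one mutated copy out of four.
print(expected_alt_fraction(purity=1.0, tumor_copies=4, mutated_copies=1))
# Heterogeneity: pure diploid tumor, mutation present in half of the tumor cells.
print(expected_alt_fraction(purity=1.0, tumor_copies=2, mutated_copies=1, clonal_fraction=0.5))
```

With a diploid tumor and a fully clonal mutation, the formula reduces to the (100-c)% and (100-c)/2% expectations given earlier for contamination alone.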