The Cancer Genome Browser Sofie Salama COAT-PhD Summer School 2012 1
Page 1: The Cancer Genome Browser Sofie Salama COAT-PhD Summer School 2012 1.

The Cancer Genome Browser

Sofie Salama

COAT-PhD Summer School 2012


Page 2

The Cancer Genome Browser

• OUTLINE

– Slide show to introduce the Cancer Genomics Browser

  • What’s there?

  • How to visualize the data?

  • Tools

– Live Demo

  • Basic setup

  • Breast cancer data

    – Using signatures

    – Microarray vs RNA-Seq

    – Comparing across datasets

  • GBM data

    – Genesets

    – What genes correlate with phenotypes?

– Playtime!

Page 3

https://genome.ucsc.edu

UCSC Genome Browser

• Base level to full genome display capability

• ENCODE

• Human sequence variation

• Whole genome association studies

• Human genetic and disease related genome annotation

Page 4

https://genome-cancer.ucsc.edu

Large-scale Medical Genomics Datasets

New issues arise when visualizing high-throughput cancer genomics data: data security and access control, sample cohorts, multiple analytes, and clinical and phenotypic information.

Page 5

UCSC Cancer Genomics Browser

• Simultaneously display patient genomic and clinical data from a cohort of samples

• Base level to full genome display capability

• Multiple studies

• Growing list of published studies, including public-tier TCGA data

• Integrated with popular UCSC Genome Browser and its vast store of genomic information

Zhu J et al., Nature Methods, 2009; Sanborn JZ et al., Nucleic Acids Res., 2010

Page 6

New UCSC Cancer Browser Portal

genome-cancer.ucsc.edu

Page 7

User Interface: A portal to display high throughput data sets

Teresa Swatloski, Brian Craft, Mary Goldman

genome-cancer.ucsc.edu

Page 8

User Interface Features

toggle on/off RefSeq genes

link to tumor image browser

link to human genome browser

user sign in

help menu

view in chromosome mode

select dataset to view

configure genesets

configure genomic signatures

view in gene mode

resize panels

position or gene search bar

Teresa Swatloski, Brian Craft, Mary Goldman

genome-cancer.ucsc.edu

Page 9

Dataset selection showing TCGA breast cancer data

TCGA breast cancer datasets

• Gene expression, copy number, DNA Methylation, RPPA, Paradigmlite

• TCGA clinical data

Teresa Swatloski

Page 10

Genomic and phenotypic data heatmaps

Genomic data Clinical data

genome-cancer.ucsc.edu

Page 11

Individual dataset layout

Samples

Genomic data Clinical data

Genomic locations / Genes

genome-cancer.ucsc.edu

Page 12

Genomics Heatmap (rows: samples; color scale: amplification, deletion)

Clinical Heatmap (rows: samples; features shown: sample_type, days_to_last_followup)

sample_type values: Solid tissue normal, Primary solid tumor, Metastatic

• Multiple clinical features

• Clinical data encoded in color

Page 13

Sample sorting determined by clinical data

• Sample (i.e. vertical) order is determined by the clinical data on the right

• Samples are always sorted by clinical features

• Ties are broken using subsequent clinical features

Samples

genome-cancer.ucsc.edu
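The sorting rule on this slide can be pictured as an ordinary multi-key sort. The sketch below is only an illustration: the sample records are made up, the feature names are borrowed from the clinical heatmap slide, and the browser's actual ordering of categorical values is not stated in the slides (plain alphabetical order is assumed here).

```python
# Hypothetical clinical records; feature names follow the clinical heatmap slide.
samples = [
    {"id": "S1", "sample_type": "Primary solid tumor", "days_to_last_followup": 400},
    {"id": "S2", "sample_type": "Solid tissue normal", "days_to_last_followup": 120},
    {"id": "S3", "sample_type": "Primary solid tumor", "days_to_last_followup": 90},
    {"id": "S4", "sample_type": "Metastatic", "days_to_last_followup": 250},
]


def sort_samples(records, features):
    """Order samples by the first feature, breaking ties with later features."""
    return sorted(records, key=lambda s: tuple(s[f] for f in features))


ordered = sort_samples(samples, ["sample_type", "days_to_last_followup"])
order_ids = [s["id"] for s in ordered]
print(order_ids)
```

Here S3 and S1 share the same sample_type, so their relative order is decided by days_to_last_followup, the second feature.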

Page 14

Zoom in to See Individual Sample

drag zoom slider

genome-cancer.ucsc.edu

Page 15

genomic heatmap

clinical heatmap

heatmap view

adjust display coloring

configuration window for clinical variables, sample subgrouping and statistics

box plot summary view

proportions summary view

click to show dataset detail

remove dataset

Individual Dataset Control

Teresa Swatloski

Page 16

Summary Views

Heatmap View - Amplified / Deleted Regions

Proportions Summary View

Box Plot Summary View

Page 17

DNA Copy Number Profile Summary View

TCGA CNV

glioblastoma multiforme

breast carcinoma

lung squamous cell

Page 18

DNA Copy Number Profile Summary View

TCGA CNV

glioblastoma multiforme

breast carcinoma

lung squamous cell

EGFR

CDKN2A, CDKN2B

Page 19

Genes View Mode

genome-cancer.ucsc.edu

Page 20

“Genes” Configuration

Currently displayed gene list

Three ways to add a gene list

Type or copy and paste user defined genes

1

2

3

genome-cancer.ucsc.edu Teresa Swatloski

Page 21

Genes view to see the PAM50 intrinsic gene expression subtypes in TCGA Breast data

Basal

LumA

LumB

Her2-like

Normal-like

PAM50: Parker et al., Journal of Clinical Oncology (2009)

Page 22

Basal

LumA

LumB

Her2

Tumor

Solid normal

Same thing with RNA-Seq Data

Page 23

Online statistical tests compare two subgroups

Samples

Subgroup samples

genome-cancer.ucsc.edu

Page 24

Online statistical tests compare two subgroups

Samples

Subgroup samples

p values

genome-cancer.ucsc.edu

Page 25

click to view detail and use the variable to subgroup samples

perform statistical tests to compare subgroup1 and subgroup2

subgroup 1

subgroup 2

variables used in defining subgroups

“Active Feature List” area

Sample subgroup configuration
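As a rough picture of what comparing subgroup 1 and subgroup 2 involves, the sketch below splits samples by one clinical variable and computes a two-sample statistic for one gene. The slides do not say which statistical test the browser runs; Welch's t statistic is an assumption made here for illustration, and the sample values and clinical labels are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent groups (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical copy number values for one gene, keyed by sample.
values = {"S1": 2.1, "S2": 1.9, "S3": 2.3, "S4": 0.1, "S5": -0.2, "S6": 0.3}

# Hypothetical clinical variable used to define the two subgroups.
clinical = {
    "S1": "non G-CIMP", "S2": "non G-CIMP", "S3": "non G-CIMP",
    "S4": "G-CIMP", "S5": "G-CIMP", "S6": "G-CIMP",
}

subgroup1 = [v for s, v in values.items() if clinical[s] == "non G-CIMP"]
subgroup2 = [v for s, v in values.items() if clinical[s] == "G-CIMP"]

t_stat = welch_t(subgroup1, subgroup2)
print(f"t = {t_stat:.2f}")
```

Repeating this per gene or per genomic position would give one statistic for each column of the heatmap, which is the shape of the p value track shown on the earlier slides.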

Page 26

Compare subgroups using the summary view

EGFR amplification in GBM is largely in the non CpG island DNA methylator samples (non G-CIMP)

Methylator samples in GBM are largely proneural by gene expression and also come from younger patients with better survival

Page 27

Evaluate Genomic Signature on the Browser

A. Enter signature as an algebraic expression

B. Computed signatures online -> approximate prediction

Page 28

Evaluate Genomic Signature on the Browser

• 21 gene signature predicts rate of recurrence at 10 yr in ER+ patients treated with TAM (Paik 2004)

• Genomic signature online approximation: higher score -> higher likelihood of recurrence; low score -> lower likelihood of recurrence

Page 29

Evaluate Genomic Signature on the Browser

• Browser view of ER+ patients in a preoperative chemotherapy study dataset

• Signature score correlates with pathCR: the paradox that an ER+ patient who is more likely to have recurrent disease in 10 years when treated with TAM is also more likely to respond to chemotherapy

Page 30

Genomic Signature Configuration

Current signatures

Three ways to add a genomic signature

1

2

3

Enter signature as an algebraic expression

Such as: +TP53 - 0.25*ERBB2

Teresa Swatloski
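To make the algebraic expression concrete, here is a minimal sketch that reads a signature such as +TP53 - 0.25*ERBB2 as a weighted sum of gene values and scores each sample. It assumes the signature is a plain linear combination; the browser's real parser and scoring rules are not described in the slides, and the function names and expression values below are hypothetical.

```python
import re

# One term: a sign, an optional numeric coefficient followed by "*", then a gene symbol.
TERM = re.compile(r"([+-])(?:(\d+(?:\.\d+)?)\*)?([A-Za-z][A-Za-z0-9]*)")


def parse_signature(expr):
    """Turn a linear signature expression into a {gene: weight} mapping."""
    expr = expr.replace("\u2013", "-").replace(" ", "")  # normalize dash, drop spaces
    if expr and expr[0] not in "+-":
        expr = "+" + expr  # a leading term without a sign counts as positive
    weights = {}
    for sign, coef, gene in TERM.findall(expr):
        w = float(coef) if coef else 1.0
        weights[gene] = weights.get(gene, 0.0) + (w if sign == "+" else -w)
    return weights


def signature_score(weights, gene_values):
    """Weighted sum of one sample's gene values."""
    return sum(w * gene_values[g] for g, w in weights.items())


weights = parse_signature("+ TP53 - 0.25* ERBB2")

# Hypothetical expression values for two samples.
expression = {
    "S1": {"TP53": 2.0, "ERBB2": 4.0},
    "S2": {"TP53": -1.0, "ERBB2": 0.0},
}
scores = {s: signature_score(weights, v) for s, v in expression.items()}
print(weights, scores)
```

Each sample ends up with a single number, which can then be displayed and sorted like any other clinical feature.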

Page 31

User Support

[email protected]

Mary Goldman, Teresa Swatloski

Page 32

Web API

Create a URL to specify a view to the cancer browser

• base: https://genome-cancer.ucsc.edu/hgHeatmap/#?

• data track(s): comma separated dataset names

• display mode

• gene list: comma separated gene names

• chromosomal position

• genomic signature: e.g. +TP53-0.25*ERBB2

Examples

• dataset=vijver2002&pos=chr2:123767566-chr2:187943340

• dataset=ucsfNeveCGH&displayas=geneset&gene_list=TP53,ERBB2

Documentation

https://genome-cancer.soe.ucsc.edu/proj/site/help

Brian Craft, Mary Goldman
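The two example URLs on this slide can be assembled programmatically. The sketch below uses only the parameter names that appear in those examples (dataset, pos, displayas, gene_list); the helper name build_view_url is hypothetical, and the parameter for a genomic signature is left out because the slide does not name it. Values are joined without percent-encoding, mirroring the slide's examples.

```python
BASE = "https://genome-cancer.ucsc.edu/hgHeatmap/#?"


def build_view_url(dataset, pos=None, displayas=None, gene_list=None):
    """Build a cancer browser view URL from the parameters shown on the slide."""
    params = [("dataset", dataset)]
    if pos:
        params.append(("pos", pos))
    if displayas:
        params.append(("displayas", displayas))
    if gene_list:
        params.append(("gene_list", ",".join(gene_list)))
    # Joined as-is (no percent-encoding), matching the slide's example URLs.
    return BASE + "&".join(f"{key}={value}" for key, value in params)


chrom_view = build_view_url("vijver2002", pos="chr2:123767566-chr2:187943340")
gene_view = build_view_url(
    "ucsfNeveCGH", displayas="geneset", gene_list=["TP53", "ERBB2"]
)
print(chrom_view)
print(gene_view)
```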

Page 33

User Account and Security

Brian Craft

genome-cancer.ucsc.edu

Page 34

cgData: Cancer Genomic data specification

• Gene expression, copy number, RPPA, DNA methylation, siRNA viability, phenotypes, clinical data

• Support large-scale genomic data repository

- Currently supports Cancer Browser

- Plan to support automated data analysis pipeline

• “Solve” (address) common data linking problem

• Meta data tracking

• Once data are in this specification, ingestion into the UCSC Cancer Browser is automated

Kyle Ellrott

Page 35

Cancer Browser Updates

• Current improved version launched January, 2012

• Monthly data freeze

• Latest freeze data viewable on the Cancer Browser within a few days

• July, 2012 – Added ability to download processed datasets and improved user interface for clinical features, subgrouping and statistics

Page 36

Data freeze 2012-02-28 summary (sample number)

Page 37

Summary

• Simultaneously display patient genomic and clinical data from a cohort of samples

• Multiple studies data visualization

• Base level to full genome, and genesets display capability

• cgData data repository driven

• Monthly data freeze and version control

• User account

• Project-specific access-control

• Single sign-on portal

• Provide web API for linking

[email protected]

Page 38

DCC, Firehose

UCSC cgData Repository

UCSC Next-gen Sequencing Data Analysis

• DNA-seq (bambam, bridget)

• mutation, allelic-specific copy number, structural rearrangement

• Combined RNA/DNA analysis

• RNA editing

converter

browser

pathway analysis

Clinical Predictors (TopModel)

Bam files

Mutation call comparison

PARADIGM pathway analysis

UCSC Cancer Genomics Browser

cBio

Page 39

Acknowledgment

UCSC Cancer Genomics Group: Brian Craft, Teresa Swatloski, Mary Goldman, Kyle Ellrott, Erich Weiler, Chris Wilks, Singer Ma, Christopher Szeto, Sofie Salama, Mia Grifford, Sam Ng, Ted Goldstein, Dan Carlin, Daniel Zerbino, Melissa Cline, Mark Diekhans, Josh Stuart, David Haussler

Collaborators: The Cancer Genome Atlas; Stand Up To Cancer; Intl. Cancer Genomics Consortium; ISPY consortium; MSKCC; LINCS consortium; Christopher Benz, Buck Institute; Laura Esserman, UCSF; Joe Gray, OHSU; Eric Collisson, UCSF; Gordon Mills, MDACC; Rachel Schiff, BCM

Funding Agencies: NCI/NIH, NHGRI, American Association for Cancer Research

Page 40

The Cancer Genome Browser

• OUTLINE

– Slide show to introduce the Cancer Genomics Browser

  • What’s there?

  • How to visualize the data?

  • Tools

– Live Demo

  • Basic setup

  • Breast cancer data

    – Using signatures

    – Microarray vs RNA-Seq

    – Comparing across datasets

  • GBM data

    – Genesets

    – What genes correlate with phenotypes?

– Playtime!

Page 41

cgData Packages

Most likely your data files:

• genomic data A (CNV)

• genomic data B (RPPA)

• clinical data 1 (FFPE, timepoint)

• clinical data 2 (patient, age,..)

Need to add a meta-data file for each data file

Page 42

cgData Packages

idMap(TCGA BRCA)

genomic data A (CNV)

genomic data B (RPPA)

clinical data1(FFPE, timepoint)

clinical data 2(patient, age,..)

TCGA-01-ABCD-01A

TCGA-01-ABCD-01A-EG

TCGA-01-ABCD

TCGA-01-ABCD-01A-JH

patient

sample

aliquot

sample

aliquot

Page 43

cgData Packages

idMap(TCGA BRCA)

genomic data A (CNV)

genomic data B (RPPA)

clinical data1(FFPE, timepoint)

clinical data 2(patient, age,..)

Most likely already in UCSC cgData library

Most likely your data files

Need to add meta data file

Identifiers used in data files

parent-child relationships

probeMap B

assembly (hg18)

probeMap B (antibody)
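The parent-child relationships between identifiers can be pictured as a simple child-to-parent lookup. The sketch below uses the example identifiers from the idMap slide; pairing them into patient, sample and aliquot levels is inferred from their shared prefixes, and the dictionary layout and function names are hypothetical rather than the actual cgData idMap file format.

```python
# Child -> parent links, inferred from the identifier prefixes on the idMap slide.
PARENT = {
    "TCGA-01-ABCD-01A": "TCGA-01-ABCD",         # sample -> patient
    "TCGA-01-ABCD-01A-EG": "TCGA-01-ABCD-01A",  # aliquot -> sample
    "TCGA-01-ABCD-01A-JH": "TCGA-01-ABCD-01A",  # aliquot -> sample
}


def lineage(identifier, parent=PARENT):
    """Return the identifier followed by each of its ancestors, root last."""
    chain = [identifier]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def patient_of(identifier, parent=PARENT):
    """Resolve any identifier to the top-level (patient) identifier."""
    return lineage(identifier, parent)[-1]


print(lineage("TCGA-01-ABCD-01A-EG"))
print(patient_of("TCGA-01-ABCD-01A-JH"))
```

With such a map, a CNV file keyed by one aliquot and an RPPA file keyed by another can both be linked to the same patient row in the clinical data.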