The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data
Ethan Cerami, Ph.D., Director, Cancer Informatics Development
Computational Biology Center (cBio), Memorial Sloan-Kettering Cancer Center
CBIIT Talk, May 23, 2012
Sections: Introduction; Motivation; Pathway Analysis; Network Analysis; Examples of Usage; Advanced Options; Web API / R Package; TCGA Ecosystem & Future Plans
http://cbioportal.org
Friday, May 18, 12
On May 23, Dr. Ethan Cerami delivered a presentation titled "cBio Cancer Genomics Portal." The presentation introduced the portal and described how to mine data generated by The Cancer Genome Atlas (TCGA) project.
Transcript
The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data
Ethan Cerami, Ph.D., Director, Cancer Informatics Development
Computational Biology Center (cBio), Memorial Sloan-Kettering Cancer Center
CBIIT Talk, May 23, 2012
Comprehensive genomic characterization defines human glioblastoma genes and core pathways. The Cancer Genome Atlas Research Network. Nature 455, 1061-1068 (23 October 2008).
Web-Based Interface for Iterative Exploratory Data Analysis
Integration of genomic data types, clinical data, and biological pathways.
[Figure: overview of the cBio Cancer Genomics Portal. Panel labels: Comprehensive Cancer Genomic Studies; Web-Service Interface; R Package; MATLAB Toolbox; OncoPrint (compact visualization of discrete genomic events); Survival Analysis; Network Analysis; Mutation Details; Predicted Functional Impact of Mutations; Multidimensional Genomic Data Plots; Other Reports; Alteration Frequency (%) for Gene A, Gene B, Gene C, ...; Biological Insight; Clinical Trial Design.]
The cBio Cancer Genomics Portal. Cerami et al., Cancer Discovery (May 2012).
cBio Portal in Context
• Other portals available:
  • TCGA Data Portal
  • ICGC Data Portal
  • UCSC Cancer Genome Browser
• cBio Portal:
  • Supports exploratory data analysis.
  • Lowers the barrier to access, specifically for biologists and clinical researchers.
  • Provides integrated access to data.
Multiple Portals
• Public Portal: http://www.cbioportal.org/
  • Contains published TCGA studies plus a few other studies.
  • Now also contains public copy number, mRNA, and RPPA data for all TCGA tumor types (everything except mutation data).
  • Open access.
• TCGA Portal: http://cbio.mskcc.org/gdac-portal/
  • Contains all provisional TCGA data, updated monthly.
  • Requires a user name and password.
  • Register at: http://bit.ly/gdac-form.
• Stand-Up to Cancer (SU2C) Portal
4-step web interface for querying a single cancer study
1. Select a Cancer Study or "All Cancer Studies". Example: Glioblastoma (TCGA). The Cancer Genome Atlas (TCGA) Glioblastoma project. 206 primary glioblastoma samples. Nature 2008. Raw data via the TCGA Data Portal.
2. Select one or more genomic profiles, for example Mutation and Copy Number Data. Profiles shown: Mutations; Copy Number Data (select one of: Putative copy-number alterations (RAE), Putative copy-number alterations (GBM Pathways)); mRNA Expression z-Scores.
3. Select a Patient/Case Set. Example: All Complete Tumors (seq, mRNA, CNA).
4. Enter a Gene or Gene Set. Example: a User-Defined List containing RB1 CDK4 CDKN2A, or select from the example gene sets. Advanced: Onco Query Language (OQL).
Optional argument: compute mutual exclusivity / co-occurrence between all pairs of genes. (Not recommended for more than 10 genes.)
[Screenshot: query form with "Query" and "Download Data" tabs and a "Submit" button.]
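The optional mutual exclusivity / co-occurrence argument can be illustrated with a small sketch. The slides do not say which statistic the portal computes, so the version below is an assumption: it builds a 2x2 table for each gene pair and scores it with a two-sided Fisher exact test, using only the standard library. The function names, case IDs, and alteration sets are made up for illustration.

```python
from itertools import combinations
from math import comb


def fisher_two_sided(both: int, only_a: int, only_b: int, neither: int) -> float:
    """Two-sided Fisher exact p-value for a 2x2 table of altered/unaltered cases."""
    n = both + only_a + only_b + neither
    row_a = both + only_a   # cases altered in gene A
    col_b = both + only_b   # cases altered in gene B
    denom = comb(n, col_b)

    def prob(k: int) -> float:
        # Hypergeometric probability of k cases altered in both genes.
        return comb(row_a, k) * comb(n - row_a, col_b - k) / denom

    observed = prob(both)
    lo, hi = max(0, col_b - (n - row_a)), min(row_a, col_b)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def pairwise_tendencies(altered: dict[str, set[str]], cases: set[str]) -> dict:
    """Return {(gene_a, gene_b): (tendency, p_value)} for all gene pairs."""
    results = {}
    for a, b in combinations(sorted(altered), 2):
        sa, sb = altered[a] & cases, altered[b] & cases
        both, only_a, only_b = len(sa & sb), len(sa - sb), len(sb - sa)
        neither = len(cases) - both - only_a - only_b
        if both * neither > only_a * only_b:
            tendency = "co-occurrence"
        elif both * neither < only_a * only_b:
            tendency = "mutual exclusivity"
        else:
            tendency = "none"
        results[(a, b)] = (tendency, fisher_two_sided(both, only_a, only_b, neither))
    return results


# Hypothetical data: eight cases and the cases in which each gene is altered.
cases = {f"S{i}" for i in range(1, 9)}
altered = {
    "RB1": {"S1", "S2", "S3", "S4"},
    "CDK4": {"S5", "S6", "S7"},
    "CDKN2A": {"S5", "S6", "S7", "S8"},
}
results = pairwise_tendencies(altered, cases)
for pair, (tendency, p_value) in results.items():
    print(pair, tendency, round(p_value, 4))
```

The number of pairs grows quadratically with the number of genes, which is consistent with the slide's advice to keep the gene set small when this option is enabled.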
Main Features:
Key Abstraction: Discrete Genomic-Level Events
• Each gene within each sample is assigned multiple discrete genomic-level events:
  • Mutations: mutated or wild type (WT).
  • Copy number: amplification, homozygous deletion, etc.
• Important caveats:
  • The portal does not provide confidence intervals for mutations.
  • Copy number calls (as determined by GISTIC or RAE) are putative.
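One way to picture this abstraction is as a small record per gene and sample. The sketch below is illustrative only and is not the portal's internal data model: the class and field names are assumptions, and so is the rule that a sample counts as altered when it is mutated, amplified, or homozygously deleted.

```python
from dataclasses import dataclass
from enum import Enum


class CopyNumber(Enum):
    HOMDEL = "Homozygous Deletion"
    DIPLOID = "Diploid"
    AMP = "Amplification"


@dataclass(frozen=True)
class GeneEvents:
    """Discrete genomic-level events for one gene in one sample."""
    gene: str
    sample: str
    mutated: bool            # mutated (True) or wild type (False)
    copy_number: CopyNumber  # putative call, e.g. from GISTIC or RAE

    def is_altered(self) -> bool:
        # Assumption: mutation, amplification, or homozygous deletion
        # each count as an alteration.
        return self.mutated or self.copy_number in (CopyNumber.AMP, CopyNumber.HOMDEL)


def alteration_frequency(events: list[GeneEvents], gene: str) -> float:
    """Percentage of samples in which `gene` carries at least one event."""
    rows = [e for e in events if e.gene == gene]
    if not rows:
        return 0.0
    return 100.0 * sum(e.is_altered() for e in rows) / len(rows)


# Hypothetical samples, not real TCGA data.
events = [
    GeneEvents("RB1", "S1", True, CopyNumber.DIPLOID),
    GeneEvents("RB1", "S2", False, CopyNumber.HOMDEL),
    GeneEvents("RB1", "S3", False, CopyNumber.DIPLOID),
    GeneEvents("RB1", "S4", False, CopyNumber.DIPLOID),
]
print(alteration_frequency(events, "RB1"))  # 50.0
```

Because every event is a discrete label rather than a continuous score, the caveats above still apply: the record carries no confidence for a mutation call, and the copy-number label is only as reliable as the upstream GISTIC or RAE call.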
New Tutorials Available
Querying a Single Cancer Study
Mutation Assessor is maintained by Boris Reva & Yevgeniy Antipin @ cBio.
Cross-Cancer Queries
How do PI3K alterations vary across ovarian and endometrial cancers?
Available soon...!
Pathway Commons (http://www.pathwaycommons.org)
[Figure: Pathway Commons overview. Source databases: Reactome, HPRD, HumanCyc, BioGRID, MSKCC Cancer Cell Map, IMID, IntAct, MINT, NCI Nature PID. Formats: PSI-MI, BioPAX. ID mapping: UniProt, Entrez Gene, RefSeq. Access to Pathway Commons (PC): Web Site, Web Service, Batch Download.]
Pathway Commons, a web resource for biological pathway data. Cerami et al., Nucleic Acids Res. 2011.
Collaboration with Ugur Dogrusoz, Bilkent University; separately funded by a National Resource for Network Biology (NRNB) grant.
Ovarian Cancer Gene Set: PTEN
Recently Added: RPPA Analysis
Onco Query Language (OQL)
A) Onco Query Examples: Copy Number and Mutations
Steps 1-3: User selects TCGA Ovarian Cancer, with genomic profiles Mutations (next-gen) and Putative CNA (GISTIC), and the case set All Complete Tumors.
Step 4 (Onco Query) and description:
• "RB1": Default. Shows putative amplifications, homozygous deletions, and mutations.
• "RB1: MUT": Shows only mutations.
• "RB1: HOMDEL MUT": Shows putative homozygous deletions and mutations.
B) Onco Query Examples: mRNA Expression Data
Steps 1-3: User selects TCGA GBM, with genomic profile mRNA Expression (Z-Scores), and the case set All Complete Tumors.
Step 4 (Onco Query) and description:
• "PTEN": Default. Shows up- or down-regulated mRNA at least 2 standard deviations from the mean.
• "PTEN: EXP < -1": Shows only down-regulated mRNA events more than 1 standard deviation below the mean.
OncoPrint output legend: Putative Copy Number Amplification; Putative Homozygous Deletion; Mutation; mRNA up-regulation; mRNA down-regulation.
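The query forms listed above can be mimicked with a small parser and matcher. This is a sketch under stated assumptions, not the portal's OQL implementation: only the four query strings from the slide are grounded in the source, while the comparison operators other than "<", the sample record layout (`mutated`, `cna`, `exp_z`), and the function names are hypothetical.

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}
EXP_PATTERN = re.compile(r"EXP\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)")
CNA_KEYWORDS = {"AMP", "HOMDEL"}


def parse_oql(line: str):
    """Parse one query line into (gene, event keywords, expression tests)."""
    gene, _, spec = line.partition(":")
    gene, spec = gene.strip().upper(), spec.strip().upper()
    if not spec:
        # Defaults described on the slide: AMP, HOMDEL, MUT, and mRNA
        # z-scores at least 2 standard deviations from the mean.
        return gene, {"AMP", "HOMDEL", "MUT"}, [(">=", 2.0), ("<=", -2.0)]
    exp_tests = [(m.group(1), float(m.group(2))) for m in EXP_PATTERN.finditer(spec)]
    keywords = set(EXP_PATTERN.sub(" ", spec).split())
    return gene, keywords, exp_tests


def matches(query, sample: dict) -> bool:
    """True if the sample record shows any event selected by the query."""
    _, keywords, exp_tests = query
    if "MUT" in keywords and sample.get("mutated"):
        return True
    if sample.get("cna") in (keywords & CNA_KEYWORDS):
        return True
    z = sample.get("exp_z")
    return z is not None and any(OPS[op](z, threshold) for op, threshold in exp_tests)


# Hypothetical sample records for illustration.
deleted = {"mutated": False, "cna": "HOMDEL"}
print(matches(parse_oql("RB1: MUT"), deleted))          # False
print(matches(parse_oql("RB1: HOMDEL MUT"), deleted))   # True
print(matches(parse_oql("PTEN: EXP < -1"), {"exp_z": -1.5}))  # True
print(matches(parse_oql("PTEN"), {"exp_z": -1.5}))      # False
```

The last two calls show why the explicit threshold matters: a z-score of -1.5 is missed by the default cutoff of 2 standard deviations but captured by "EXP < -1".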
Endometrial Cancer: PIK3CA
[Figure: PIK3CA, panels A, B, and C.]
Web Service API and R/MATLAB Packages
• Access via Web API
• Access via R Package and MATLAB Library
A) Example Query: Retrieve all Cancer Studies
Example query: Get Genomic Profile Data. Restrict to all TCGA Ovarian Cancer samples and retrieve Copy Number (GISTIC) data for a gene list.
Putative Copy Number Status:
  +2  Amplification
  +1  Gain
   0  Diploid
  -1  Hemizygous Deletion
  -2  Homozygous Deletion
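The two example queries and the copy-number coding above can be sketched in a few lines of Python. The slide does not show the request URLs, so everything about the request is an assumption here: the base URL, the `cmd` values, the parameter names, and the case-set and profile IDs are placeholders, and the tab-delimited response layout is made up. Only the mapping from the integers -2 to +2 to copy-number labels comes from the slide. The code builds URLs and decodes a sample response without making any network call.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint and parameter names; check the portal's Web API page.
BASE_URL = "http://www.cbioportal.org/webservice.do"

# Copy-number coding as listed on the slide.
CNA_LABELS = {
    2: "Amplification",
    1: "Gain",
    0: "Diploid",
    -1: "Hemizygous Deletion",
    -2: "Homozygous Deletion",
}


def build_query(cmd: str, **params: str) -> str:
    """Build a request URL for a (hypothetical) web-service command."""
    return f"{BASE_URL}?{urlencode({'cmd': cmd, **params})}"


def decode_profile(tsv_text: str) -> dict[str, dict[str, str]]:
    """Map gene -> {case_id: copy-number label} from tab-delimited text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    rows = [r for r in reader if r and not r[0].startswith("#")]
    header, body = rows[0], rows[1:]
    case_ids = header[1:]
    return {
        row[0]: {case: CNA_LABELS[int(value)] for case, value in zip(case_ids, row[1:])}
        for row in body
    }


# Hypothetical IDs and response text, for illustration only.
print(build_query("getCancerStudies"))
print(build_query(
    "getProfileData",
    case_set_id="ov_tcga_all",
    genetic_profile_id="ov_tcga_gistic",
    gene_list="RB1 CDK4 CDKN2A",
))
sample_response = "GENE\tCASE_1\tCASE_2\nRB1\t-2\t0\nCDK4\t2\t1\n"
decoded = decode_profile(sample_response)
print(decoded["RB1"]["CASE_1"])  # Homozygous Deletion
```

Decoding into labels right after download keeps the putative nature of the calls visible downstream, since a bare "-2" is easy to misread as a measured value.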
R and MATLAB Packages
• Access portal data within R via the CGDS-R package.
• Available via CRAN.
• Vignette and Reference PDF available.
R package maintained by Anders Jacobsen; MATLAB package maintained by Erik Larsson.
Integrating with the Cancer Genome Atlas Project (TCGA)
[Figure: data flow among the Data Coordination Center (DCC; all data), the GDAC Broad Firehose, the cBio Portal(s), TCGA Researchers, and TCGA Disease Working Groups.]
TCGA Ecosystem
[Figure: components of the TCGA ecosystem and the links between them.]
• Data Coordination Center (DCC) @ NCI: central repository for all TCGA data.
• Firehose @ Broad: pipeline for processing all TCGA data.
• cBio Portal @ MSKCC: open platform for exploring, mining and visualizing TCGA data.
• UCSC Cancer Genome Browser: web portal for exploring TCGA genomic, clinical, and image data.
• Integrative Genomics Viewer (IGV) @ Broad: high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
• Oncotator @ Broad: web application for annotating human genomic point mutations and indels with data relevant to cancer researchers.
• Mutation Assessor @ cBio: predicted functional consequences of mutations in cancer.
• Analysis Working Groups: generate freeze lists, subtypes, and other case lists.
• Tools at ISB: Regulome Explorer, ...
Link labels: multidimensional genomic profiling data; download of Firehose data via DCC; Web API; User Cross Links (Beta); User Cross Links for IGV and Network Visualization; freeze lists, subtypes, and other case lists. Example gene labels: RB1, CDK4, CDKN2A.
Legend: Implemented; Work In Progress; Proposed / Planned.
Planned Features
• Adding drugs and drug targets to the network view.
• Adding clinical features and new sort features to the OncoPrint, e.g. group/sort by MSI status or histological grade.
• Improved analysis and visualization of RPPA (collaboration with Gordon Mills).
• Integration of mutation and copy number algorithm results, e.g. MutSig and GISTIC.
• Full support for DNA methylation events.
• [your idea here...]
Open Source
• Portal software is open source (GNU Lesser GPL).
Acknowledgements
• cBio Portal: Nikolaus Schultz; Benjamin Gross; Arthur Goldberg; Caitlin Byrne; Anders Jacobsen; Jianjiong Gao; Erik Larsson; Selcuk Onur Sumer, Bilkent University; Sinan Sonlu, Bilkent University; Ugur Dogrusoz, Bilkent University; Chris Sander
• Collaborators: Broad Firehose Team; The TCGA Project Team
• Pathway Commons: Benjamin Gross; Emek Demir; Igor Rodchenkov, U. Toronto; Ozgün Babur; Nadia Anwar; Nikolaus Schultz; Gary D. Bader, U. Toronto; Chris Sander