Epigenetics QC (EpiQC) and single-cell RNA-seq variant calling from NIST GIAB samples
Christopher E. Mason, Associate Professor
Department of Physiology and Biophysics & The Institute for Computational Biomedicine at the Weill Cornell Medical College and the Tri-Institutional Program on Computational Biology and Medicine
August 27th, 2015
@mason_lab
FDA’s PMET-QC (Personalized Medicine Enabling Technologies Quality Control): the fourth phase of the MAQC project (MAQC-IV)
Objectives:
• QC of PMET for enhanced reproducibility and reliability
• Benchmarking bioinformatics approaches for PMET data analysis to achieve best practices and standard data analysis protocols
Overview: profiling drug response across a common panel of cancer cell lines to predict drug sensitivity from patient-specific genomic profiles. Specifically:
• Assess reproducibility of HTS assays for drug efficacy and safety (inter- and intra-lab reproducibility and cross-platform consistency)
• Benchmark bioinformatics approaches for HTS data analysis
WG1 - HTSQC
WG3 - EpiQC
Systems Biology
Overview: Generating reference gene expression and epigenetic datasets from (1) individual cell fractions and (2) whole tissue samples, in order to develop in silico formulas that estimate the cellular composition of mixed cell samples and identify cell-type-specific signatures (e.g. in relevant diseases) using whole tissue samples.
WG2 - SeqQC
Code Release
Apps
Overview: systematically evaluate targeted sequencing approaches for identifying somatic mutations in cancer in a set of well-defined primary tumors.
• Comparative analysis of WGS, WES and target-seq
• Bioinformatics effects
• Integrated analysis of DNA-seq and RNA-seq for improved variant calling
Overview:
• Comparative analysis of sequencing-based methods with microarrays for the study of DNA methylation
• Benchmark computational tools for sequencing-based DNA methylation data
Overview: emphasis on the integration of different molecular data (DNA methylation, DNA, RNA) for enhanced personalized medicine
Variant calling, RNA editing, and allele-specific expression from matched samples
Overview: Full reproducibility of analysis and methods for samples and instances of code and runtime parameters
• Virtual machines or Docker instances
• Zenodo code base
Planned epigenetics data sets for GIAB
• XTen WGBS; not yet formally supported, but proof of principle has been demonstrated
  - Currently being processed at the New York Genome Center, with members from the Mason Lab, John Greally, Soren Germer, and Frank Wos
  - Generating 30X WGBS data for all the GIAB samples
• Illumina 450K methylation array data (Youping Deng)
• CpGiant capture data from Roche (Mason)
• Pending TAB-seq for hydroxymethylation profiling (Mason)
• Plan to share with the GIAB community
• Also planning for single-cell RRBS on the Fluidigm C1 (in development)
Why?
DNA methylation defines cellular phenotypes and lineage specification, and much of the action is beyond CpG islands
Fernandez et al, Genome Research, 2012
Weidner CI et al., Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biol. 2014 Feb 3;15(2):R24.
DNA Marks can predict your age!
Bisulfite conversion sequencing for detection of DNA methylation
Base resolution methylation level: C/(C+T)
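After bisulfite conversion, unmethylated cytosines are read as T while methylated cytosines remain C, so the per-base methylation level is the fraction of C reads among all C and T reads at that position. A minimal Python sketch of that ratio follows; the function name and the example counts are illustrative assumptions, not part of any tool named in this talk.

```python
def methylation_level(c_count: int, t_count: int) -> float:
    """Per-base methylation level C / (C + T) at a single cytosine.

    c_count: reads showing C (cytosine protected from conversion, i.e. methylated)
    t_count: reads showing T (cytosine converted, i.e. unmethylated)
    """
    if c_count < 0 or t_count < 0:
        raise ValueError("read counts must be non-negative")
    coverage = c_count + t_count
    if coverage == 0:
        raise ValueError("no C or T reads cover this position")
    return c_count / coverage


# Hypothetical counts, for illustration only: 15 C reads and 5 T reads.
example_level = methylation_level(c_count=15, t_count=5)
```

In practice a minimum coverage threshold is usually applied before trusting the ratio, since a level computed from a handful of reads is very noisy.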
MethSuite: DNA methylome sequencing and analysis suite
Akalin A, Garrett-Bakelman F, et al., 2012. PLOS Genetics.
Akalin A, Kormaksson M, Li S, et al., 2012. Genome Biology.
Li S, Garrett-Bakelman F, et al., 2013. BMC Bioinformatics.
Li S, Garrett-Bakelman F, et al., 2014. Genome Biology.
Garrett-Bakelman F, Sheridan C, et al., 2014. JoVE.
Rampal R, Akalin A, et al., 2015. Cell Reports.
DNA methylation reveals dramatically different tumor types
Akalin et al., PLOS Genetics, 2012
DNA methylation also measures epialleles
An epiallele is one of a number of alternative, phased DNA methylation patterns of the same genetic locus
[Figure: epiallele frequency (read count) at a genomic locus with four adjacent CpGs; read counts shown: 60, 100, 20, 20]
Li et al., Dynamic Evolution of Clonal Epialleles Revealed by Methclone. Genome Biology, 2014.
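To make the epiallele definition concrete, the sketch below tallies phased methylation patterns across reads that cover all four CpGs of a locus. It is only an illustration of counting patterns, not the methclone algorithm; the 'M'/'U' string encoding, the function names, and the example reads are all assumptions made here.

```python
from collections import Counter


def count_epialleles(read_patterns, n_cpgs=4):
    """Count each phased methylation pattern seen at one locus.

    Each read is a string with one character per CpG:
    'M' = methylated, 'U' = unmethylated (hypothetical encoding).
    Reads that do not cover every CpG, or contain other symbols, are skipped.
    """
    counts = Counter()
    for pattern in read_patterns:
        if len(pattern) != n_cpgs or set(pattern) - {"M", "U"}:
            continue
        counts[pattern] += 1
    return counts


def epiallele_frequencies(counts):
    """Convert pattern read counts into fractions of all counted reads."""
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()} if total else {}


# Hypothetical reads for illustration; the last one covers only two CpGs.
reads = ["MMMM", "MMMM", "UUUU", "MUMU", "MM"]
counts = count_epialleles(reads)
freqs = epiallele_frequencies(counts)
```

Because the patterns are read per molecule, two samples with the same average methylation at each CpG can still have very different epiallele compositions.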
Open source and free epiclonality software
https://code.google.com/p/methclone/
Li et al., Dynamic Evolution of Clonal Epialleles Revealed by Methclone. Genome Biology, 2014.
Epialleles reveal the clonality of cells in leukemia
D = Diagnosis; R = Relapse
Hydroxy-methylation (hmC) changes can also drive AML phenotypes
Rampal, Akalin, et al., Cell Reports, 2015
hmC is a better predictor of gene expression change
Rampal, Akalin, et al., Cell Reports, 2015
mC hmC
Why else?
Single-Cell RNA-seq Variant Calling with GATK HaplotypeCaller
It is exciting to see single-cell expression of significantly differentially expressed genes between responders (R) and non-responders (NR) in oncology patients, but can we call variants?
Method: monocle
Significance cutoff: FDR < 0.01
Expression value: log2(FPKM)
Heatmap shows DEGs with mean FPKM > 1
[Heatmap columns: Non-Responders, Responders]
Tested Parameters
• Min Pruning: paths with fewer supporting k-mers than the specified threshold are pruned from the graph (default 2).
• Min Base Quality Score: minimum base quality score for a base to be considered for calling (default 10).
• Min Reads Per Alignment Start: minimum number of reads sharing the same alignment start for each genomic location in an active region (default 10).
• Heterozygosity: the probability that the sample will differ from the reference (default 0.001).
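A sweep over these four parameters could be organized as a grid of command lines, one per combination. The Python sketch below only builds the commands and does not run them. Several things in it are assumptions rather than details from the talk: the flag spellings follow GATK 3.x conventions and should be checked against the installed version, the jar and file names are hypothetical, and the non-default values are placeholders, not the values actually tested.

```python
from itertools import product

# First value in each list is the default quoted on the slide;
# the second value is a hypothetical alternative for illustration.
# Flag spellings are assumed GATK 3.x style; verify before use.
GRID = {
    "--minPruning": [2, 1],
    "--min_base_quality_score": [10, 20],
    "--minReadsPerAlignmentStart": [10, 5],
    "--heterozygosity": [0.001, 0.01],
}


def build_commands(reference, bam, out_prefix):
    """Return one HaplotypeCaller command (as an argument list) per grid point."""
    flags = list(GRID)
    commands = []
    for i, values in enumerate(product(*(GRID[f] for f in flags))):
        cmd = [
            "java", "-jar", "GenomeAnalysisTK.jar",  # hypothetical jar path
            "-T", "HaplotypeCaller",
            "-R", reference,
            "-I", bam,
            "-o", f"{out_prefix}.{i}.vcf",
        ]
        for flag, value in zip(flags, values):
            cmd += [flag, str(value)]
        commands.append(cmd)
    return commands


# Hypothetical input names for illustration.
commands = build_commands("ref.fa", "cell.bam", "calls")
```

Each argument list can be passed to `subprocess.run`, and writing one VCF per grid index keeps the call sets from different parameter combinations separate for later comparison.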