Computational analysis of the ENCODE datasets and other related epigenetic explorations
Ved Topkar
Harvard College class of 2016
Gunawardena Lab, Harvard Medical School
Department of Systems Biology
13 August 2013
Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013
vcp.med.harvard.edu/papers/poster-ved-topkar-PRISE.pdf
Introduction
Presentation Goals
FULL understanding of discussed material
Ask questions along the way!
Outline
1 Molecular biology in a jiffy
2 A case study
Hypothesis formulation
Analyzing data
3 More examples
Molecular Biology Essentials
The Cell
The Central Dogma
Transcriptional Regulation
Transcriptional Access
Background
Epigenetics and Gene Expression
More than just the base-pair sequence of DNA matters for gene expression
The Question
Analyze the ENCODE dataset
ENCODE (Overview)
National Human Genome Research Institute: Encyclopedia of DNA Elements (ENCODE)
Nearly 600 collaborating labs, following the Human Genome Project (HGP)
The Data Set
Data Types
Raw signals
Raw signal peak calling outputs (e.g. PeakSeq results)
Relatively coarse-grained peak data
A simple example: TFBS
The Game Plan
Can we reduce transcription factor binding landscapes into categories?
Scan across genome, looking for promoters
Bin promoters appropriately
Score binding at each promoter
Clustering analysis
RefSeq
Overview
Curated database of genes
New versions released as frequently as Firefox
Includes pseudogenes, haplotype variations, and predicted genes
Defining Promoter?
Only upstream from the transcription start site (TSS)?
Incredibly far regulatory regions?
Intronic regulation?
Post-termination regulatory elements?
1000 bp upstream of the TSS
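The 1000 bp upstream choice can be sketched in code. This is a minimal illustration, not the code used in the talk: the `Gene` record and `promoter_window` function are hypothetical names, and the coordinates are assumed to be 0-based and half-open (start inclusive, end exclusive). The main point is that "upstream" depends on the strand: on the forward strand it lies before the transcript start, on the reverse strand it lies after the transcript end.

```python
from typing import NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (assumed 0-based, half-open coordinates)."""
    name: str
    chrom: str
    strand: str    # "+" (forward) or "-" (reverse)
    tx_start: int  # leftmost transcript coordinate, inclusive
    tx_end: int    # rightmost transcript coordinate, exclusive


def promoter_window(gene: Gene, upstream: int = 1000) -> Tuple[str, int, int]:
    """Return (chrom, start, end) covering `upstream` bp before the TSS."""
    if gene.strand == "+":
        # Forward strand: the TSS is at tx_start, upstream is to the left.
        tss = gene.tx_start
        return gene.chrom, max(0, tss - upstream), tss
    if gene.strand == "-":
        # Reverse strand: transcription starts at the right end,
        # so upstream is to the right of tx_end.
        tss = gene.tx_end
        return gene.chrom, tss, tss + upstream
    raise ValueError(f"unknown strand: {gene.strand!r}")


forward = Gene("geneA", "chr1", "+", 5000, 8000)
reverse = Gene("geneB", "chr1", "-", 5000, 8000)
print(promoter_window(forward))  # window to the left of the transcript
print(promoter_window(reverse))  # window to the right of the transcript
```

Windows near the start of a chromosome are clipped at 0, so they can be shorter than 1000 bp. The sketch does not clip at the chromosome end, which would need chromosome lengths.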
Binning
How do we quantitatively analyze promoter presence?
Break promoter regions into bins for a finer metric?
Do we give weights to bins as a function of their position?
Single, unweighted 1000 bp bin of counts
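A single, unweighted bin amounts to counting, for each promoter and each transcription factor, how many called peaks overlap the promoter window. The sketch below shows that idea in its simplest form. The function names and the tuple layouts for promoters and peaks are assumptions made for illustration, with half-open intervals, and the transcription factor names are only example labels.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def score_promoters(promoters, peaks):
    """Count peaks per transcription factor in each promoter window.

    promoters: dict mapping promoter name -> (chrom, start, end)
    peaks:     iterable of (tf_name, chrom, start, end)
    Returns a dict mapping promoter name -> {tf_name: count}.
    """
    scores = {name: defaultdict(int) for name in promoters}
    for tf, chrom, p_start, p_end in peaks:
        # Naive all-against-all comparison: simple, but slow on genome-scale data.
        for name, (g_chrom, g_start, g_end) in promoters.items():
            if chrom == g_chrom and overlaps(p_start, p_end, g_start, g_end):
                scores[name][tf] += 1  # every overlapping peak counts once
    return {name: dict(counts) for name, counts in scores.items()}


promoters = {"geneA": ("chr1", 4000, 5000), "geneB": ("chr2", 100, 1100)}
peaks = [
    ("CTCF", "chr1", 4900, 5100),
    ("CTCF", "chr1", 4000, 4050),
    ("MYC", "chr1", 5000, 5200),  # starts exactly at the window end: no overlap
    ("MYC", "chr2", 0, 101),
]
print(score_promoters(promoters, peaks))
```

Splitting each window into several bins, or weighting bins by distance from the TSS, would turn each count into a vector. The single bin keeps one number per promoter and factor.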
A simple example: Survey of Promoters
Forward vs. Backward Strand
TF Binding Frequency
Histogram of promoter binding
Promoter/TFBS intersections
Computational Efficiency
This was an exercise in program optimization
The original algorithm took about 5 days; the optimized, parallelized algorithm took just a few hours
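The slides do not say which optimizations were applied. One common way to speed up interval intersection, shown here purely as an assumed illustration, is to sort the promoter windows once per chromosome and use binary search to find the few windows a peak could touch, instead of comparing every peak with every promoter. The names `build_index` and `count_overlaps` and the tuple layouts are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(promoters):
    """Group promoter windows by chromosome and sort them by start coordinate.

    promoters: dict mapping promoter name -> (chrom, start, end), half-open.
    """
    by_chrom = defaultdict(list)
    for name, (chrom, start, end) in promoters.items():
        by_chrom[chrom].append((start, end, name))
    index = {}
    for chrom, rows in by_chrom.items():
        rows.sort()
        starts = [row[0] for row in rows]
        max_len = max(end - start for start, end, _ in rows)
        index[chrom] = (starts, rows, max_len)
    return index


def count_overlaps(index, peaks):
    """Return {(promoter name, tf_name): count} using binary search per peak."""
    counts = defaultdict(int)
    for tf, chrom, p_start, p_end in peaks:
        if chrom not in index:
            continue
        starts, rows, max_len = index[chrom]
        # Any overlapping window must start after p_start - max_len
        # and before p_end, so only that slice needs checking.
        lo = bisect_right(starts, p_start - max_len)
        hi = bisect_left(starts, p_end)
        for start, end, name in rows[lo:hi]:
            if end > p_start:
                counts[(name, tf)] += 1
    return dict(counts)


promoters = {
    "a": ("chr1", 0, 300),
    "b": ("chr1", 4000, 5000),
    "c": ("chr1", 4500, 5500),
    "d": ("chr2", 100, 1100),
}
peaks = [("CTCF", "chr1", 4900, 5100), ("MYC", "chr1", 250, 260)]
print(count_overlaps(build_index(promoters), peaks))
```

Because each chromosome is indexed independently, the per-chromosome work could also be handed to separate worker processes (for example with Python's `multiprocessing.Pool`), which is one plausible reading of "parallelized" here.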
Clustering
The unsupervised grouping of information such that each group contains similar elements that are dissimilar from the elements in other groups
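As a concrete instance of this definition, the sketch below runs a small k-means clustering on made-up binding-count vectors (one row per promoter, one column per transcription factor). The slides do not name the clustering algorithm that was used, so k-means is an assumption chosen for brevity; the data, the function names, and the farthest-point seeding are all illustrative.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Cluster `points` into `k` groups; returns (labels, centers)."""
    # Deterministic farthest-point seeding: start from the first point,
    # then repeatedly add the point farthest from the centers chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Made-up counts for three transcription factors at six promoters.
binding = [
    [5, 0, 0], [4, 1, 0], [6, 0, 1],  # mostly bound by the first factor
    [0, 5, 4], [1, 4, 5], [0, 6, 5],  # mostly bound by the other two
]
labels, centers = kmeans(binding, k=2)
print(labels)
```

Promoters with similar binding profiles end up with the same label. The number of clusters `k` has to be chosen in advance, which is one reason hierarchical clustering is often used as an alternative for this kind of data.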