Introduction Biological Background Method Results Moving Forward
Modeling gene expression using five histone modifications
Fifth Annual PRIMES MIT Conference
Lalita Devadas
Mentor: Angela Yen
May 17, 2015
Lalita Devadas — Modeling gene expression using five histone modifications 1/31
Outline
1 Biological Background
2 Method
3 Results
4 Moving Forward
1 Biological Background
Gene Expression: Central dogma of molecular biology
Gene Expression: Relevance
Important to understanding biological activity
Crucial to advances in medicine
Detection, prevention, and treatment of disease
Gene Expression: Regulation
Genetic
Sequences of nucleotides (ACTG)
Epigenetic
Changes to environment surrounding DNA
Epigenetics: Histone modifications
Chemical changes to histone protein core or protruding tail
Epigenomes: Roadmap Project
2 Method
Data Pipeline: Objective
Density of Histone Modifications → [Prediction] → Level of Gene Expression
Data Pipeline: Overview
All Histone Data → [Best-binning] → Relevant Histone Data
Relevant Histone Data → [Classification] → “Off” genes (Unexpressed) or “On” genes (Expressed)
“Off” genes → [Predict at 0] → Predicted Expression
“On” genes → [Regression] → Predicted Expression
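The two-stage flow above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: `predict_expression`, `toy_classifier`, and `toy_regressor` are hypothetical names, and the threshold and weights are made up purely to show how "off" genes are predicted at 0 while "on" genes go through regression.

```python
def predict_expression(densities, is_expressed, regress):
    """Two-stage prediction for a single gene.

    densities:    dict mapping histone mark name to its best-bin density
    is_expressed: callable returning True for "on" genes (classification step)
    regress:      callable returning an expression level (regression step)
    """
    if not is_expressed(densities):
        return 0.0  # "off" genes are predicted at 0
    return regress(densities)


# Hypothetical stand-ins for trained models; threshold and weights are invented.
def toy_classifier(densities):
    return densities["H3K36me3"] > 1.0


def toy_regressor(densities):
    return 2.0 * densities["H3K36me3"] + 0.5 * densities["H3K4me3"]


off_gene = {"H3K36me3": 0.2, "H3K4me3": 0.1}
on_gene = {"H3K36me3": 3.0, "H3K4me3": 2.0}

print(predict_expression(off_gene, toy_classifier, toy_regressor))  # 0.0
print(predict_expression(on_gene, toy_classifier, toy_regressor))   # 7.0
```

Passing the classifier and regressor in as callables keeps the sketch independent of any particular modeling library.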
Data Pipeline: Best-bin approach
Best-bin approach: Dividing genes
Best-bin approach: Choosing best bin
For epigenome X, histone mark Y:
          bin 1   bin 2   bin 3   bin 4   ...   bin p   ...   bin 81   expression
gene a
gene b
gene c
...
p = best bin (strongest correlation with expression)
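One way to read the table above: for a given epigenome and histone mark, each bin is a column of per-gene densities, and the best bin is the column that correlates most strongly with expression. The sketch below assumes Pearson correlation and compares absolute values, so that a strongly negative correlation also counts as strong. Both choices are assumptions, since the slides do not specify the correlation measure. The function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def choose_best_bin(signal, expression):
    """signal[g][b] is the mark density for gene g in bin b.

    Returns the 0-based index of the bin whose densities have the
    largest absolute correlation with expression across genes.
    """
    n_bins = len(signal[0])
    scores = []
    for b in range(n_bins):
        column = [row[b] for row in signal]
        scores.append(abs(pearson(column, expression)))
    return max(range(n_bins), key=lambda b: scores[b])


# Toy data: 4 genes, 3 bins (the slides use 81 bins per gene).
signal = [
    [3.0, 1.0, 7.0],
    [1.0, 2.0, 7.0],
    [4.0, 3.1, 7.0],
    [1.0, 3.9, 7.0],
]
expression = [1.0, 2.0, 3.0, 4.0]

print(choose_best_bin(signal, expression))  # 1
```

In the toy data the middle bin tracks expression almost linearly, the first bin is only weakly related, and the last bin is constant, so index 1 is selected.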
Data Pipeline: Classification
Types of Models: Random Forest
Random Forest model
Returns majority vote of classification determined by a group of decision trees
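The majority-vote step can be illustrated without any modeling library. In the sketch below the three "trees" are hand-written single-threshold rules standing in for trained decision trees. Their thresholds are invented for illustration, and a real random forest would learn each tree from a bootstrap sample of the training genes.

```python
from collections import Counter


def majority_vote(trees, densities):
    """Return the class label predicted by most trees."""
    votes = [tree(densities) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees (thresholds are made up).
def tree_a(densities):
    return "on" if densities["H3K36me3"] > 1.0 else "off"


def tree_b(densities):
    return "on" if densities["H3K4me3"] > 0.5 else "off"


def tree_c(densities):
    return "off" if densities["H3K27me3"] > 2.0 else "on"


gene = {"H3K36me3": 1.4, "H3K4me3": 0.2, "H3K27me3": 0.3}
print(majority_vote([tree_a, tree_b, tree_c], gene))  # on
```

Here two of the three trees vote "on", so the gene is classified as expressed. Using an odd number of trees avoids ties in a two-class problem.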
Data Pipeline: Regression
Types of Models: Linear Model
Linear model
Fits a linear relationship between predictors (histone mark densities) and response (expression level)
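As a minimal sketch of what fitting a linear model involves, the example below performs ordinary least squares with a single predictor. This is a simplification: the presentation uses several histone marks as predictors, which would call for multiple regression. The function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data: density of one mark vs. expression level for four genes.
densities = [0.0, 1.0, 2.0, 3.0]
expression = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(densities, expression)
print(slope, intercept)  # 2.0 1.0
```

The fitted line can then predict expression for a new gene as `slope * density + intercept`.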
3 Results
Epigenomes: Roadmap Project
Data Pipeline: Objective
Density of Histone Modifications (H3K9me3, H3K4me3, H3K4me1, H3K36me3, H3K27me3) → [Prediction] → Level of Gene Expression
Results of Pipeline: Conclusions
Models created for cultured epigenomes have a much higher predictive power than those created for tissue samples
H3K36me3 is the most important histone mark used for prediction
Results of Pipeline: Graph
Specifics of Best Model: Classification Accuracy
Specifics of Best Model: Regression Accuracy
Every data point represents one gene
r² value: 0.640
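For readers who want to see how such a score is computed, the sketch below evaluates the coefficient of determination, 1 - SS_res / SS_tot, on observed versus predicted expression. The slide does not say which definition of r² was used (it could equally be the squared Pearson correlation), so this choice is an assumption, and the numbers are toy values unrelated to the reported 0.640.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values: one entry per gene.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(observed, predicted), 3))  # 0.98
```

A value of 1 means the predictions match the observations exactly, while 0 means they do no better than always predicting the mean expression.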
4 Moving Forward
Next Steps
Improve predictive power
Broaden scope of predictors and response
Further analysis of current results
Apply procedure to different data
Release code as a tool for other researchers
Acknowledgements
I would like to thank:
My mentor, Angela Yen
Prof. Manolis Kellis
Roadmap Project
PRIMES program
My family