DNA methylation age of human tissues and cell types. Genome Biol. 2013 14(10):R115 PMID: 24138928

Dec 19, 2015

Page 1

DNA methylation age of human tissues and cell types.

Genome Biol. 2013 14(10):R115 PMID: 24138928

Page 2

Statistical goal and challenge

• Goal: Build an age prediction method based on tens of thousands of variables
– dependent variable y = transformed version of chronological age (in years)
– covariates = CpGs
– Approach: Penalized regression (elastic net)
• Challenge: how to combine multiple training data sets generated by different labs, etc.
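
The slides do not say which transformation of chronological age is used as the dependent variable. As an illustration only, the sketch below assumes a log-linear form: logarithmic up to an assumed adult age of 20 years and linear afterwards. The names `transform_age`, `inverse_transform_age`, and `ADULT_AGE` are hypothetical, and the cutoff of 20 is an assumption rather than something stated here.

```python
import math

ADULT_AGE = 20.0  # assumed boundary between the log and linear regimes


def transform_age(age, adult_age=ADULT_AGE):
    """Map chronological age in years to the regression target y."""
    if age <= adult_age:
        return math.log(age + 1) - math.log(adult_age + 1)
    return (age - adult_age) / (adult_age + 1)


def inverse_transform_age(y, adult_age=ADULT_AGE):
    """Map a predicted y back to an age in years."""
    if y <= 0:
        return (adult_age + 1) * math.exp(y) - 1
    return (adult_age + 1) * y + adult_age


# Round trip for a few ages, including a newborn and a centenarian.
for age in (0, 3.1, 20, 43, 100):
    y = transform_age(age)
    print(age, round(y, 3), round(inverse_transform_age(y), 3))
```

Under this assumed form the two branches meet at y = 0 for age 20, so the mapping is continuous and invertible, and a year of childhood moves y more than a year of adulthood.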

Page 3

Training data sets

Data label (color)    DNA origin              Platform  Data use  n (prop. female) Median age (range)  Citation
1 (turquoise)         Blood WB                27K       Training  715 (0.38)       33 (16,88)          Horvath 2012
2 (blue)              Blood WB                450K      Training  94 (0.28)        29 (18,65)          Horvath 2012
3 (brown)             Blood WB                450K      Training  656 (0.52)       65 (19,100)         Hannum 2012
4 (blue2)             Blood PBMC              450K      Training  72 (0)           3.1 (1,16)          Alisch 2012
5 (green)             Blood PBMC              450K      Training  48 (0.52)        15 (3.5,76)         Harris et al 2012
6 (red)               Blood Cord              27K       Training  216 (0.51)       0 (0,0)             Adkins 2011
7 (black)             Brain CRBLM             27K       Training  168 (NaN)        45 (20,70)          Liu 2013
8 (pink)              Brain CRBLM             27K       Training  114 (0.3)        44 (16,96)          Gibbs 2010
9 (magenta)           Brain FCTX              27K       Training  133 (0.32)       43 (16,100)         Gibbs 2010
10 (purple)           Brain PONS              27K       Training  125 (0.3)        43 (15,100)         Gibbs 2010
11 (greenyellow)      Brain Prefr.CTX         27K       Training  108 (0.48)       19 (-0.5,84)        Numata 2012
12 (tan)              Brain Various Cells     450K      Training  145 (0.48)       35 (13,79)          Guintivano 2013
13 (salmon)           Brain TCTX              27K       Training  127 (0.33)       44 (15,100)         Gibbs 2010
14 (cyan)             Breast NL               27K       Training  23 (1)           46 (19,75)          Zhuang 2012
15 (midnightblue)     Buccal                  27K       Training  109 (0.61)       15 (15,15)          Essex 2011
16 (indianred)        Buccal                  27K       Training  8 (0.75)         43 (16,68)          Rakyan 2010
17 (grey60)           Buccal                  450K      Training  53 (0.45)        0 (0,1.5)           Martino 2013
18 (green2)           Cartilage Knee          27K       Training  41 (0.49)        66 (40,79)          Fernández-Tajes 2013
19 (gold)             Colon                   27K       Training  35 (0.63)        74 (43,90)          TCGA, COAD
20 (royalblue)        Colon                   450K      Training  24 (0.54)        14 (3.5,19)         Kellermayer 2013
21 (darkred)          Dermal fibroblast       27K       Training  14 (1)           20 (6,73)           Koch 2011
22 (darkgreen)        Epidermis               27K       Training  10 (0)           50 (26,71)          Gronniger 2010
23 (darkturquoise)    Gastric                 27K       Training  52 (NaN)         68 (25,88)          Zouridis 2012
24 (darkgrey)         Head+Neck               450K      Training  50 (0.24)        62 (26,87)          TCGA, HNSC
25 (orange)           Heart                   27K       Training  17 (0.41)        55 (16,68)          Haas 2013
26 (darkorange)       Kidney                  450K      Training  43 (0.3)         66 (31,83)          TCGA, KIRP
27 (lightsteelblue2)  Kidney                  450K      Training  160 (0.34)       63 (38,90)          TCGA, KIRC
28 (skyblue)          Liver                   27K       Training  57 (0.14)        51 (20,79)          Shen 2012
29 (saddlebrown)      Lung NL Adj             27K       Training  27 (0.15)        69 (52,83)          TCGA, LUSC
30 (steelblue)        Lung NL Adj             27K       Training  24 (0.58)        66 (51,77)          TCGA, LUAD
31 (paleturquoise)    Lung NL Adj             450K      Training  40 (0.32)        73 (40,85)          TCGA, LUSC
32 (violet)           MSC (bonemarrow)        27K       Training  16 (0.38)        52 (21,85)          Bork 2010
33 (darkolivegreen)   Placenta                27K       Training  28 (1)           0 (0,0)             Gordon 2012
34 (darkmagenta)      Prostate NL             27K       Training  69 (0)           61 (44,73)          Kobayashi 2011
35 (sienna3)          Prostate NL             450K      Training  44 (0)           63 (44,72)          TCGA, PRAD
36 (yellowgreen)      Saliva                  27K       Training  131 (0.015)      29 (21,55)          Liu 2010
37 (skyblue3)         Saliva                  27K       Training  69 (0)           35 (21,55)          Bockland 2011
38 (plum1)            Stomach                 27K       Training  41 (0.51)        69 (43,87)          TCGA, STAD
39 (orangered4)       Thyroid                 450K      Training  25 (0.8)         40 (18,76)          TCGA, THCA

Page 4

Test data sets

Data label (color)    DNA origin              Platform  Data use  n (prop. female) Median age (range)  Citation
40 (mediumpurple3)    Blood WB                27K       Test      191 (0.51)       43 (24,74)          Teschendorff 2010
41 (lightsteelblue1)  Blood WB                27K       Test      93 (1)           63 (49,74)          Rakyan 2010
42 (darkcyan)         Blood WB                27K       Test      262 (1)          67 (49,91)          Song 2010
43 (orange)           Blood WB                27K       Test      269 (1)          64 (52,78)          Teschendorff 2010, Song 2009
44 (green)            Blood WB                450K      Test      689 (0.71)       54 (17,70)          Liu 2013
45 (darkorange2)      Blood PBMC              27K       Test      386 (0)          9.3 (3.6,18)        Alisch 2012
46 (brown4)           Blood PBMC              450K      Test      38 (0.74)        44 (0,100)          Heyn 2012
47 (bisque4)          Blood PBMC              27K       Test      92 (NaN)         33 (24,45)          Lam 2012
48 (darkslateblue)    Blood Cord              27K       Test      48 (0.021)       0 (0,0)             Turan
49 (plum2)            Blood Cord              27K       Test      84 (0.52)        0 (0,0.75)          Khulan 2012
50 (thistle2)         Blood Cord              27K       Test      53 (0.45)        0 (0,0)             Gordon 2012
51 (darkblue)         Blood CD4 Tcells        450K      Test      48 (NaN)         0.5 (0,1)           Martino 2012
52 (salmon4)          Blood CD4+CD14          27K       Test      50 (0.68)        34 (16,69)          Rakyan 2010
53 (palevioletred3)   Blood Cell Types        450K      Test      16 (0.62)        32 (17,60)          Heyn 2013
54 (brown3)           Brain Cerebellar        27K       Test      20 (0)           22 (1,60)           Ginsberg 2012
55 (maroon)           Brain Occipital Cortex  27K       Test      16 (0)           25 (1,60)           Ginsberg 2012
56 (lightpink4)       Breast NL Adj           450K      Test      81 (1)           55 (28,90)          TCGA, BRCA
57 (lavenderblush3)   Breast NL Adj           27K       Test      27 (1)           51 (35,88)          TCGA, BRCA
58 (deepskyblue)      Buccal                  450K      Test      51 (0.45)        0 (0,1.5)           Martino 2013
59 (darkseagreen4)    Colon                   450K      Test      38 (0.45)        72 (40,90)          TCGA, COAD
60 (coral1)           Fat Adip                27K       Test      10 (0.4)         75 (73,78)          Ribel-Madsen 2012
61 (brown2)           Heart                   27K       Test      6 (0)            60 (55,71)          Pai 2011
62 (coral2)           Kidney                  27K       Test      198 (0.35)       60 (33,86)          TCGA, KIRC
63 (mediumorchid)     Liver                   450K      Test      37 (0.35)        68 (20,81)          TCGA, LIHC
64 (skyblue2)         Lung NL Adj             450K      Test      26 (0.46)        66 (42,86)          TCGA, LUAD
65 (yellow4)          Muscle                  27K       Test      22 (0.55)        66 (53,78)          Ribel-Madsen 2012
66 (skyblue1)         Muscle                  27K       Test      44 (0)           25 (25,25)          Jacobsen 2012
67 (plum)             Placenta                450K      Test      40 (NaN)         0 (0,0)             Blair 2013
68 (orangered3)       Saliva                  27K       Test      52 (0.92)        27 (21,55)          Liu 2010
69 (mediumpurple2)    Uterine Cervix          27K       Test      152 (1)          25 (19,55)          Zhuang 2012
70 (lightsteelblue)   Uterine Endomet         450K      Test      28 (1)           62 (35,90)          TCGA, UCEG
71 (lightcoral)       Various Tissues         27K       Test      44 (0.41)        71 (0,83)           Myers 2012
72 (indianred4)       Chimp+Human Tissues     27K       Other     35 (0.4)         47 (9,81)           Pai 2011
73 (firebrick4)       Ape WB                  450K      Other     32 (0.62)        22 (9,43)           Hernando-Herraez 2013
74 (darkolivegreen4)  Sperm                   27K       Other     19 (1)           0 (0,0)             Pacheco 2011
75 (brown2)           Sperm                   450K      Other     26 (0)           0 (0,0)             Krausz 2012
76 (blue2)            Vasc.Endoth(Umbilical)  27K       Other     42 (0.43)        0 (0,0)             Gordon 2012

Data label  DNA origin                              Platform  Data use  n (prop. female) Median age (range)  Citation
77          Stem cells+Somatic Cells                27K       Other     271 (NA)         NA                  Nazor 2012
78          Stem cells+Somatic Cells                450K      Other     153 (0.63)       NA                  Nazor 2012
79          Reprogrammed mesenchymal stromal cells  450K      Other     24 (NA)          NA                  Shao 2012
80          hESC and normal primary tissue          27K       Other     34 (NA)          NA                  Calvanese 2012
81          hESC                                    27K       Other     6 (NA)           NA                  Ramos-Mejía 2012
82          Blood Cell Types                        450K      Other     60 (0)           NA                  Reinius 2012

Page 5

Construction of the epigenetic clock

• Assembled a large DNA methylation data set by combining publicly available individual data sets measured on the Illumina 27K or Illumina 450K array platform.
• Training + test data involved n = 7844 non-cancer samples from 82 individual data sets, which assess DNA methylation levels in 51 different tissues and cell types.
• Although many data sets were collected for studying certain diseases, they largely involved healthy tissues.
– In particular, cancer tissues were excluded.

Page 6

Illumina data sets

• The first 39 data sets were used to construct ("train") the age predictor.
• Data sets 40-71 were used to test (validate) the age predictor.
• Data sets 72-82 served other purposes, e.g. to estimate the DNAm age of embryonic stem and iPS cells.
• Training data were chosen i) to represent a wide spectrum of tissues/cell types, ii) to involve samples whose mean age (43 years) is similar to that in the test data, and iii) to involve a high proportion of samples (37%) measured on the Illumina 450K platform, since many ongoing studies use this recent Illumina platform.
• Only studied 21369 CpGs (measured with the Infinium type II assay) which were present on both Illumina platforms (Infinium 450K and 27K) and had fewer than 10 missing values across the data sets.
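
The probe filter in the last bullet can be sketched as a set intersection followed by a missing-value cutoff. This is a minimal illustration, not the author's code: the function name `shared_complete_probes`, the probe IDs, and the counts are made up, and it assumes "fewer than 10 missing values" refers to a total count per CpG over all data sets.

```python
def shared_complete_probes(probes_27k, probes_450k, missing_counts, max_missing=10):
    """Return probes found on both arrays with fewer than max_missing missing values."""
    shared = set(probes_27k) & set(probes_450k)
    # Probes absent from missing_counts are treated as having no missing values.
    return sorted(p for p in shared if missing_counts.get(p, 0) < max_missing)


# Toy example with made-up probe IDs and missing-value counts.
probes_27k = ["probe_a", "probe_b", "probe_c"]
probes_450k = ["probe_b", "probe_c", "probe_d"]
missing_counts = {"probe_b": 0, "probe_c": 12}

kept = shared_complete_probes(probes_27k, probes_450k, missing_counts)
print(kept)
```

Here `probe_a` and `probe_d` are dropped because each is on only one platform, and `probe_c` is dropped for having too many missing values.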

Page 7

Age predictor

• To ensure an unbiased validation in the test data, only used the training data to define the age predictor.
• A transformed version of chronological age was regressed on the CpGs using a penalized regression model (elastic net).
• The elastic net regression model automatically selected 353 CpGs.
• I refer to the 353 CpGs as (epigenetic) clock CpGs since their weighted average (formed by the regression coefficients) amounts to an epigenetic clock.
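
To make the idea of automatic CpG selection concrete, here is a small elastic net fitted by coordinate descent in plain Python. It is a teaching sketch on simulated data, not the software behind the clock: the penalty settings `lam` and `alpha`, the function names, and the toy data are all assumptions, and `y` merely stands in for the transformed age.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso part of the update)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Minimise (1/2n)*RSS + lam*(alpha*L1 + (1-alpha)/2*L2) by coordinate descent."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]  # centred columns
    resid = [yi - y_mean for yi in y]  # residuals while all coefficients are zero
    coefs = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant column carries no information
            # Covariance of column j with the residual that excludes column j.
            rho = sum(col[i] * resid[i] for i in range(n)) / n + z * coefs[j]
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            delta = new - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * col[i]
                coefs[j] = new
    intercept = y_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs


def predict(intercept, coefs, betas):
    """Weighted sum of methylation values plus intercept."""
    return intercept + sum(c * b for c, b in zip(coefs, betas))


# Simulated data: 8 CpG columns, only the first two influence y.
random.seed(0)
n_samples, n_cpgs = 200, 8
X = [[random.random() for _ in range(n_cpgs)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0, 0.05) for row in X]

intercept, coefs = fit_elastic_net(X, y)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print("selected CpG columns:", selected)
print("prediction for first sample:", predict(intercept, coefs, X[0]))
```

The L1 part of the penalty sets the coefficients of uninformative columns to exactly zero, which is what "automatically selected" means in the bullets above; the prediction is then the intercept plus the weighted sum of the remaining CpGs.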

Page 8

Accuracy across tissues and cell types (training)

Page 9

Accuracy across test data

Page 10

Accuracy in brain tissue

Page 11

Results sent to me via email

Blood data from Marco Boks, Jan 2014

Page 12

Excerpts from emails

Epigenetic clock applied to large cohort studies

Median error is less than 3.5 years.
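
The slide does not define "median error". A common reading, assumed here, is the median absolute difference between DNAm age and chronological age. The sketch below uses made-up ages purely to show the calculation; `median_abs_error` is a hypothetical helper name.

```python
from statistics import median


def median_abs_error(dnam_ages, chrono_ages):
    """Median of |DNAm age - chronological age| over all samples."""
    return median(abs(d - c) for d, c in zip(dnam_ages, chrono_ages))


# Made-up values for five samples (years).
dnam = [31.0, 47.5, 62.0, 18.2, 70.4]
chrono = [33.0, 45.0, 65.0, 18.0, 68.0]

print(round(median_abs_error(dnam, chrono), 2))
```

Because it is a median, a few badly predicted samples do not move this statistic much, which is worth keeping in mind when comparing it with a mean error or a correlation.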

Page 13

Aging clock applied to urine

• This figure was created by bioinformatician Wei Guo at Zymo Research.

Page 14

Factors influencing accuracy: standard deviation of age, tissue

Page 15

Using the clock for measuring the age of different parts of the body

Page 16

The clock works in the genus Pan: common chimpanzees + bonobos

Page 17

ES cells and iPS cells are perfectly young

Page 18

Heritability (based on twin studies) of age acceleration is 40% in older subjects and 100% in newborns

Rows correspond to 2 different twin data sets
Red dots = monozygotic twin pair
Black dots = dizygotic twin pair
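
The slide does not state how heritability was estimated from the twin pairs. One standard estimator for this kind of data is Falconer's formula, h² = 2(r_MZ − r_DZ), where r is the within-pair correlation of the trait. The sketch below assumes that formula and treats the trait as age acceleration (taken here to mean DNAm age minus chronological age); the function names and the toy pairs are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def falconer_heritability(mz_pairs, dz_pairs):
    """Falconer's estimate: twice the gap between MZ and DZ twin correlations."""
    r_mz = pearson([a for a, _ in mz_pairs], [b for _, b in mz_pairs])
    r_dz = pearson([a for a, _ in dz_pairs], [b for _, b in dz_pairs])
    return 2 * (r_mz - r_dz)


# Made-up age acceleration values (years) for each twin in a pair.
mz_pairs = [(1.0, 1.2), (-2.0, -1.5), (0.5, 0.1), (3.0, 2.6), (-1.0, -0.8)]
dz_pairs = [(1.0, -0.3), (-2.0, -0.9), (0.5, 1.4), (3.0, 0.8), (-1.0, 0.2)]

print(round(falconer_heritability(mz_pairs, dz_pairs), 2))
```

With small samples this estimate is noisy and can fall outside the range 0 to 1, so values such as 100% should be read together with the number of twin pairs behind them.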

Page 19

Conclusions

• Most studies that involved telomere length and other biomarkers can be revisited.
• User-friendly software can be found on my webpage.
– I recommend the online age calculator since it outputs a host of array quality statistics that can be used to identify samples where the age prediction may not be accurate.
• Data get deleted right after you upload them.
– Don't pre-process data too much. Don't remove batch effects, etc. Raw beta values will be fine.
• I am always happy to collaborate.