Supporting Information
Resource base influences genome-wide DNA methylation levels in wild baboons
RRBS library construction, sequencing, and data processing
Testing for effects of resource base on DNA methylation levels
Estimating the relationship between sample size and power
Testing for cell type heterogeneity-related confounds
Defining genomic compartments to test for enrichment of differentially methylated sites
Transforming count data to normalized methylation proportions
Testing the degree to which DMRs occur more often than expected by chance
PFKP reporter constructs
Cell culture and transfection procedures
Using support vector machines to investigate the plasticity versus stability of DNA methylation
References
Figure S1. Flow chart describing data processing steps and main analyses
Figure S2. Genomic compartment annotations used in this study
Figure S3. Histone marks and genomic compartments associated with chromatin states
Figure S4. Differentially methylated sites are enriched near genes expressed in whole blood
Figure S5. Effect of resource base on DNA methylation levels at the pathway level
Figure S6. DMRs are observed in the real data set more often than expected by chance
Figure S7. RRBS enriches for putatively functional regions of the genome and recapitulates known patterns of DNA methylation across the genome
Figure S8. Power to detect differentially methylated sites increases with sample size
Figure S9. Magnitude of the effect of resource availability on DNA methylation levels in different genomic compartments
Figure S10. Cell type proportions did not significantly differ between wild-feeding and Lodge individuals
Figure S11. Enrichment of resource base-associated sites is strongest near genes expressed in whole blood, compared to genes expressed in other tissues
Table S1. Information about males that switched between resource base conditions
transfection was performed by adding 20 µL of OPTIMEM Reduced Serum Media (Gibco)
containing the following reagents: (i) 100 ng of methylated, partially methylated, or unmethylated
vector (4 replicates for each condition); (ii) 10 ng of Renilla control vector; (iii) 0.5 µL of
Lipofectamine; and (iv) 0.1 µL of the PLUS reagent (from the Lipofectamine 2000 system, Life
Technologies).
Cells were incubated for 24 hours following transfection, and subsequently assayed for
transgene luciferase expression with the Dual Luciferase Assay kit (Promega). Firefly luciferase
activity was normalized against Renilla activity to control for variation in the transfection
efficiency or total number of cells in each experimental replicate.
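The normalization described above amounts to a per-replicate ratio of firefly to Renilla signal. A minimal sketch, assuming hypothetical raw luminescence readings (the values and the function name are illustrative and are not data from this study):

```python
from statistics import mean


def normalize_luciferase(firefly, renilla):
    """Return one firefly/Renilla ratio per replicate well."""
    if len(firefly) != len(renilla):
        raise ValueError("each firefly reading needs a paired Renilla reading")
    return [f / r for f, r in zip(firefly, renilla)]


# Hypothetical readings for the 4 replicate wells of one vector condition.
firefly = [1200.0, 900.0, 1500.0, 600.0]
renilla = [400.0, 300.0, 500.0, 200.0]

ratios = normalize_luciferase(firefly, renilla)
print(ratios, mean(ratios))
```

Dividing within each well means that a well with fewer cells or poorer transfection lowers both signals together, so the ratio stays comparable across replicates.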
Using support vector machines to investigate the plasticity versus stability of DNA
methylation levels
To differentiate between the hypotheses presented in Figure 5, we used a machine
learning approach. First, we built an SVM classifier that could distinguish between individuals
that spent most or all of their lives in either a wild-feeding group or in the Lodge group (n = 61
individuals) based on DNA methylation data alone. To do so, we used the ksvm function
implemented in the R package kernlab, using a linear kernel and setting the penalty constant, C, to
100 (Karatzoglou et al. 2004; note that our results were robust across several orders of
magnitude of C). As predictive features for this model, we used the 334,840 CpG sites that were
not associated with age, sex, bisulfite conversion rate, or sample age at a nominal p-value of
0.05. Because SVMs cannot work on binomially distributed count data, we used data that had
been transformed to methylation proportions, transformed to a standard normal, and imputed to
remove missing values (see Supporting Information: Transforming count data to normalized
methylation proportions).
To assess the performance of the SVM model, we used leave-one-out cross-validation.
Specifically, we iteratively (i) removed one individual from the data set, (ii) trained the SVM on
DNA methylation data from equally sized samples of the remaining Lodge group individuals and
wild-feeding individuals (to avoid biased estimates as a result of differences in class size); and
(iii) used the resulting fitted model to predict the resource base of the originally removed test
case. We repeated this procedure 610 times, so that DNA methylation data from every
individual (n = 61) was used as the test case 10 times. We did not observe any consistent bias
in class assignments during this procedure (49.02% of all misclassification events involved data
from Lodge individuals, while 50.98% involved wild-feeding individuals).
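The resampling scheme in steps (i) to (iii) can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the toy features, class sizes, and function names are hypothetical, and a nearest-centroid rule stands in for the linear SVM (fit with ksvm in the study) so that the example needs only the Python standard library.

```python
import random


def balanced_loo_rounds(labels, n_repeats, rng):
    """Yield (test_index, train_indices) pairs.

    Every individual is held out n_repeats times. Each time, the training
    set is an equally sized random sample from each class of the
    remaining individuals.
    """
    classes = sorted(set(labels))
    for test in range(len(labels)):
        rest = {c: [i for i, y in enumerate(labels) if y == c and i != test]
                for c in classes}
        k = min(len(idx) for idx in rest.values())
        for _ in range(n_repeats):
            yield test, [i for c in classes for i in rng.sample(rest[c], k)]


def nearest_centroid_predict(x_train, y_train, x_test):
    """Placeholder classifier (NOT an SVM): pick the closest class mean."""
    best, best_dist = None, float("inf")
    for c in sorted(set(y_train)):
        rows = [x for x, y in zip(x_train, y_train) if y == c]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((u - v) ** 2 for u, v in zip(centroid, x_test))
        if dist < best_dist:
            best, best_dist = c, dist
    return best


# Hypothetical toy data: 4 "Lodge" and 7 "wild" individuals, 2 features each.
labels = ["Lodge"] * 4 + ["wild"] * 7
features = ([[1.0 + 0.1 * i, 1.0] for i in range(4)]
            + [[-1.0 - 0.1 * i, -1.0] for i in range(7)])

rng = random.Random(0)
rounds = list(balanced_loo_rounds(labels, n_repeats=10, rng=rng))
correct = 0
for test, train in rounds:
    pred = nearest_centroid_predict([features[i] for i in train],
                                    [labels[i] for i in train],
                                    features[test])
    correct += pred == labels[test]
print(len(rounds), correct / len(rounds))
```

With 11 toy individuals and 10 repeats the loop runs 110 times, mirroring the 61 × 10 = 610 rounds described above. Subsampling the larger class in every round keeps the two classes the same size in each training set.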
Finally, to understand whether individuals that switched between resource bases more
closely resembled their pre-switch or post-switch conspecifics, we repeated the same
procedures described above, but used DNA methylation data from the 8 switching individuals as
the test set. In this case, we used a fitted model trained on data (i.e., the n = 334,840 CpG sites
not associated with any covariates) from the full set of Lodge individuals (minus switching
individuals sampled in the Lodge group), as well as an equally sized random sample of wild-feeding
individuals (minus switching individuals sampled in a wild-feeding group). We used this fitted
model to predict the resource base of the 8 individuals that dispersed between groups, using
their DNA methylation data alone. We repeated this procedure 60 times to ensure that
subsampling from the wild-feeding individuals did not bias our model predictions.
References
1. Zou J, Lippert C, Heckerman D, Aryee M, Listgarten J. Epigenome-wide association studies without the need for cell-type composition. Nat Methods. 2014;11: 309–11. doi:10.1038/nmeth.2815
2. Lam LL, Emberly E, Fraser HB, Neumann SM, Chen E, Miller GE, et al. Factors underlying variable DNA methylation in a human community cohort. Proc Natl Acad Sci. 2012;109: 17253–60. doi:10.1073/pnas.1121249109
3. Lea A, Tung J, Zhou X. A flexible, efficient binomial mixed model for identifying differential DNA methylation in bisulfite sequencing data. bioRxiv. 2015; doi:10.1101/019562
4. Altmann J, Altmann S, Hausfater G. Physical maturation and age estimates of yellow baboons, Papio cynocephalus, in Amboseli National Park, Kenya. Am J Primatol. 1981;1: 389–399. doi:10.1002/ajp.1350010404
5. Alberts SC, Buchan JC, Altmann J. Sexual selection in wild baboons: from mating opportunities to paternity success. Anim Behav. 2006;72: 1177–1196. doi:10.1016/j.anbehav.2006.05.001
6. Buchan JC, Alberts SC, Silk JB, Altmann J. True paternal care in a multi-male primate society. Nature. 2003;425: 179–81. doi:10.1038/nature01866
7. Wang J. COANCESTRY: a program for simulating, estimating and analysing relatedness and inbreeding coefficients. Mol Ecol Resour. 2011;11: 141–5. doi:10.1111/j.1755-0998.2010.02885.x
8. Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, et al. Estimating dataset size requirements for classifying DNA microarray data. J Comput Biol. 2003;10: 119–142. doi:10.1089/106652703321825928
9. Reinius LE, Acevedo N, Joerink M, Pershagen G, Dahlén S-E, Greco D, et al. Differential DNA methylation in purified human blood cells: implications for cell lineage and studies on disease susceptibility. PLoS One. 2012;7: e41361. doi:10.1371/journal.pone.0041361
10. Jaffe AE. FlowSorted.Blood.450k: Illumina HumanMethylation data on sorted blood cell populations. R package version 1.5.1. 2015.
11. Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15: R31. doi:10.1186/gb-2014-15-2-r31
13. Shulha HP, Cheung I, Guo Y, Akbarian S, Weng Z. Coordinated Cell Type–Specific Epigenetic Remodeling in Prefrontal Cortex Begins before Birth and Continues into Early Adulthood. PLoS Genet. 2013;9: e1003433. doi:10.1371/journal.pgen.1003433
14. Deng J, Shoemaker R, Xie B. Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming. Nat Biotechnol. 2009;27: 353–360. doi:10.1038/nbt.1530
15. Karolchik D, Barber GP, Casper J, Clawson H, Cline MS, Diekhans M, et al. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 2014;42: 764–770. doi:10.1093/nar/gkt1168
16. Hernando-Herraez I, Prado-Martinez J, Garg P, Fernandez-Callejo M, Heyn H, Hvilsom C, et al. Dynamics of DNA methylation in recent human and great ape evolution. PLoS Genet. 2013;9: e1003763. doi:10.1371/journal.pgen.1003763
17. Gu H, Smith ZD, Bock C, Boyle P, Gnirke A, Meissner A. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat Protoc. 2011;6: 468–81. doi:10.1038/nprot.2010.190
18. Rönn T, Volkov P, Davegårdh C, Dayeh T, Hall E, Olsson AH, et al. A six months exercise intervention influences the genome-wide DNA methylation pattern in human adipose tissue. PLoS Genet. 2013;9: e1003572. doi:10.1371/journal.pgen.1003572
19. Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis C, Doyle F, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489: 57–74. doi:10.1038/nature11247
20. Hinrichs A, Karolchik D, Baertsch R, Barber G, Bejerano G, Clawson H. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34: D590–D598. doi:10.1093/nar/gkj144
21. Goeman JJ, Van de Geer S, De Kort F, van Houwelingen HC. A global test for groups of genes: Testing association with a clinical outcome. Bioinformatics. 2004;20: 93–99. doi:10.1093/bioinformatics/btg382
22. Karatzoglou A, Smola A, Hornik K, Zeileis A. kernlab -- An S4 Package for Kernel Methods in R. J Stat Softw. 2004;11: 1–20.
23. Hastie T, Tibshirani R, Narasimhan B, Chu G. Impute: imputation for microarray data. R package version 1.42.0. 2015.
24. Jaffe AE, Murakami P, Lee H, Leek JT, Fallin MD, Feinberg AP, et al. Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. Int J Epidemiol. 2012;41: 200–209. doi:10.1093/ije/dyr238
25. Lister R, Pelizzola M, Dowen R, Hawkins R. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;
26. Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, et al. Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011;43: 768–75. doi:10.1038/ng.865
27. Tung J, Zhou X, Alberts SC, Stephens M, Gilad Y. The genetic architecture of gene expression levels in wild baboons. eLife. 2015;4: 1–22. doi:10.7554/eLife.04729
28. Peng X, Thierry-Mieg J, Thierry-Mieg D, Nishida A, Pipes L, Bozinoski M, et al. Tissue-specific transcriptome sequencing analysis expands the non-human primate reference transcriptome resource (NHPRTR). Nucleic Acids Res. 2014;43: D737–D742. doi:10.1093/nar/gku1110
29. Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 2012;491: 119–124. doi:10.1038/nature11582
30. Franke A, McGovern DPB, Barrett JC, Wang K, Radford-Smith GL, Ahmad T, et al. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn’s disease susceptibility loci. Nat Genet. 2010;42: 1118–1125. doi:10.1038/ng.717
31. Estrada K, Krawczak M, Schreiber S, van Duijn K, Stolk L, van Meurs JBJ, et al. A genome-wide association study of northwestern Europeans involves the C-type natriuretic peptide signaling pathway in the etiology of human height variation. Hum Mol Genet. 2009;18: 3516–3524. doi:10.1093/hmg/ddp296
32. Vithana EN, Khor C-C, Qiao C, Nongpiur ME, George R, Chen L-J, et al. Genome-wide association analyses identify three new susceptibility loci for primary angle closure glaucoma. Nat Genet. 2012;44: 1142–1146. doi:10.1038/ng.2390
33. Estrada K, Styrkarsdottir U, Evangelou E, Hsu YH, Duncan EL, Ntzani EE, et al. Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nat Genet. 2012;44: 491–502. doi:10.1038/ng.2249
34. Soranzo N, Spector TD, Mangino M, Kühnel B, Rendon A, Teumer A, et al. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat Genet. 2009;41: 1182–1190. doi:10.1038/ng.467
35. Gieger C, Kühnel B, Radhakrishnan A, Cvejic A, Serbanovic-Canic J, Meacham S, et al. New gene functions in megakaryopoiesis and platelet formation. Nature. 2011;480: 201–208. doi:10.1038/nature10659
36. Comuzzie AG, Cole SA, Laston SL, Voruganti VS, Haack K, Gibbs RA, et al. Novel Genetic Loci Identified for the Pathophysiology of Childhood Obesity in the Hispanic Population. PLoS One. 2012;7. doi:10.1371/journal.pone.0051954
37. Service SK, Verweij KJH, Lahti J, Congdon E, Ekelund J, Hintsanen M, et al. A genome-wide meta-analysis of association studies of Cloninger’s Temperament Scales. Transl Psychiatry. 2012;2: e116. doi:10.1038/tp.2012.37
38. Divaris K, Monda KL, North KE, Olshan AF, Lange EM, Moss K, et al. Genome-wide Association Study of Periodontal Pathogen Colonization. J Dent Res. 2012;91: S21–S28. doi:10.1177/0022034512447951
39. Scuteri A, Sanna S, Chen WM, Uda M, Albai G, Strait J, et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 2007;3: 1200–1210. doi:10.1371/journal.pgen.0030115
Figure S1. Flow chart describing data processing steps (light blue boxes) and main analyses (dark blue boxes).
Figure S2. Genomic compartment annotations used in this study. We tested for enrichment of resource-base associated sites in promoters, gene bodies, CpG islands, CpG island shores, enhancers, and unannotated regions of the genome. Below, we provide a cartoon depiction of these functional elements and their typical methylation status at an example gene. Methylated CpG sites are depicted as gray shaded lollipops, unmethylated CpG sites are depicted as white/unshaded lollipops, the gene body is depicted as a blue rectangle, and molecules that aid in transcriptional activation (e.g., transcription factors/activator proteins) are depicted as colored ovals. The promoter region is directly upstream of the gene body (defined as 2 kb upstream in our analyses), and is often associated with a CpG island (a dense cluster of CpG sites, usually unmethylated). CpG shores are defined as the 2 kb flanking CpG islands. Enhancer regions are short regions of DNA that often occur far from genes (although they can also be found within or proximal to genes). Distal enhancers interact with promoter regions through DNA looping; they bind proteins (e.g., the green oval) that activate transcription. In our study, unannotated regions are defined as regions that do not fall into one of the five defined functional genomic compartments (promoters, gene bodies, CpG islands, CpG island shores, and enhancers). Such regions are generally hypermethylated.
Figure S3. Histone marks and genomic compartments associated with individual chromatin states. (A) Histone mark data generated by the NIH Roadmap Epigenomics Project were used to define the 15 chromatin states used in this study (also produced by the Roadmap Epigenomics project). Each chromatin state is defined by the presence (dark blue square = strongly enriched; light blue square = weakly enriched) or absence (white square) of individual histone modifications (x axis labels). (B) We overlaid the Roadmap Epigenomics chromatin state annotations for human peripheral blood mononuclear cells onto the CpG sites tested in our data set and the genomic compartment annotations described in Figure S2. Here, we show the degree to which different chromatin states are more likely to occur in specific genomic compartments (dark purple square = strongly enriched; light purple square = weakly enriched; white square = not enriched).
Figure S4. Differentially methylated sites are enriched near genes expressed in baboon whole blood. In the main text, we report that differentially methylated sites (10% FDR) are more likely to occur in or near genes that are expressed in baboon whole blood, compared to genes that are unexpressed in this tissue (where “in or near” is defined as within the gene body or within 10 kb of the transcription start site, TSS, or transcription end site, TES). Here we report parallel results using alternative definitions for assigning CpG sites to genes, which result in analyses of different subsets of the data. These alternative definitions correspond to sites that occur: (i) 2 kb upstream, defined as <2 kb upstream of the TSS only (i.e., the putative promoter region); (ii) 10 kb upstream, defined as <10 kb upstream of the TSS only; (iii) within gene bodies only; or (iv) in CpG islands near genes, defined as in CpG islands within the gene body or within 10 kb of the gene TES or TSS. Below, we show the odds ratio from a Fisher’s exact test, asking whether differentially methylated sites are enriched near blood-expressed genes. Significant tests (p<0.05) are marked with a red asterisk, and the number of sites tested in each case is shown in parentheses.
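The enrichment test behind each odds ratio can be illustrated with a small 2 × 2 table. The sketch below is a standard-library implementation of a two-sided Fisher's exact test, assuming hypothetical counts; it returns the simple sample odds ratio (ad/bc), which is an assumption here, since statistical packages may instead report a conditional maximum likelihood estimate.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the
    hypergeometric probabilities of all tables that are no more likely
    than the observed one, holding the margins fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_obs * (1 + 1e-9))
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, min(p_value, 1.0)


# Hypothetical counts of CpG sites:
#                        near expressed gene   near unexpressed gene
# differentially methylated        8                     2
# not differentially methylated    1                     5
odds, p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(odds, round(p_value, 4))
```

An odds ratio above 1 in this layout indicates that differentially methylated sites fall near expressed genes more often than other tested sites do.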
Figure S5. Effect of resource base on DNA methylation levels analyzed at the pathway level. We used the R package ‘GlobalTest’ [21] to test for a global effect of resource base on DNA methylation levels at CpG sites in or near genes in specific predefined pathways. We performed this test on 36 pathways related to the metabolism of food or to energy balance. Results of these analyses are shown here (blue = significant at a 10% FDR). For pathways that include CpG sites in or near PFKP, we also conducted a parallel analysis excluding all sites in or near this gene (denoted as red diamonds). When sites near PFKP are excluded from the analysis, the top three pathways do not show differential methylation at a 10% FDR threshold, and only the glycolysis and gluconeogenesis pathway remains significant at a nominal p-value of 0.05.
Figure S6. DMRs are observed in the real data set more often than expected by chance. We counted the number of sites that occur within 2 kb (i.e., ≤1 kb upstream or ≤1 kb downstream) of sites associated with resource base at a 10% FDR, considering only the 847 resource base-associated sites with at least 1 nearby site within the 2 kb window. The resulting distribution of sites is shown in blue; the distribution of values obtained from performing the same analysis on permuted data is shown in red. To obtain the permuted data distribution shown below, we analyzed four different permuted datasets and averaged the results.
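The counting step can be sketched for a single chromosome as follows. This is a minimal illustration, assuming hypothetical base-pair positions; which set of sites is counted as "nearby" is passed in explicitly, because the caption alone does not fix that choice, and the function name is illustrative.

```python
from bisect import bisect_left, bisect_right


def count_nearby_sites(focal, candidates, flank=1000):
    """For each focal position, count candidate positions within
    +/- flank bp, excluding the focal position itself."""
    candidates = sorted(candidates)
    counts = []
    for pos in focal:
        in_window = (bisect_right(candidates, pos + flank)
                     - bisect_left(candidates, pos - flank))
        # Do not count the focal site as its own neighbor.
        self_hits = (bisect_right(candidates, pos)
                     - bisect_left(candidates, pos))
        counts.append(in_window - self_hits)
    return counts


# Hypothetical positions (bp) on one chromosome.
candidates = [100, 900, 1100, 2101, 5000, 5500, 9000]
focal = [100, 5000, 9000]

counts = count_nearby_sites(focal, candidates)
# Keep only focal sites with at least one neighbor in the window.
with_neighbors = [n for n in counts if n >= 1]
print(counts, with_neighbors)
```

Running the same function on permuted data and comparing the two distributions of counts gives the kind of contrast plotted in the figure.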
Figure S7. RRBS enriches for putatively functional regions of the genome and recapitulates known patterns of DNA methylation across the genome. Here, we present quality control measures for RRBS data from the full sample set we analyzed (n=69 individuals; panels B, D, and F) and from the previously published data set (n=50 of the 69 total individuals; panels A, C, and E). (A-B) Proportion of total annotated features in the baboon genome for which at least one CpG site was analyzed. (C-D) Mean DNA methylation levels as a function of distance from the TSS, stratified by gene expression level quartiles obtained from whole blood RNA-seq for the same baboon population [27]. Only expressed genes (as identified by [27]) were included in these analyses. As expected, more highly expressed genes exhibit lower levels of DNA methylation near the TSS. (E-F) Violin plots showing the distribution of average DNA methylation levels for CpG sites located in different genomic compartments. The white boxes indicate the interquartile range, and the black bars indicate the median DNA methylation level for each group of CpG sites. As expected, CpG islands, H3K4me1-marked enhancers and promoters tend to be lowly methylated, while gene bodies and the background set of all sites analyzed tend to be highly methylated. Note that the background set in RRBS data is highly biased towards functionally active regulatory elements, reducing mean/median methylation levels below true genome-wide values.
Figure S8. Power to detect differentially methylated sites (between lifelong wild-feeding and Lodge individuals) increases with sample size. Using data from individuals that spent all or the majority of their lives in one resource base condition (either wild-feeding or Lodge: n=61), we estimated the relationship between sample size and power to detect putative true positive sites (x-axis = sample size; y-axis = proportion of putative true positive sites detected at a 10% FDR). Results are stratified by quartiles of effect sizes (e.g., Q1 shows putative true positive sites with effect sizes in the top 25% of our data set). There appear to be many true resource base-associated sites in our data set that do not pass genome-wide significance, especially for small effect sizes.
Figure S9. Magnitude of the effect of resource availability on DNA methylation levels in different genomic compartments. The cumulative distribution function is shown for betas (effect sizes) generated by MACAU. Each line represents the distribution of betas associated with the effect of resource base on DNA methylation levels in a given genomic compartment. Only CpG sites with a significant effect of resource base (10% FDR) are shown. If the direction of the effect of resource base was random (i.e., methylation levels increased in Lodge versus wild animals with equal probability), we would expect all lines to pass through the intersection of the black dotted lines (at x = 0 and y = 0.5). Instead, promoters and enhancers are somewhat more likely to show increased methylation in Lodge animals, while all other regions are more likely to show decreased methylation in Lodge animals.
Figure S10. Cell type proportions did not significantly differ between wild-feeding and Lodge individuals. The distributions of cell-type proportions, obtained from manual counts of Giemsa-stained blood smears, are shown below (n = 15 Lodge individuals and 25 wild-feeding individuals). For the five major cell types we measured, cell type proportions did not differ between the two resource bases (p-values are from a generalized linear model with a binomial link function, controlling for age, sex, and the identity of the individual who scored the blood slide). Note that lymphocytes and neutrophils make up by far the largest proportion of blood cell types in whole blood.
Figure S11. Enrichment of resource base-associated sites is strongest near genes expressed in whole blood, compared to genes expressed in other tissues. We report in the main text that differentially methylated sites are enriched near genes expressed in whole blood. This result could be due to targeted regulation of blood-expressed genes, or could arise simply because sites affected by diet fall near genes expressed across many tissues. Here, we used tissue-specific RNA-seq data from olive baboons to identify genes expressed (FPKM > 1) or unexpressed (FPKM < 1) in a range of tissues [28]. We then asked, for each tissue, whether resource base-associated sites were enriched near expressed genes, using a Fisher’s Exact Test. We considered a CpG site to be near a gene if it fell in the gene body or within 10 kb of the transcription start or end sites. The FET odds ratio is plotted for each tissue. Differentially methylated sites are significantly biased towards genes expressed in all tissues except skeletal muscle (at a nominal p-value of 0.05), but are most strongly biased for whole blood.
Table S1. Information about males that switched between resource base conditions (n=8 baboons).
Individual; Resource base (natal/adult)¹; Years in post-dispersal resource base condition²; Certainty level for years in post-dispersal condition³
¹ L = Lodge group; W = wild-feeding group
² Number of years the male resided in the post-dispersal resource base group prior to blood sample collection
³ For individuals that switched from the Lodge group to a wild-feeding group, the timing of dispersal events and group residency are known. For individuals that switched from a wild-feeding group to the Lodge group, early histories were inferred (see main text) and the precise timing of their switch from wild-feeding to Lodge is unknown. For these individuals, we provide the number of years they were directly observed in the Lodge group, which serves as a lower bound for the total number of years they experienced the Lodge resource base prior to blood sampling.
Table S2. Baboon RRBS data set sample characteristics and read mapping summary.
Individual; Sex; Age of animal (years); Bisulfite conversion rate¹; Sample age (years)²; Total reads generated (in millions); Uniquely mapped reads (in millions); Resource base (natal/adult)³
AMB_01 M 11.29 0.9850 8.39 37.023 25.100 L/W
AMB_02 F 10.06 0.9994 24.30 33.072 22.820 L/L
AMB_03 M 7.67 0.9842 6.37 24.088 16.943 W/W
AMB_04 M 5.40 0.9988 20.20 14.729 10.458 L/L
AMB_05 M 18.01 0.9849 25.22 51.052 35.687 W/L
AMB_06 M 6.39 0.9847 25.16 21.887 14.800 W/W
AMB_07 M 6.85 0.9840 4.13 14.934 10.013 W/W
AMB_08 M 7.92 0.9988 25.21 32.612 22.532 W/W
AMB_09 M 5.16 0.9994 25.13 14.677 10.610 W/W
AMB_10 M 6.25 0.9837 25.21 35.170 23.064 W/W
AMB_11 F 14.56 0.9995 25.16 18.719 13.103 L/L
AMB_12 M 3.98 0.9837 25.14 26.056 17.660 L/L
AMB_13 M 6.01 0.9840 25.13 24.440 16.309 W/W
AMB_14 M 3.76 0.9989 25.16 20.660 14.073 L/L
AMB_15 F 9.53 0.9989 25.19 9.586 7.285 L/L
AMB_16 F 7.84 0.9994 22.64 18.432 12.718 L/L
AMB_17 M 11.01 0.9990 25.15 18.549 12.903 L/W
AMB_18 M 15.79 0.9990 6.29 36.645 25.193 W/W
AMB_19 M 3.04 0.9990 25.07 31.059 21.321 W/W
AMB_20 M 4.50 0.9990 25.13 29.389 20.758 W/W
AMB_21 F 6.71 0.9995 6.30 28.666 19.779 W/W
AMB_22 F 5.23 0.9994 25.17 16.784 12.084 W/W
AMB_23 M 9.79 0.9963 7.42 11.771 8.084 W/W
AMB_24 M 4.27 0.9987 20.20 24.483 16.747 L/L
AMB_25 M 6.00 0.9986 21.16 71.814 42.517 W/W
AMB_26 M 1.76 0.9987 25.09 15.461 10.784 L/L
AMB_27 M 5.98 0.9987 25.20 31.122 21.156 W/W
AMB_28 M 8.29 0.9980 25.20 35.575 24.680 W/W
AMB_29 M 4.80 0.9981 25.16 35.878 25.526 L/L
AMB_30 M 14.01 0.9980 25.21 15.382 10.708 W/L
AMB_31 M 2.90 0.9980 25.13 34.860 24.045 L/L
AMB_32 M 14.30 0.9980 8.34 21.900 16.169 W/W
AMB_33 F 5.03 0.9988 20.88 20.593 14.621 L/L
AMB_34 F 6.13 0.9963 25.16 39.121 27.385 W/W
AMB_35 F 3.96 0.9994 25.09 19.536 13.870 W/W
AMB_36 M 6.76 0.9978 25.18 39.791 27.011 W/W
AMB_37 M 6.11 0.9978 25.20 41.871 29.169 W/W
AMB_38 M 14.01 0.9978 25.13 23.945 18.063 W/L
AMB_39 F 8.10 0.9994 24.25 19.089 13.553 L/L
AMB_40 F 4.97 0.9995 25.18 22.715 15.673 L/L
AMB_41 F 3.49 0.9988 24.25 37.165 26.015 L/L
AMB_42 M 18.01 0.9977 25.22 30.003 21.484 W/L
AMB_43 F 4.69 0.9994 24.28 27.103 18.972 L/L
AMB_44 M 5.80 0.9990 25.22 23.953 16.974 L/L
AMB_45 F 16.44 0.9995 23.27 16.226 11.683 L/L
AMB_46 F 4.01 0.9964 25.13 53.669 37.032 W/W
AMB_47 M 3.64 0.9990 25.08 30.674 20.747 W/W
AMB_48 M 10.62 0.9991 6.30 37.266 26.408 W/W
AMB_49 M 11.86 0.9987 5.92 29.500 20.155 W/W
AMB_50 M 6.72 0.9988 24.30 79.784 54.079 L/L
AMB_51 F 4.92 0.9989 4.58 13.903 9.732 W/W
AMB_52 F 7.54 0.9988 25.16 30.061 22.546 L/L
AMB_53 M 2.15 0.9843 25.11 29.804 20.001 L/L
AMB_54 F 4.14 0.9964 25.13 16.493 11.545 W/W
AMB_55 M 7.43 0.9855 7.40 13.354 9.737 W/W
AMB_56 M 9.19 0.9955 8.30 11.922 8.275 W/W
AMB_57 F 5.95 0.9995 6.62 9.902 7.228 W/W
AMB_58 F 7.73 0.9966 4.15 33.865 23.028 W/W
AMB_59 M 5.44 0.9988 2.28 11.851 8.533 W/W
AMB_60 M 6.26 0.9990 4.96 29.918 20.598 W/W
AMB_61 M 2.59 0.9988 24.23 22.170 15.360 W/W
AMB_62 M 18.01 0.9981 25.22 36.660 25.248 W/L
AMB_63 M 4.50 0.9994 25.16 11.387 8.623 W/W
AMB_64 M 6.49 0.9977 5.64 17.274 11.996 W/W
AMB_65 F 9.24 0.9995 7.39 27.200 19.040 W/W
AMB_66 M 4.72 0.9990 25.18 61.599 41.574 L/L
AMB_67 M 7.89 0.9990 25.13 17.518 12.255 L/W
AMB_68 F 2.92 0.9995 24.29 18.857 13.200 W/W
AMB_69 M 7.58 0.9987 7.29 17.789 11.953 W/W
Mean 7.45 0.9965 19.53 27.2465 18.8260
Standard deviation 4.01 0.0051 8.38 13.6195 8.9361
¹ To calculate bisulfite conversion rates, we mapped the sequencing data, for each individual separately, to the lambda phage genome. We then summed (i) the number of reads that mapped to lambda phage CpG sites and were read as thymine (reflecting an unmethylated cytosine converted to thymine); and (ii) the total number of reads that mapped to lambda phage CpG sites. Because all CpG sites in the lambda phage genome were completely unmethylated (and should thus have been converted to thymine), the ratio of these two sums gives us the efficiency of the bisulfite conversion. Here, a ratio of 1 would represent perfect conversion of every unmethylated cytosine to thymine.
² Years from collection of blood sample to RRBS library construction
³ L = Lodge group; W = wild-feeding group
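The conversion rate in the first footnote is a ratio of two sums taken over lambda phage CpG sites. A minimal sketch, assuming hypothetical per-site read counts (the numbers and the function name are illustrative only):

```python
def bisulfite_conversion_rate(thymine_reads, total_reads):
    """Summed reads read as thymine divided by summed total reads,
    both taken over lambda phage CpG sites for one individual."""
    total = sum(total_reads)
    if total == 0:
        raise ValueError("no reads mapped to lambda phage CpG sites")
    return sum(thymine_reads) / total


# Hypothetical counts at four lambda phage CpG sites.
thymine = [98, 100, 49, 250]   # reads read as T (converted)
total = [100, 100, 50, 250]    # all reads covering each site

rate = bisulfite_conversion_rate(thymine, total)
print(rate)
```

Summing before dividing weights each site by its coverage, so a poorly covered site cannot dominate the estimate.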
Table S3. Differentially methylated regions.
DMR coordinates; Differentially methylated sites (10% FDR); Closest gene (within 100 kb); Phenotypes associated with genetic variation at this gene (in humans)