Transgenerational Epigenetic Instability Is a Source of Novel Methylation Variants

Robert J. Schmitz, et al.

Science 334, 369 (2011); DOI: 10.1126/science.1212959


The following resources related to this article are available online at www.sciencemag.org (this information is current as of November 6, 2011):

Updated information and services, including high-resolution figures, can be found in the online version of this article at: http://www.sciencemag.org/content/334/6054/369.full.html

Supporting Online Material can be found at: http://www.sciencemag.org/content/suppl/2011/09/14/science.1212959.DC1.html

A list of selected additional articles on the Science Web sites related to this article can be found at: http://www.sciencemag.org/content/334/6054/369.full.html#related

This article cites 40 articles, 13 of which can be accessed free: http://www.sciencemag.org/content/334/6054/369.full.html#ref-list-1

This article appears in the following subject collections: Genetics http://www.sciencemag.org/cgi/collection/genetics

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright 2011 by the American Association for the Advancement of Science; all rights reserved. The title Science is a registered trademark of AAAS.

Downloaded from www.sciencemag.org on November 6, 2011



Transgenerational Epigenetic Instability Is a Source of Novel Methylation Variants

Robert J. Schmitz,1,2 Matthew D. Schultz,1,2,3 Mathew G. Lewsey,1,2 Ronan C. O’Malley,2 Mark A. Urich,1,2 Ondrej Libiger,4 Nicholas J. Schork,4 Joseph R. Ecker1,2,5*

Epigenetic information, which may affect an organism’s phenotype, can be stored and stably inherited in the form of cytosine DNA methylation. Changes in DNA methylation can produce meiotically stable epialleles that affect transcription and morphology, but the rates of spontaneous gain or loss of DNA methylation are unknown. We examined spontaneously occurring variation in DNA methylation in Arabidopsis thaliana plants propagated by single-seed descent for 30 generations. We identified 114,287 CG single methylation polymorphisms and 2485 CG differentially methylated regions (DMRs), both of which show patterns of divergence compared with the ancestral state. Thus, transgenerational epigenetic variation in DNA methylation may generate new allelic states that alter transcription, providing a mechanism for phenotypic diversity in the absence of genetic mutation.

Cytosine methylation is a DNA base modification with roles in development and disease in animals as well as in silencing transposons and repetitive sequences in plants and fungi (1). In plants, CG methylation is commonly found within gene bodies (2–5), whereas non-CG methylation, CHG and CHH (where H is A, C, or T), is enriched in transposons and repetitive sequences (1). The RNA-directed DNA methylation (RdDM) pathway targets both CG and non-CG sites for methylation and is commonly associated with transcriptional silencing (6). This pathway can also target and silence protein-coding genes, giving rise to epigenetic alleles or so-called epialleles that can be heritable through mitosis and/or meiosis (7, 8) and can be dependent on the methylation of a single CG dinucleotide (9).

Two meiotically heritable epialleles resulting in morphological variation are the peloric (Linaria vulgaris) and colorless non-ripening (Solanum lycopersicum) loci (10, 11). Both show spontaneous epigenetic silencing events within their respective populations (10, 12). However, the frequency at which such spontaneous meiotically heritable epialleles naturally arise in populations is unknown. Although epiallelic variation has been identified between genetically diverse populations within Arabidopsis thaliana (13), it is unclear whether these identified epialleles are due to underlying genetic variation. Epialleles have also been artificially generated after mutagenesis or because of mutations in the cellular components required for the maintenance of DNA methylation (14–16).

An A. thaliana (Columbia-0) population, the MA lines, derived by single-seed descent for 30 generations (17) was used to examine the extent of naturally occurring variation in DNA methylation and the frequency at which spontaneous epialleles emerge over time. We used the MethylC-Seq method (3) to determine the whole-genome base resolution DNA methylomes for three ancestral MA lines (numbers 1, 12, and 19) and five descendant MA lines (numbers 29, 49, 59, 69, and 119) (fig. S1). We refer to lines 1, 12, and 19 as ancestors throughout this study, although they are not direct ancestors because they are three generations removed from the original founder line (fig. S1). These specific descendant lines were selected because their genomes have been sequenced and they have a known level of spontaneous mutation (18). Biological replicates (sibling plants) for each leaf methylome were sequenced to an average of ~34-fold coverage, which allowed for an average per line examination of 39,897,093 (96.35%) uniquely mapped cytosines and 5,307,077 (98.39%) uniquely mapped CGs (table S1).

1Plant Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.
2Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.
3Bioinformatics Program, University of California at San Diego, La Jolla, CA 92093, USA.
4The Scripps Translational Science Institute and the Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA.
5Howard Hughes Medical Institute, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA.

*To whom correspondence should be addressed. E-mail: [email protected]

www.sciencemag.org SCIENCE VOL 334 21 OCTOBER 2011 369

REPORTS

A total of 1,730,761 CGs were methylated (mCGs) in at least one MA line (Fig. 1A), and about 91% of the covered mCGs were invariably methylated across all eight lines (19). The variable mCGs revealed a set of 114,287 high-confidence CG single methylation polymorphisms (SMPs) that showed a consensus of the methylation status of CG dinucleotides between biological replicates (Fig. 1A). Next, a reference MA founder DNA methylome was created by pooling the completely conserved mCG site calls for all ancestral MA lines and used to determine the frequency of discordant CG-SMP sites within the descendant population (Fig. 1B). Within the descendant lines, ~1.62% of the CG methylome shows susceptibility to dynamic acquisitions and losses of mCGs over time (table S2). On average, ~66,000 methylated CG-SMPs (mCG-SMPs) were identified for each ancestral and descendant line (fig. S2). Although the total number of mCG-SMPs was similar between all lines, the conservation of these polymorphisms among and between ancestral and descendant populations was different (Fig. 1C and table S3). A pairwise comparison of both populations for methylation conservation, estimated by global similarity of mCG-SMP sites (19), revealed that all of the ancestral lines are highly similar (table S4). Descendant lines showed greater similarity in CG-SMP methylation status to ancestral lines than to other descendant lines (table S4).
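The distinction drawn above between invariably methylated CGs and CG-SMPs can be illustrated with a minimal sketch. It assumes that each CG site has already been reduced to one boolean methylation call per biological replicate; the function name `classify_cg_site`, the data layout, and the toy calls are hypothetical. The actual calling procedure is described in the supporting material (19) and is not reproduced here.

```python
from typing import Dict, Tuple


def classify_cg_site(calls: Dict[str, Tuple[bool, bool]]) -> str:
    """Classify one CG site from per-line replicate calls.

    `calls` maps an MA line name to (replicate 1, replicate 2) methylation
    calls. This layout is an assumption made for illustration only.
    """
    # Require replicate consensus within every line; otherwise the site is
    # not treated as high confidence in this sketch.
    if any(rep1 != rep2 for rep1, rep2 in calls.values()):
        return "replicates_disagree"
    states = {rep1 for rep1, _ in calls.values()}
    if states == {True}:
        return "invariably_methylated"
    if states == {False}:
        return "unmethylated"
    # Methylated in some lines and unmethylated in others.
    return "CG-SMP"


# Toy calls for the eight lines named in the text (values are invented).
lines = ["1", "12", "19", "29", "49", "59", "69", "119"]
invariant = {name: (True, True) for name in lines}
polymorphic = {**invariant, "59": (False, False)}
noisy = {**invariant, "59": (True, False)}

for label, site in [("invariant", invariant), ("polymorphic", polymorphic), ("noisy", noisy)]:
    print(label, "->", classify_cg_site(site))
```

In this sketch a site counts as a CG-SMP only when the two replicates agree within every line and the agreed state differs between lines, which mirrors the replicate-consensus requirement stated in the text.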

We calculated an estimate of the epimutation rate per generation in this population by using linear regression and TREE PUZZLE, which revealed 704 and 2876 methylation changes each generation, respectively (19). We estimated a lower bound of the epimutation rate with the linear regression results, which revealed 4.46 × 10⁻⁴ methylation polymorphisms per CG site per generation (P < 0.0000216) (table S5). This finding contrasts with the previously reported spontaneous genetic mutation rate of 7 × 10⁻⁹ base substitutions per site per generation for these same MA lines (18). The TREE PUZZLE analysis revealed higher estimated epimutation rates in earlier generations (19). One possible source of this variation is seed age, storage, and/or selection for seed survival. Therefore, although DNA methylation is predominantly static over relatively long periods of time, changes in cytosine methylation do occur and at a frequency greater than that of mutation observed at the DNA sequence level.
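To make the scale of this contrast concrete, the following minimal sketch does two things. First, it shows how a per-generation rate can be read off as a least-squares slope; the four data points are synthetic and were constructed to lie on a line with slope 704, so they are not the study's measurements. Second, it divides the two per-site rates quoted in the text. The helper name `ols_slope` is hypothetical, and the regression design actually used is given in the supporting material (19).

```python
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Synthetic illustration only: points placed exactly on a line of slope 704.
generations = [0, 10, 20, 30]
changes = [704 * g for g in generations]
slope = ols_slope(generations, changes)

# Rates quoted in the text (per site per generation).
EPIMUTATION_RATE = 4.46e-4  # lower-bound estimate, per CG site
MUTATION_RATE = 7e-9        # base substitutions, from reference (18)
rate_ratio = EPIMUTATION_RATE / MUTATION_RATE

print(f"slope = {slope:.1f} changes per generation")
print(f"epimutation / mutation rate ratio = {rate_ratio:,.0f}")
```

On these figures the lower-bound epimutation rate exceeds the reported genetic mutation rate by roughly four to five orders of magnitude (a ratio of about 6 × 10⁴).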

By using CG-SMPs derived from both ancestral and descendant populations, we carried out a genome-wide analysis of differentially methylated regions (DMRs) and identified 2485 CG-DMRs that ranged in size from 11 to 1110 base pairs (bp) (Fig. 2A and table S6). Hierarchical clustering of CG-DMRs in this population, calculated solely on the basis of the methylation density, revealed that the ancestral lines segregate as an independent cluster from the descendant lines (Fig. 2B and fig. S3). Multivariate distance-based regression (MDMR) (20, 21) confirmed this finding, indicating a statistically significant (P < 0.00005) association between ancestor or descendant status and methylation density of the CG-DMR profiles. The ancestor or descendant status explained 47% of the variance in the dissimilarity in methylation density of CG-DMRs between pairs of samples, indicating that, over time, there is a divergence of DNA methylation patterns in both formation and elimination of CG-DMRs. Furthermore, the genome-wide locations of these CG-DMRs were not uniformly distributed (P < 2.20 × 10⁻¹⁶), because 60.5% (1504/2485) were found in genic regions compared with 3.3% (82/2485) and 36.2% (899/2485) located in intergenic regions and transposons, respectively (Fig. 2B).

Next, we performed a genome-wide survey for nonCG-DMRs and uncovered a total of 284 among all eight lines (table S7). In general, the nonCG-DMRs were largely localized to intergenic regions (141/284) of the genome, because only 57/284 overlapped with genes and 86/284 overlapped with transposons. The size ranges of the nonCG-DMRs were similar to those of the CG-DMRs because the vast majority occurred in smaller segments of the genome (10 to 682 bp). Therefore, variation in DNA methylation appears to occur in all three methylation sequence contexts.
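The category fractions quoted for CG-DMRs and nonCG-DMRs can be recomputed directly from the counts given in the text. The short sketch below is only a bookkeeping check; the dictionary names and the `percentages` helper are hypothetical, and no statistical test is attempted.

```python
from typing import Dict

# Counts as quoted in the text.
CG_DMRS = {"genic": 1504, "intergenic": 82, "transposon": 899}
NONCG_DMRS = {"intergenic": 141, "genic": 57, "transposon": 86}


def percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert category counts to percentages of their total."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


cg_pct = percentages(CG_DMRS)
noncg_pct = percentages(NONCG_DMRS)

for name, table in [("CG-DMRs", cg_pct), ("nonCG-DMRs", noncg_pct)]:
    summary = ", ".join(f"{cat} {pct:.1f}%" for cat, pct in table.items())
    print(f"{name}: {summary}")
```

The counts sum to the stated totals (2485 and 284), and the CG-DMR percentages round to the 60.5%, 3.3%, and 36.2% reported above. The same arithmetic puts about half of the nonCG-DMRs in intergenic regions.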

CG methylation is present within gene bodies and is enriched toward the 3′ end (2–5), whereas CG and nonCG methylation is associated with heterochromatin, transposons, and repetitive sequences (1). In agreement with these findings, we observed that the 3′ portion of genes contained the greatest source of CG-DMRs and that the majority of nonCG-DMRs were enriched outside of the gene bodies (Fig. 2C). Furthermore, we observed a ~twofold depletion of CG-DMRs in exons compared with introns (Fig. 2D). The genome-wide distributions of CG-SMPs, CG-DMRs, and nonCG-DMRs were depleted in heterochromatic regions in the genome (Fig. 2, E and F). These depletions were mostly observed at the pericentromeres and centromeres (Fig. 2, E and F, and figs. S4 and S5). CG-DMRs are enriched in transposons located in euchromatin but depleted in transposons present near the centromere. Because the centromeric regions of the genome contain the highest density of DNA methylation (Fig. 2, E and F), these observations combined with the observations that CG-DMRs are enriched in intron sequences may indicate that DNA methylation that is associated with nucleosomes (22) (i.e., exons or tightly packaged chromatin in the pericentromeres and centromeres) may be maintained at a higher fidelity and that DNA methylation not associated with nucleosomes may undergo greater epigenetic drift.

Fig. 1. Epigenetic variation of CG-SMPs. (A) An example of a CG-SMP. Gold lines indicate CG methylation, maroon rectangle indicates the untranslated regions, and green rectangles indicate exons. (B) A breakdown of the methylation distribution of CG dinucleotides among all samples. (C) A heatmap indicating the number of CG-SMPs that differ between two samples (table S3).

A genome-wide screen for DMRs simultaneously occurring in all three methylation sequence contexts (C-DMRs are CG, CHG, and CHH) was performed to assess the extent of epiallelic variation that is characteristic of RdDM across the MA population. In total, 72 C-DMRs were identified, of which functional categorization revealed that two-thirds overlapped with transposon and intergenic sequences whereas about one-third overlapped with gene bodies and promoters (Fig. 3A and table S8). To determine whether transposition-induced methylation could potentially give rise to the methylated C-DMRs (mC-DMRs) (23), genomic DNA encompassing all C-DMRs was amplified and compared in all ancestral and descendant lines. In every case, the observed amplicon size was identical for all MA lines and was equal to the expected size of the locus (table S8), indicating that these C-DMRs are unlinked to cis-genetic variation located within 500 bp, a distance that would be expected to reveal methylation induced by transposon insertions at these loci (23). Additionally, none of the genetic variants identified by genome resequencing of this population (18) overlapped with any of these C-DMRs. Lastly, restriction enzyme digestion and Southern blot analyses were performed to rule out the possibility that copy number variants were the cause of spontaneous epiallele formation, as is the case for the PAI epialleles (24). In all cases examined, the observed hybridization pattern and gene copy number were identical for each of the MA lines (fig. S6). Therefore, we conclude that the 72 C-DMRs represent a set of spontaneously occurring epialleles within the MA lines, because they were not associated with any genetic variation.

By using a set of C-DMRs that exhibited an identical methylation status (fig. S7), we determined the frequency of discordance of the ancestral state with the descendant lines and found that 29 of the C-DMRs were highly variable (>1 descendant line was discordant with the ancestral state) (Fig. 3B). C-DMRs discordant in only one of the five descendant lines were the most frequent class, but there was an unexpectedly high number of C-DMRs (63%) that were discordant in more than one descendant (Fig. 3B). Within the set of 576 C-DMRs identified (eight lines by 72 C-DMRs), 7 were discordant between the biological replicates (table S8). These data suggest that, although many C-DMRs represent the formation of spontaneous epialleles, a small subset may reflect the presence of “hotspots” (metastable epialleles).
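The discordance tally behind Fig. 3B can be sketched as follows, assuming each C-DMR has a single consensus ancestral state and one boolean state per descendant line. The function name `count_discordant` and the three toy C-DMRs are hypothetical and do not correspond to loci in table S8. The last lines only restate the replicate bookkeeping given in the text (eight lines by 72 C-DMRs, 7 replicate discordances).

```python
from collections import Counter
from typing import Dict, List, Tuple


def count_discordant(ancestral: bool, descendants: List[bool]) -> int:
    """Number of descendant lines whose state differs from the ancestral state."""
    return sum(1 for state in descendants if state != ancestral)


# Hypothetical C-DMRs: (ancestral state, states in the five descendant lines).
toy_cdmrs: Dict[str, Tuple[bool, List[bool]]] = {
    "dmr_a": (True, [True, True, False, True, True]),
    "dmr_b": (True, [False, False, True, False, True]),
    "dmr_c": (False, [False, False, False, False, False]),
}

# Histogram of discordant-line counts, the quantity plotted on the x axis of Fig. 3B.
histogram = Counter(count_discordant(anc, desc) for anc, desc in toy_cdmrs.values())

# "Highly variable" as defined in the text: more than one discordant descendant.
highly_variable = [
    name for name, (anc, desc) in toy_cdmrs.items() if count_discordant(anc, desc) > 1
]

# Replicate bookkeeping from the text.
total_calls = 8 * 72
replicate_discordance = 7 / total_calls

print(dict(histogram), highly_variable)
print(f"{total_calls} calls, {100 * replicate_discordance:.1f}% discordant between replicates")
```

On the figures in the text, only about 1.2% of the 576 line-by-C-DMR calls disagreed between replicates.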

We sequenced small RNA (smRNA) populations for all eight lines and found that smRNAs [represented as RPKCMs (reads per kilobase of each C-DMR per million reads) in tables S9 to S12] were associated with an increase in the average methylation density of C-DMRs (Fig. 3C). Furthermore, this association resembled a binary switch, because the most densely methylated C-DMRs contained abundant 24-nucleotide (nt) smRNAs (Fig. 3C).

Fig. 2. CG-DMRs diverge over time and are enriched in gene bodies. (A) Example CG-DMR present in an unmethylated state in both replicates of line 69. (B) A heatmap representation of a two-dimensional hierarchical clustering based on DMRs. Columns represent samples. Rows indicate DMRs. The column to the left of the heatmap indicates the genomic location of the DMR (blue, gene body; gold, transposon; gray, intergenic; red, transposon in gene body). (C) The average distribution of CG-DMRs (red) and nonCG-DMRs (blue) across gene bodies (from the start of the 5′ UTR to the end of the 3′ UTR, including 500 bp up- and downstream). (D) CG gene-body DMRs are specifically depleted in exons. (E) Genome-wide distributions of mCG (red), CG-SMPs (green), and CG-DMRs (blue) across chromosome I. (F) Genome-wide distributions of methylated nonCGs (mnonCG, red) and nonCG-DMRs (green) across chromosome I. The centromere is indicated by the pink vertical bar for (E) and (F).
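The RPKCM unit defined in the text (reads per kilobase of each C-DMR per million reads) is a length- and depth-normalized read count. A minimal sketch of that normalization follows. The function name `rpkcm` and the example numbers are hypothetical, and which reads make up the "million reads" denominator (for example, all mapped smRNA reads in a library) is an assumption here, not something the text specifies.

```python
def rpkcm(read_count: int, dmr_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of a C-DMR per million reads.

    read_count    -- smRNA reads overlapping the C-DMR
    dmr_length_bp -- length of the C-DMR in base pairs
    total_reads   -- library size used for normalization (assumed definition)
    """
    if dmr_length_bp <= 0 or total_reads <= 0:
        raise ValueError("dmr_length_bp and total_reads must be positive")
    kilobases = dmr_length_bp / 1_000
    millions = total_reads / 1_000_000
    return read_count / kilobases / millions


# Invented example: 50 reads over a 500-bp C-DMR in a library of 10 million reads.
example = rpkcm(read_count=50, dmr_length_bp=500, total_reads=10_000_000)
print(f"RPKCM = {example:.1f}")
```

Normalizing by both C-DMR length and library size is what makes smRNA levels comparable across C-DMRs of different sizes and across the eight sequenced lines.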

Of the eight previously documented plant epialleles resulting in phenotypic variation, all affected transcriptional output of the differentially methylated locus (9–11, 23–28). mRNA abundance was measured in all eight lines with quantitative reverse transcription polymerase chain reaction (qRT-PCR) at eight C-DMRs that overlapped with protein-coding regions. In four of these genes, the gain or loss of DNA methylation was correlated with a large decrease or increase in mRNA abundance, respectively, and with the presence of 24-nt smRNAs at each silenced epiallele (Fig. 3, D to F, and fig. S8). These findings reveal that changes in epiallelic state can lead to major effects on transcriptional output (fig. S9).

We also observed that the methylation status of one C-DMR resulted in alternative promoter usage of ACTIN RELATED PROTEIN 9 (At5g43500) (fig. S10C). The loss of DNA methylation within the 5′ untranslated region (UTR) of the At5g43500.1 isoform led to an increase in mRNA expression, whereas expression of isoform At5g43500.2, with a transcriptional start site located further downstream, was unaffected (fig. S10, D and E).

Although epialleles can have major impacts on phenotypic diversity, until now their identification was not trivial. Even more puzzling is the origin of “pure” alleles, which are defined by their formation in the absence of any genetic variation in cis or trans (8). One route to epiallele formation may be the failure to correctly maintain the proper methylation status throughout epigenetic reprogramming that occurs post-fertilization (29, 30). It is noteworthy that 63 of the 72 C-DMRs overlap with regions previously shown to have altered methylation patterns in methylation enzyme mutants (Fig. 4) (3). Of the 14 C-DMRs that overlap with genes, 5 become reexpressed in met1-3 and 1 transcript becomes silenced in rdd (3). These results suggest that a failure to faithfully maintain genome-wide methylation patterns by MET1 and/or RDD is likely one source of spontaneous epiallele formation.

[Fig. 3 image not reproduced. Labels recovered from the extraction: (A) Transposons, n = 27; Intergenic, n = 21; Genes, n = 14; Promoters, n = 7; ncRNAs, n = 2; Pseudogene, n = 1. (B) x axis, number of descendant lines discordant with ancestral state (1 to 5); y axis, number of C-DMRs; legend, Methylated and Unmethylated. (C) x axis, mC-DMR density quantiles (%); y axis, average smRNA RPKCMs; series, 21-, 22-, 23-, and 24-nt smRNAs. (D) Tracks labeled by line and replicate (1, 19, 12, 29, 49, 59, 69, and 119; rep1 and rep2), grouped as Ancestors and Descendants, over At5g24240 and At5g24250. (E) Log2 fold change in mRNA levels of At5g24240 (relative to line 1). (F) smRNA levels at At5g24240 C-DMR (RPKCMs); series, 21-, 22-, 23-, and 24-nt smRNAs.]

Fig. 3. Epiallelic variation at protein-coding loci is associated with transcriptional variation. (A) Classification of C-DMRs and their genomic locations. (B) The number of descendant lines discordant with the ancestral C-DMR state and the C-DMR methylation status. The black portions of the bar indicate the descendant C-DMRs that became methylated, whereas the white portions indicate regions that became unmethylated, compared with the ancestral population. (C) The 24-nt smRNA levels are associated with increasing methylation density. The 24-nt smRNA RPKCMs for all 576 C-DMRs (8 MA lines by 72 C-DMRs) were ranked and binned into 10% quantiles, and then the average mC densities were plotted. (D) A representative C-DMR at At5g24240 in which both biological replicates of descendant line 59 were unmethylated. (E) qRT-PCR analysis of At5g24240 reveals >50-fold increase in mRNA abundance in unmethylated line 59. Error bars indicate SEM. (F) The 24-nt smRNAs are enriched specifically in the MA lines that are transcriptionally silenced in (E) for the At5g24240 locus with the exception of line 59, which is abundantly expressed in (E).

[Fig. 4 image not reproduced. Labels recovered from the extraction: y axis, number of C-DMRs (0 to 60); number of mC-DMRs that become unmethylated in met1 and ddc; number of C-DMRs that become re-methylated in rdd; partially methylated; not methylated in Col-0.]

Fig. 4. Methylation status of all 72 epialleles in methylation and demethylation mutant backgrounds. Most of the epialleles become unmethylated in met1-3, whereas a smaller number become remethylated in the DNA demethylase triple mutant rdd.
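The rank-and-bin step described in the Fig. 3C caption (rank all 576 line-by-C-DMR values, split them into 10% quantiles, average a second quantity within each bin) can be sketched as below. The function name `decile_means` and the toy data are hypothetical. Following the caption wording, each pair is taken to be (24-nt smRNA RPKCM, mC density), and that assignment is an assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def decile_means(pairs: Sequence[Tuple[float, float]], n_bins: int = 10) -> List[float]:
    """Rank pairs by their first value, split the ranking into n_bins
    near-equal bins, and return the mean of the second value in each bin."""
    ranked = sorted(pairs, key=lambda pair: pair[0])
    n = len(ranked)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for rank, (_, value) in enumerate(ranked):
        bin_index = rank * n_bins // n  # integer in 0 .. n_bins - 1
        sums[bin_index] += value
        counts[bin_index] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Synthetic stand-in for 576 (RPKCM, mC density) pairs; values are invented
# so that the second value rises with the first.
toy_pairs = [(float(i), 2.0 * i) for i in range(576)]
means = decile_means(toy_pairs)
print([round(m, 1) for m in means])
```

Because 576 is not divisible by 10, the bins in this sketch hold 57 or 58 values each. How ties and uneven bin sizes were handled in the published analysis is not stated in the caption.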

Regardless of their origin, the majority of epialleles identified in this study are meiotically stable and heritable across many generations in this population. Understanding the basis for such transgenerational instability and the mechanism(s) that trigger and/or release these epiallelic states will be of great importance for future studies.

References and Notes

1. J. A. Law, S. E. Jacobsen, Nat. Rev. Genet. 11, 204 (2010).
2. S. J. Cokus et al., Nature 452, 215 (2008).
3. R. Lister et al., Cell 133, 523 (2008).
4. X. Zhang et al., Cell 126, 1189 (2006).
5. D. Zilberman, M. Gehring, R. K. Tran, T. Ballinger, S. Henikoff, Nat. Genet. 39, 61 (2007).
6. S. W.-L. Chan et al., Science 303, 1336 (2004).
7. J. Paszkowski, U. Grossniklaus, Curr. Opin. Plant Biol. 14, 195 (2011).
8. E. J. Richards, Nat. Rev. Genet. 7, 395 (2006).
9. K. Shibuya, S. Fukushima, H. Takatsuji, Proc. Natl. Acad. Sci. U.S.A. 106, 1660 (2009).
10. P. Cubas, C. Vincent, E. Coen, Nature 401, 157 (1999).
11. K. Manning et al., Nat. Genet. 38, 948 (2006).
12. A. J. Thompson et al., Plant Physiol. 120, 383 (1999).
13. M. W. Vaughn et al., PLoS Biol. 5, e174 (2007).
14. F. Johannes et al., PLoS Genet. 5, e1000530 (2009).
15. F. K. Teixeira et al., Science 323, 1600 (2009); 10.1126/science.1165313.
16. A. Vongs, T. Kakutani, R. A. Martienssen, E. J. Richards, Science 260, 1926 (1993).
17. R. G. Shaw, D. L. Byers, E. Darmo, Genetics 155, 369 (2000).
18. S. Ossowski et al., Science 327, 92 (2010).
19. Additional experiments and descriptions of methods used to support our conclusions are presented as supporting material on Science Online.
20. C. M. Nievergelt et al., Am. J. Med. Genet. B. Neuropsychiatr. Genet. 141B, 234 (2006).
21. M. A. Zapala, N. J. Schork, Proc. Natl. Acad. Sci. U.S.A. 103, 19430 (2006).
22. R. K. Chodavarapu et al., Nature 466, 388 (2010).
23. J. Liu, Y. He, R. Amasino, X. Chen, Genes Dev. 18, 2873 (2004).
24. J. Bender, G. R. Fink, Cell 83, 725 (1995).
25. S. Melquist, B. Luff, J. Bender, Genetics 153, 4017 (1999).
26. S. E. Jacobsen, E. M. Meyerowitz, Science 277, 1100 (1997).
27. H. Saze, T. Kakutani, EMBO J. 26, 3641 (2007).
28. W. J. Soppe et al., Mol. Cell 6, 791 (2000).
29. R. A. Mosher et al., Nature 460, 283 (2009).
30. R. K. Slotkin et al., Cell 136, 461 (2009).

Acknowledgments: We thank M. White, R. Lister, M. Galli, and R. Amasino for discussions; R. Shaw and E. Darmo for seeds; J. Nery for sequencing operations; and M. Axtell for Southern blot protocol. R.J.S. was supported by an NIH National Research Service Award postdoctoral fellowship (F32-HG004830). M.D.S. was supported by an NSF Integrative Graduate Education and Research Traineeship grant (DGE-0504645). M.G.L. was supported by a European Union Framework Programme 7 Marie Curie International Outgoing Fellowship (project 252475). O.L. and N.J.S. are supported by NIH/National Center for Research Resources grant number UL1 RR025774. This work was supported by the Mary K. Chapman Foundation, the NSF (grants MCB-0929402 and MCB1122246), the Howard Hughes Medical Institute, and the Gordon and Betty Moore Foundation (GBMF) to J.R.E. J.R.E. is an HHMI–GBMF Investigator. Analyzed data sets can be viewed at http://neomorph.salk.edu/30_generations/browser.html. Sequence data can be downloaded from National Center for Biotechnology Information Sequence Read Archive (SRA035939). Correspondence and requests for materials should be addressed to J.R.E. ([email protected]).

Supporting Online Material
www.sciencemag.org/cgi/content/full/science.1212959/DC1
Materials and Methods
SOM Text
Figs. S1 to S11
Tables S1 to S16
References

22 August 2011; accepted 7 September 2011
Published online 15 September 2011;
10.1126/science.1212959

Computation-Guided BackboneGrafting of a Discontinuous Motifonto a Protein ScaffoldMihai L. Azoitei,1* Bruno E. Correia,1,2* Yih-En Andrew Ban,1† Chris Carrico,1,3

Oleksandr Kalyuzhniy,1 Lei Chen,4 Alexandria Schroeter,1 Po-Ssu Huang,1 Jason S. McLellan,4

Peter D. Kwong,4 David Baker,1,5 Roland K. Strong,3 William R. Schief1,6,7‡

The manipulation of protein backbone structure to control interaction and function is achallenge for protein engineering. We integrated computational design with experimental selectionfor grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by thecross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 withhigh specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffoldbound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The methodcan be generalized to design other functional proteins through backbone grafting.

Computational protein design tests ourunderstanding of protein structure andfolding and provides valuable reagents

for biomedical and biochemical research; long-term goals include the design of field- or clinic-ready biosensors (1), enzymes (2), therapeutics (3),and vaccines (4, 5). A major limitation has beenan inability to manipulate backbone structure;most computational protein design has involvedsequence design on predetermined backbone struc-tures or with minor backbone movement (1–5).Accurate backbone remodeling presents a sub-stantial challenge for computational methodsowing to limited conformational sampling andimperfect energy functions (6).

Novel recognition modules (7), inhibitors (8, 9),enzymes (2), and immunogens (4, 5, 10, 11) havebeen designed by grafting functional constel-lations of side chains onto protein scaffolds ofpredefined backbone structure. In all cases, therestriction to using predetermined scaffold back-bone structures limited the complexity of thefunctional motifs that could be transplanted. Forexample, the de novo enzymes could accommo-date grafting of only three or four catalytic groups,whereas many natural enzymes have six or more(12), and the immunogens were limited to con-tinuous (single-segment) epitopes even thoughmost antibody epitopes are discontinuous (involv-ing two or more antigen segments) (13, 14).

To address the challenge of incorporating back-bone flexibility modeling into grafting design, wedeveloped a hybrid computational-experimentalmethod for grafting the backbone and side chainsof functional motifs onto scaffolds (Fig. 1). Wetested this method by grafting a discontinuousHIV gp120 epitope, targeted by the broadly neu-tralizing monoclonal antibody b12 (15), ontoan unrelated scaffold. b12 binds to a conservedepitope within the CD4-binding site (CD4bs) ofgp120 (16), an area of great interest for vaccinedesign. We focused on transplantation of twosegments from gp120: residues 365 to 372, knownas the CD4b (CD4 binding) loop (17), and resi-dues 472 to 476, known as the ODe (outer domainexit) loop (16). The b12-gp120 interaction in-volves six or seven backbone segments on gp120(16), but 60% of the buried surface area on gp120lies on the CD4b and ODe loops, and a Rosettaenergy calculation (18) suggested that these two

1Department of Biochemistry, University of Washington, Seattle, WA 98195, USA. 2Ph.D. Program in Computational Biology, Instituto Gulbenkian de Ciência, Oeiras, Portugal. 3Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA. 4Vaccine Research Center, National Institute of Allergy and Infectious Diseases, Bethesda, MD 20892, USA. 5Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA. 6IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA. 7Department of Immunology and Microbial Science, The Scripps Research Institute, La Jolla, CA 92037, USA.

*These authors contributed equally to this work.
†Present address: Arzeda Corporation, Seattle, WA 98102, USA.
‡To whom correspondence should be addressed. E-mail: [email protected]




www.sciencemag.org/cgi/content/full/science.1212959/DC1

Supporting Online Material for

Transgenerational Epigenetic Instability Is a Source of Novel Methylation Variants

Robert J. Schmitz, Matthew D. Schultz, Mathew G. Lewsey, Ronan C. O’Malley, Mark A. Urich, Ondrej Libiger, Nicholas J. Schork, Joseph R. Ecker*

*To whom correspondence should be addressed. E-mail: [email protected]

Published 15 September 2011 on Science Express DOI: 10.1126/science.1212959

This PDF file includes:

Materials and Methods
SOM Text
Figs. S1 to S11
Tables S1 to S16
References

Supporting Online Material

Materials and Methods
SOM Text
References
Figs. S1 to S11
Tables S1 to S16

Materials and Methods

Plant material

All seeds used in these experiments were descendants of a Columbia (Col-0) accession and have been previously described by Shaw et al. (17). All eight lines were grown in replicate under long-day conditions within a single tray. Leaf tissue was flash frozen in liquid nitrogen at approximately the 10-leaf stage within a 15-minute period. DNA was isolated using a Qiagen Plant DNeasy kit (Qiagen, Valencia, CA) following the manufacturer’s recommendations. RNA was isolated using the Qiagen Plant RNeasy kit (Qiagen) following the manufacturer’s instructions.

MethylC-Seq library construction

Approximately two micrograms of genomic DNA was sonicated to ~100 bp with the Covaris S2 System using the following parameters: cycle number = 6, duty cycle = 20%, intensity = 5, cycles/burst = 200 and time = 60 seconds. Sonicated DNA was purified using Qiagen DNeasy minielute columns (Qiagen). Each sequencing library was constructed using the NEBNext DNA Sample Prep Reagent Set 1 (New England Biolabs, Ipswich, MA) according to the manufacturer’s instructions with the following slight modifications. Methylated adapters were used in place of the standard genomic DNA adapters from Illumina (Illumina, San Diego, CA). Ligation products were purified with AMPure XP beads (Beckman, Brea, CA). DNA (450 ng) was bisulfite treated using the MethylCode Kit (Invitrogen, Carlsbad, CA) following the manufacturer’s guidelines and then PCR amplified using Pfu Cx Turbo (Agilent, Santa Clara, CA) instead of the Phusion Taq included in the NEBNext kit, under the following PCR conditions: 2 minutes at 95°C; 4 cycles of 15 seconds at 98°C, 30 seconds at 60°C and 4 minutes at 72°C; and 10 minutes at 72°C.

Small RNA library construction

Approximately 200 mg of finely ground tissue from single rosette plants containing 10 leaves was used for small RNA isolation following the instructions provided with the Ambion mirVana miRNA Isolation Kit (Ambion, Austin, TX). Small RNAs isolated with this kit were EtOH precipitated and loaded into a 15% TBE-urea gel (Life Technologies, Carlsbad, CA). smRNAs in the range of ~15-50 nucleotides in length were excised from the gel. These smRNAs were used for library construction following the protocol provided in the TruSeq Small RNA Sample Preparation Kit (Illumina).

Sequencing

MethylC-Seq libraries were sequenced using the Illumina HiSeq 2000 (Illumina) as per the manufacturer’s instructions. Sequencing of libraries was performed up to 101 cycles. Image analysis and base calling were performed with the standard Illumina pipeline version RTA 2.8.0.

Sequencing analysis

Fastq files were aligned to TAIR10 using Bowtie (31), and custom algorithms were used for identification of mC sites as described previously (32).

Generating the CG-SMPs

In each of the samples, methylated cytosines (mCs) were identified using the method described in (32). Only positions in the CG context at which the methylation calls of the biological replicates of a particular sample agreed (i.e., both replicates were either methylated or unmethylated) were considered. nonCG-SMPs were not queried because the levels of methylation at these sites are much more variable than at CGs, which is likely due to the mechanism by which they are maintained. From this list of potential CG-SMPs, all positions at which all 8 individuals had the same methylation state were removed. Additionally, any site that did not have coverage in all samples and all biological replicates was removed. A site was considered covered in a sample if a methylated cytosine had been called for that position or if that position had coverage of four or more sequencing reads. This list of CG-SMPs was used for all subsequent analyses included in Figure 1.

Calculating the “epimutation rate” using linear regression

To estimate the number of SMPs that arise per generation, the number of SMPs was regressed against the generation number of each individual. To calculate the number of SMPs in a given sample, a reference methylome was created. The methylation status of the reference methylome was determined by examining the three ancestor samples and finding the majority methylation status at each CG position (e.g., a site with two methylated ancestors and one unmethylated ancestor would cause that site to be methylated in the reference methylome). The number of SMPs in a sample was determined by counting the number of differences between this probable founder methylome and the methylome of each sample at all sites where there was coverage in both replicates of all eight samples and where both replicates agreed (see the Generating the CG-SMPs section for determination of coverage). A simple linear relationship between the number of generations and the number of SMPs was assumed (i.e., Number of SMPs = Generation Number × β1 + β0). The linear regression was performed in R using the lm function. When all samples were included, this regression yielded an epimutation rate per generation (β1) of 788 with a standard error of ± 109 and an adjusted R² of 0.87. However, when the regression was repeated removing one sample at a time, the regression significantly improved with the removal of sample 69 (β1 = 703.66; standard error = 59.27; adjusted R² = 0.9524) and changed little with the exclusion of other samples (see table S5). It is unclear whether this difference exists for biological or technical reasons; it seems reasonable to exclude this sample for the purposes of this regression as it is likely confounding the results. Based on the epimutation rate per generation calculated using linear regression, we estimated the epimutation rate per site per generation by normalizing the estimate from the linear regression by the number of sites that had coverage in all 8 samples and both replicates. This calculation yielded an epimutation rate per site per generation of 4.99 × 10⁻⁴ with an error of ± 6.91 × 10⁻⁵ for all samples, and of 4.46 × 10⁻⁴ with an error of ± 3.75 × 10⁻⁵ for all samples except 69.

Calculating the “epimutation rate” using Tree puzzle

To estimate the rate of epimutations per generation (e.g., number of CGs that change methylation status per generation), we used the program Tree Puzzle (33) to create a phylogram of our 8 samples. As input to this program, we used the 1,109,132 sites for which we had coverage in all 8 samples and for which the replicates agreed in their methylation state (i.e., both replicates were methylated or both replicates were unmethylated). The tree was generated using the default settings of the program (fig. S11). The branch lengths for each node were multiplied by the number of sites input to Tree Puzzle (again, 1,109,132) to obtain an expected number of changes along each branch from the root. This expected number was then divided by the number of generations that separated the node from the root to generate the expected number of epimutations per generation (table S13). The average number of epimutations per generation was calculated by averaging the aforementioned epimutation rates along each branch (with the exception of 69, see Linear Regression), which was 2,876 epimutations per generation.

Generating the CG-SMPs heatmap

All pairwise comparisons of the eight samples were considered for each CG-SMP. The total number of differences (i.e., sites where one sample is methylated at a CG-SMP and the other is not) is shown in the heatmap, with deeper red intensity indicating greater dissimilarity.

Statistical analysis

To determine the distribution of CG-DMRs in genomic features, the expected proportion of CG-DMRs in genes, transposons and intergenic regions was calculated by summing the number of bases covered by these genomic contexts and normalizing that sum by 119 Mb. The observed proportions of CG-DMRs in each of these contexts were tested against the expected proportion using a chi-square test (table S14). Furthermore, a Pearson product moment correlation test was used to determine the strength of the linear relationship of the methylation status (expressed as a binary string of methylation statuses at each site) at CG-SMPs between pairs of samples. Multivariate distance-based regression [MDMR (21, 34)] tests the hypothesis that the distance (greater or lesser) between individuals is associated with additional variables. The Euclidean distance was used to assess the distance between strains in terms of methylation status patterns (expressed as the fraction of reads containing a methylated cytosine divided by the length of the DMR). These distances were then tested for association with ancestral versus descendant status. To assess the distance between methylation densities of the DMRs, the Pearson’s product moment correlation-based distance was used. These distances were then tested for association with the location of DMRs (e.g., genic, exonic, intronic, intergenic and in transposons). P-values were determined via permutation tests (number of permutations = 1000). A Python implementation of MDMR based on the program DISTLM developed by Marti J. Anderson (35) was used. The program can be found at http://www.stsiweb.org/index.php/infrastructure/software_data/multivariate_distance_matrix_regression_mdmr/. The R package pvclust was used to calculate bootstrap values for the clusters in Figure 2B. The values in green above each part of the tree represent the bootstrap probability that the cluster does exist in our data set and the values in red represent the approximately unbiased (AU) p-values. The null hypothesis for both of these tests is that the cluster does not exist in our data set. Consequently, a high percentage indicates a high confidence that the cluster does indeed exist. Red rectangles indicate clusters that have an AU value greater than 95%. Not only do all of the replicates cluster strongly with one another as expected, but the ancestors also cluster strongly together in a group that is separate from the descendants.

Identification of DMRs

DMRs were identified using the methylPipe package in R (36). Each specific methylation context (CG, CHG and CHH) was scanned genome-wide, requiring at least 10 mC differences within a 100 bp window. The 100 bp window size is an initial query which is later reduced to the first and last cytosine in the DMR (which can be less than 100 bp). The methylation level of the sites within a window was then compared across all samples using a Kruskal-Wallis test. Next, these potential DMRs were consolidated by joining neighboring DMRs that occur within 50 bp of each other. The P-values of joined DMRs were combined using Fisher’s method. The P-values of these joined DMRs were then adjusted for multiple hypothesis testing with the Benjamini-Hochberg method as implemented in R, and any DMR with an adjusted P-value below 0.01 was kept. Furthermore, an 8-fold difference in methylation density between the least methylated and most methylated sample was also required. The list of mC DMRs was determined by finding the intersection of the DMRs in all three contexts. The final lists of CG-, nonCG- and C-DMRs can be found in tables S6 to S8.

DMR distribution across gene bodies

To calculate the relative density of DMRs across gene bodies, DMRs (both CG and nonCG) were overlapped with protein coding genes from the TAIR10 reference. For our gene annotations, we used the file found at: ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes.gff. The list of nonCG DMRs was created by taking the intersect of DMRs in the CHG and CHH context. These overlaps were then used to calculate the density of these DMRs within genes by dividing each gene into 12 bins (10 evenly spaced bins in the gene body and a 500 bp bin upstream and downstream of each gene) and calculating the density of DMRs within each of those bins (# of bp of DMR overlap / # of bp within a bin across all genes). The densities for a particular DMR class (i.e., CG or nonCG) were then normalized by the minimum density within that class.

Distribution of DMRs in introns and exons

A list of introns and exons was created using the TAIR10 GFF file obtained at ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes.gff. Next, the number of bases that overlapped between CG-DMRs or nonCG-DMRs and either exons or introns was calculated. This count of bases was then normalized by the total number of CGs/nonCGs in each feature type.

Genome-wide distribution of methylation variation

Counts of SMPs, CG-DMRs, nonCG-DMRs, m-nonCGs, mCGs, mappable nonCGs, and mappable CGs were generated and plotted in 100 equally sized bins across each chromosome. Mappable CGs/nonCGs were defined as those that were covered by at least one read in one of our 16 samples. The SMP, CG-DMR, nonCG-DMR, m-nonCG and mCG counts in each bin were normalized by the count in the respective mappable CG/nonCG bin. These normalized counts were then scaled to one by dividing all elements within a dataset (i.e., within SMPs, CG-DMRs, nonCG-DMRs, m-nonCGs, or mCGs) by the maximum value in their respective dataset. These scaled and normalized counts were then plotted for each chromosome.

Alignment of smRNA reads

smRNA reads were first processed to remove the 3’ adapter sequence, and smRNAs greater than 16 bp in length were aligned to the TAIR10 reference genome using the Bowtie alignment algorithm with the following parameters: --solexa-quals -e 1 -l 20 -n 0 -a -m 1000 --best --nomaqround. Reads were retained if they matched the genome perfectly and did not map to more than a thousand locations.

Analysis of sequence variants

Primer sets for all 72 of the C-DMRs identified were designed to encompass the entire DMR in addition to an extra 500 bp flanking the DMR. All primer sets can be found in table S15.

Southern blot analysis

Genomic DNA was isolated from single plants using the Qiagen DNeasy kit following the manufacturer’s protocol (Qiagen). Two micrograms of genomic DNA was digested with enzyme (5 units/µg) and incubated for 12 hours at 37°C. Digested DNA was EtOH precipitated on ice for 30 minutes and resuspended in 20 µl of TE. Samples were incubated at 65°C for 10 minutes and then loaded into a 0.7% TAE agarose gel and run overnight at a rate of <1 V/cm. The gel was soaked in Alkaline Transfer Buffer for 15 minutes two times and transferred overnight to Amersham Hybond XL (GE Healthcare, Piscataway, NJ). The membrane was crosslinked using a UV Stratalinker with the default settings. The membrane was incubated in 10 ml of the Sigma Perfect-Hyb buffer (Sigma, St. Louis, MO) for 1 hour at 65°C. Probes were prepared following the manufacturer’s instructions included with the NEBlot Kit (New England Biolabs) and then cleaned with Sephadex G-50 spin columns (GE Healthcare, Piscataway, NJ). The probe was incubated with the membrane overnight. The membrane was then washed once with 20 ml of low stringency buffer (2X SSC, 0.1% SDS) for 5 minutes, twice with high stringency buffer (0.5X SSC, 0.1% SDS) for 20 minutes, and a final time in ultra-high stringency buffer (0.1X SSC, 0.1% SDS) for 2 minutes. Finally, membranes were exposed overnight to Biomax film (Carestream Health, Rochester, NY).

RNA expression analysis

RNA abundance was assessed by quantitative real-time (qRT)-PCR, as described in Lewsey et al. (37). Primers against transcripts of interest are listed in table S16. Data were analyzed using LinRegPCR (38, 39) and qBasePlus (40) software to give efficiency-corrected relative fold changes in RNA abundance with correct propagation of errors.

SOM Text

Epialleles that did not alter mRNA levels

Four C-DMRs showed little or no correlation between mRNA and methylation levels; these genes all showed a strong correlation between methylation density and 24nt smRNAs (fig. S9). It should be noted, however, that these genes also contained the smallest and least dense C-DMRs of those tested and may only be expressed in a tissue-specific manner, similar to FWA (41).

Explanation of possible sources of epiallele formation and stability

By sequencing the methylomes of eight MA lines, approximately 1.7% variation in DNA methylation was observed across five descendant lines separated by 30 generations; this amount of variation is ~5 orders of magnitude greater than the measured genetic variation observed in these same lines. Although full genome sequencing of the MA population did not uncover a single genetic variant that arose independently multiple times in these descendant lineages (18), our analysis of variation at the level of DNA methylation has identified numerous sites that are discordant in multiple descendant lines compared with their ancestral population. Furthermore, none of the 114,287 SMPs, 284 nonCG-DMRs or 72 C-DMRs and only one of the 2,485 CG-DMRs overlapped with the mutations previously identified by resequencing these MA lines (18). Additionally, the possibility that these C-DMRs are due to local cis-linked variation (transposon insertions) was ruled out by genomic characterization of these regions. Copy number variants that could act in cis or trans were also eliminated as a possible source of epiallele formation, indicating that C-DMRs can arise independently of such mutations. Therefore, while mutations found in the MA lines were randomly distributed [with the exception of G:C→A:T transitions (18)], the variation in DNA methylation and the spontaneous formation of epialleles is likely constrained to specific sequences or chromosomal contexts (Fig. 2, E and F, and figs. S4 and S5).

One possible source of naturally occurring epialleles is through the RNAi pathway (15). Interestingly, one of the C-DMRs that extends into protein-coding genes occurred at a locus with overlapping sense and antisense transcripts (fig. S8), possibly targeting this region for RdDM silencing. Expression of overlapping transcripts in the same cell and at the same developmental time could result in formation of dsRNA, which may trigger the production of small RNAs that direct DNA methylation to these target genes (1).

With only seven occurrences out of a possible 576 events of a change in the methylation status found between biological replicates (siblings), we conclude that these identified epialleles, in large part, are meiotically heritable. As shown by their functional effects on transcription, these novel epialleles, which arose spontaneously over 30 generations (~four years in chronological time), have a significant potential to alter the phenotype of the host organism.

References

1. J. A. Law, S. E. Jacobsen, Nat Rev Genet 11, 204 (2010).
15. F. K. Teixeira et al., Science 323, 1600 (2009).
17. R. G. Shaw, D. L. Byers, E. Darmo, Genetics 155, 369 (2000).
18. S. Ossowski et al., Science 327, 92 (2010).
21. M. A. Zapala, N. J. Schork, Proc Natl Acad Sci U S A 103, 19430 (2006).
31. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Genome Biol 10, R25 (2009).
32. R. Lister et al., Nature 462, 315 (2009).
33. H. A. Schmidt, K. Strimmer, M. Vingron, A. von Haeseler, Bioinformatics 18, 502 (2002).
34. C. M. Nievergelt, O. Libiger, N. J. Schork, PLoS Genet 3, e51 (2007).
35. B. H. McArdle, M. J. Anderson, Ecology 82, 290 (2001).
36. R. Lister et al., Nature 471, 68 (2011).
37. M. G. Lewsey et al., Mol Plant Microbe Interact 23, 835 (2010).
38. C. Ramakers, J. M. Ruijter, R. H. Deprez, A. F. Moorman, Neurosci Lett 339, 62 (2003).
39. J. M. Ruijter et al., Nucleic Acids Res 37, e45 (2009).
40. J. Hellemans, G. Mortier, A. De Paepe, F. Speleman, J. Vandesompele, Genome Biol 8, R19 (2007).
41. T. Kinoshita et al., Science 303, 521 (2003).

References and Notes

1. J. A. Law, S. E. Jacobsen, Nat Rev Genet 11, 204 (2010).

2. S. J. Cokus et al., Nature 452, 215 (2008).

3. R. Lister et al., Cell 133, 523 (2008).

4. X. Zhang et al., Cell 126, 1189 (2006).

5. D. Zilberman, M. Gehring, R. K. Tran, T. Ballinger, S. Henikoff, Nat. Genet. 39, 61 (2006).

6. S. W. Chan, X. Zhang, Y. V. Bernatavichute, S. E. Jacobsen, PLoS Biol. 4, e363 (2006).

7. S. W. Chan et al., Science 303, 1336 (2004).

8. J. Paszkowski, U. Grossniklaus, Curr Opin Plant Biol, (2011).

9. E. J. Richards, Nat. Rev. Genet. 7, 395 (2006).

10. K. Shibuya, S. Fukushima, H. Takatsuji, Proc Natl Acad Sci U S A 106, 1660 (2009).

11. P. Cubas, C. Vincent, E. Coen, Nature 401, 157 (1999).

12. K. Manning et al., Nat Genet 38, 948 (2006).

13. A. J. Thompson et al., Plant Physiol 120, 383 (1999).

14. F. Johannes et al., PLoS Genet 5, e1000530 (2009).

15. F. K. Teixeira et al., Science 323, 1600 (2009).

16. A. Vongs, T. Kakutani, R. A. Martienssen, E. J. Richards, Science 260, 1926 (1993).

17. R. G. Shaw, D. L. Byers, E. Darmo, Genetics 155, 369 (2000).

18. S. Ossowski et al., Science 327, 92 (2010).

19. Additional experiments and descriptions of methods used to support our conclusions are presented as supporting material on Science Online.

20. C. M. Nievergelt et al., Am J Med Genet B Neuropsychiatr Genet 141B, 234 (2006).

21. M. A. Zapala, N. J. Schork, Proc Natl Acad Sci U S A 103, 19430 (2006).

22. R. K. Chodavarapu et al., Nature 466, 388 (2010).

23. J. Liu, Y. He, R. Amasino, X. Chen, Genes Dev 18, 2873 (2004).

24. J. Bender, G. R. Fink, Cell 83, 725 (1995).

25. H. Yi, E. J. Richards, Genetics 183, 1227 (2009).

26. S. E. Jacobsen, E. M. Meyerowitz, Science 277, 1100 (1997).

27. H. Saze, T. Kakutani, EMBO J., (2007).

28. W. J. Soppe et al., Mol. Cell 6, 791 (2000).

29. R. A. Mosher et al., Nature 460, 283 (2009).

30. R. K. Slotkin et al., Cell 136, 461 (2009).

31. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Genome Biol 10, R25 (2009).

32. R. Lister et al., Nature 462, 315 (2009).

33. H. A. Schmidt, K. Strimmer, M. Vingron, A. von Haeseler, Bioinformatics 18, 502 (2002).

34. C. M. Nievergelt, O. Libiger, N. J. Schork, PLoS Genet 3, e51 (2007).

35. B. H. McArdle, M. J. Anderson, Ecology 82, 290 (2001).

36. R. Lister et al., Nature 471, 68 (2011).

37. M. G. Lewsey et al., Mol Plant Microbe Interact 23, 835 (2010).

38. C. Ramakers, J. M. Ruijter, R. H. Deprez, A. F. Moorman, Neurosci Lett 339, 62 (2003).

39. J. M. Ruijter et al., Nucleic Acids Res 37, e45 (2009).

40. J. Hellemans, G. Mortier, A. De Paepe, F. Speleman, J. Vandesompele, Genome Biol 8, R19 (2007).

41. T. Kinoshita et al., Science 303, 521 (2003).

[Figure: pedigree scheme. Founder, Col-0; ancestral lines 1, 12 and 19; descendant lines 29, 49, 59, 69 and 119 after 30 generations; rep 1 and rep 2 shown for each line.]

Supplemental Figure 1. A scheme of the generation of the mutation accumulation population used in this study. A single founder line was used to generate this population. The three replicate ancestral lines (1, 12, 19) are separated from the original founder by three generations. The five replicate descendant lines are 30 generations removed from the original founder line.
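The generation numbers in this scheme are the regressor in the linear-regression estimate of the epimutation rate described in Materials and Methods. The following is a minimal Python sketch of that calculation, not the authors' R code: the SMP counts and the number of covered sites are hypothetical placeholders, the generation numbers are taken from the caption above (3 for ancestors, 30 for descendants), and `fit_line` is an illustrative helper that reproduces only the slope and intercept that R's lm would report.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = b1 * x + b0; returns (b1, b0)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return b1, y_bar - b1 * x_bar

# Line labels and generation numbers as given in Supplemental Figure 1.
lines = ["1", "12", "19", "29", "49", "59", "69", "119"]
generations = [3, 3, 3, 30, 30, 30, 30, 30]
# HYPOTHETICAL SMP counts relative to a reference methylome (not study data).
smp_counts = [5000, 5200, 4800, 26000, 27000, 25500, 31000, 26500]

# Slope b1 is the estimated number of new SMPs per generation.
b1, b0 = fit_line(generations, smp_counts)

# Repeat the fit with each sample removed in turn to spot influential lines.
leave_one_out = {}
for i, name in enumerate(lines):
    x = generations[:i] + generations[i + 1:]
    y = smp_counts[:i] + smp_counts[i + 1:]
    leave_one_out[name] = fit_line(x, y)[0]

# Per-site rate: slope divided by the number of sites covered in all samples.
covered_sites = 1_500_000  # HYPOTHETICAL count of covered CG sites
per_site_rate = b1 / covered_sites

print(f"slope = {b1:.1f} SMPs per generation, per-site rate = {per_site_rate:.2e}")
```

In this toy data set the slope drops when line 69 is left out, which mirrors the kind of sensitivity check reported in the regression section; the standard error and adjusted R² are not computed here.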

[Figure: stacked bar chart. x-axis, ancestors (1, 19, 12) and descendants (29, 49, 59, 69, 119); y-axis, total number of SMPs (0 to 140,000); legend, unmethylated SMP and methylated SMP.]

Supplemental Figure 2. Total number of methylated and unmethylated SMPs per line. The number of methylated SMPs per line is not dependent on ancestral or descendant status. Red portions indicate unmethylated SMPs and blue portions indicate methylated SMPs.
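The per-line tallies plotted in this figure follow from the CG-SMP filtering rules given in Materials and Methods (coverage in every replicate, agreement between replicates, and at least one line differing from the others). The sketch below shows one possible reading of those rules in Python. The function names, the data layout (`True`, `False`, or `None` for an uncovered site) and the toy calls are assumptions made for illustration, and a site is dropped for all lines when any single line fails the coverage or agreement test.

```python
def call_smps(calls):
    """Return {site_index: {line: state}} for sites that qualify as CG-SMPs.

    calls maps a line name to (rep1, rep2); each replicate is a list with one
    entry per CG site: True (methylated), False (unmethylated), None (no coverage).
    """
    n_sites = len(next(iter(calls.values()))[0])
    smps = {}
    for i in range(n_sites):
        states = {}
        for line, (rep1, rep2) in calls.items():
            if rep1[i] is None or rep2[i] is None or rep1[i] != rep2[i]:
                break  # uncovered, or replicates disagree: drop the site
            states[line] = rep1[i]
        else:
            if len(set(states.values())) > 1:  # not identical in all lines
                smps[i] = states
    return smps

def tally_per_line(smps, lines):
    """Count methylated and unmethylated SMPs for each line."""
    counts = {line: {"methylated": 0, "unmethylated": 0} for line in lines}
    for states in smps.values():
        for line, methylated in states.items():
            counts[line]["methylated" if methylated else "unmethylated"] += 1
    return counts

# HYPOTHETICAL calls for three lines at four CG sites.
toy_calls = {
    "1":  ([True, False, True, None],  [True, False, True, True]),
    "29": ([True, True, False, True],  [True, True, True, True]),
    "69": ([True, False, False, False], [True, False, False, False]),
}
toy_smps = call_smps(toy_calls)
toy_counts = tally_per_line(toy_smps, list(toy_calls))
print(toy_counts)
```

Only the second site survives in the toy data: the first is identical in all lines, the third has disagreeing replicates in line 29, and the fourth lacks coverage in one replicate of line 1.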

[Figure: cluster dendrogram with AU/BP values (%). Cluster method, complete; distance, Euclidean; y-axis, height; leaves, replicates r1 and r2 of lines 29, 59, 119, 69, 49, 1, 12 and 19.]

Supplemental Figure 3. Results from clustering using the R package pvclust. Values in red represent the approximately unbiased (AU) values and values in green represent bootstrap probability values. Red rectangles indicate portions of the tree that have AU values above 95. Based on these rectangles, one can see that the ancestors group together away from the descendants.
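The dendrogram itself is built by complete-linkage hierarchical clustering on Euclidean distances, as stated in the figure labels. The following Python sketch illustrates only that clustering step on made-up data. It does not reproduce pvclust or its multiscale bootstrap, so it yields no AU or BP values; `complete_linkage` and the methylation-density vectors are hypothetical.

```python
from itertools import combinations
from math import dist

def complete_linkage(points):
    """Agglomerative clustering with complete linkage and Euclidean distance.

    points maps a label to a numeric vector. Returns the list of merges as
    (members_a, members_b, height), in the order in which they occur.
    """
    def linkage(a, b):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(dist(points[p], points[q]) for p in a for q in b)

    clusters = [frozenset([label]) for label in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((sorted(a), sorted(b), linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# HYPOTHETICAL methylation densities at three DMRs for two lines, two replicates each.
toy_densities = {
    "1_r1": (0.90, 0.80, 0.10),
    "1_r2": (0.90, 0.78, 0.10),
    "29_r1": (0.10, 0.20, 0.90),
    "29_r2": (0.15, 0.20, 0.90),
}
toy_merges = complete_linkage(toy_densities)
for members_a, members_b, height in toy_merges:
    print(members_a, members_b, round(height, 3))
```

In this toy example the replicates of each line merge first and the two lines join last, which is the qualitative pattern the figure reports for replicates, ancestors and descendants.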

[Figure: one panel per chromosome. x-axis, position (Mb); y-axis, normalized, scaled count (0 to 1.0); legend, mCG, CG-SMP and CG-DMR.]

Supplemental Figure 4. Genome-wide distribution of mCGs, CG-SMPs and CG-DMRs for each chromosome. Each chromosome was broken down into 100 equally spaced bins. The CG-SMP, CG-DMR, and mCG counts in each bin were normalized by the number of mappable CGs within each bin. Red shaded boxes indicate positions of the centromeres for each chromosome. Red lines indicate mCGs, green lines indicate CG-SMPs and blue lines indicate CG-DMRs.
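The binning, normalization and scaling behind the tracks of Supplemental Figure 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: `binned_scaled_density` is a hypothetical helper, positions are 0-based coordinates on a single chromosome, and bins without any mappable site are given a value of 0.

```python
def binned_scaled_density(feature_pos, mappable_pos, chrom_len, n_bins=100):
    """Bin positions, normalize by mappable sites per bin, scale so the maximum is 1."""
    width = chrom_len / n_bins

    def count(positions):
        counts = [0] * n_bins
        for pos in positions:
            # Clamp so that a position at the chromosome end falls in the last bin.
            counts[min(int(pos / width), n_bins - 1)] += 1
        return counts

    features, mappable = count(feature_pos), count(mappable_pos)
    # Normalize each bin by its number of mappable sites (assumed 0 when none).
    normalized = [f / m if m else 0.0 for f, m in zip(features, mappable)]
    peak = max(normalized)
    # Scale the whole track by its own maximum.
    return [v / peak if peak else 0.0 for v in normalized]

# HYPOTHETICAL positions on a 100 bp toy chromosome split into 4 bins.
toy_mappable = [1, 2, 3, 4, 30, 31, 60, 61, 62, 63, 90]
toy_features = [1, 2, 30, 60]
toy_track = binned_scaled_density(toy_features, toy_mappable, chrom_len=100, n_bins=4)
print(toy_track)
```

Scaling each data set by its own maximum puts tracks with very different absolute counts, such as mCGs and CG-DMRs, on a common 0 to 1 axis.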

[Figure: one panel per chromosome. x-axis, position (Mb); y-axis, normalized, scaled count (0.0 to 1.0); legend, m-nonCG and nonCG-DMR.]

Supplemental Figure 5. Genome-wide distribution of m-nonCGs and nonCG-DMRs for each chromosome. Each chromosome was broken down into 100 equally spaced bins. The nonCG-DMR and m-nonCG counts in each bin were normalized by the number of mappable nonCGs within each bin. Red lines indicate m-nonCGs, green lines indicate nonCG-DMRs.

[Figure: Southern blots, panels A and B. Lanes, lines 1, 12, 19, 29, 49, 59, 69 and 119; size markers, 1 kb to 10 kb.]

Supplemental Figure 6. Southern blot analysis of the C-DMR region overlapping with At5g24240 (A) and the C-DMR overlapping with At3g01345 (B). Arrows indicate the expected size of fragments from genomic digestions with BamHI (A) and EcoRI (B). The higher product observed in (B) is a homologous sequence that is invariably present in all eight lines.

[Figure: Venn diagram of C-DMRs in Line 1, Line 12 and Line 19.]

Supplemental Figure 7. A Venn diagram representing the overlap of C-DMRs between the three ancestral lines studied (1, 12, 19). In total, 46 C-DMRs are found in agreement in all three lines. The ancestral lines also contain C-DMRs that are discordant among them, indicating possible hotspots of epiallelic variation.

[Figure: panels A to H. DNA methylation tracks for both replicates of ancestors (1, 12, 19) and descendants (29, 49, 59, 69, 119); log2 fold change in mRNA levels of At3g01345 (relative to line 19), At1g53490 (relative to line 29) and At1g53480 (relative to line 19); smRNA levels at the C-DMRs (RPKCMs) for 21nt, 22nt, 23nt and 24nt smRNAs.]

Supplemental Figure 8. (A) An example of a C-DMR at At3g01345 that has lost methylation in all biological replicates of descendant lines 29, 49, 69 and 119. (B) qRT-PCR analysis of At3g01345 reveals >500-fold increase in mRNA abundance in the unmethylated lines. (C) The 24nt smRNAs are associated with transcriptional silencing in each of the corresponding MA lines. Interestingly, 21nt smRNAs appear in the MA lines that are transcriptionally active. (D and E) Examples of C-DMRs that overlap with protein-coding regions. qRT-PCR results that reveal a strong correlation between an absence of the mC-DMR and an increase in mRNA abundance. (E), smRNA levels at each C-DMR. 24nt smRNA levels correlate with mC-DMRs and less abundant mRNA levels.

[Figure: panels A to L. DNA methylation tracks for both replicates of ancestors (1, 12, 19) and descendants (29, 49, 59, 69, 119); log2 fold change in mRNA abundance of At2g44450 (relative to line 29), At3g22770 (relative to line 1), At5g24250 (relative to line 49) and At5g66300 (relative to line 29); smRNA levels at each C-DMR (RPKCMs) for 21nt, 22nt, 23nt and 24nt smRNAs.]

Supplemental Figure 9. (A-D), Examples of C-DMRs that overlap with protein-coding regions visualized in DNA methylation tracks of ancestral and descendant lines. (E-H), qRT-PCR results that reveal no correlation between an absence of the mC-DMR and an increase in mRNA abundance. (I-L), 24nt smRNA levels associate with mC-DMRs.

[Supplemental Figure 10, panels A-E (image not reproduced). Panel titles: At4g14548, At5g43500.1, At5g43500.2. Axis labels: Log2 fold change in ncRNA levels of At4g14548 (relative to line 29); Log2 fold change in mRNA levels of At5g43500.2 (relative to line 1); Log2 fold change in mRNA levels of At5g43500.1 (relative to line 1). Samples: ancestor and descendant lines 1, 19, 12, 29, 49, 59, 69, and 119.]

Supplemental Figure 10. Epiallelic regions overlapping ncRNAs and alternative transcriptional start sites. (A) An example of a C-DMR at a ncRNA (At4g14548). DNA methylation is absent from both biological replicates of line 1. (B) qRT-PCR analysis of At4g14548 reveals increased mRNA abundance in the line that has lost DNA methylation. (C) A C-DMR occurring near the transcriptional start site of one of two splice variants. A zoomed-in view of the C-DMR reveals an additional region of variation in DNA methylation (outlined with a red box). (D) qRT-PCR analysis of the At5g43500.2 splice variant shows no correlation between mRNA expression and methylation state. (E) The methylation status of this region is associated with the transcriptional output of At5g43500.1. Error bars indicate standard error of the mean (s.e.m.).

Supplemental Figure 11. An estimate of the rate of epimutations per generation (i.e., the number of CGs that change methylation status per generation) was generated using the program Tree Puzzle to create a phylogram of all 8 lines.

Table S1. MethylC-Seq data set details. The non-conversion percentage is a measure of the bisulfite conversion reaction efficiency. A 1% non-conversion rate indicates that 99% of unmethylated Cs were converted by the reaction. The non-conversion rate is determined as described by Lister et al 2008 Cell (3).

Sample          Non-conversion %  Mapped Reads  Genome Coverage  Strand Coverage
Col-0 1 rep1    2.36%             43270451      34.90725459      17.45362729
Col-0 1 rep2    0.58%             42580881      34.35096282      17.17548141
Col-0 12 rep1   0.76%             45094999      36.37915886      18.18957943
Col-0 12 rep2   0.66%             47295293      38.15418595      19.07709297
Col-0 19 rep1   0.87%             41886862      33.79108195      16.89554097
Col-0 19 rep2   0.66%             43027485      34.7112484       17.3556242
Col-0 29 rep1   0.75%             45684189      36.8544718       18.4272359
Col-0 29 rep2   0.63%             43861030      35.38368807      17.69184403
Col-0 49 rep1   2.91%             42534353      34.31342763      17.15671382
Col-0 49 rep2   1.03%             39304942      31.7081885       15.85409425
Col-0 59 rep1   0.74%             43077476      34.75157728      17.37578864
Col-0 59 rep2   0.80%             38724635      31.24004168      15.62002084
Col-0 69 rep1   0.84%             40166651      32.40334871      16.20167435
Col-0 69 rep2   0.74%             40098234      32.34815516      16.17407758
Col-0 119 rep1  0.77%             44582169      35.96544726      17.98272363
Col-0 119 rep2  1.07%             26017354      20.98878978      10.49439489
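As a quick check on Table S1, the sketch below derives the bisulfite conversion efficiency from the non-conversion percentage, as the caption describes, and tests the apparent pattern that strand coverage is half of genome coverage. The two rows are copied from the table. The function and variable names are illustrative, and the halving relationship is an observation from the table values, not something the text states.

```python
# Two rows copied from Table S1: (non-conversion %, genome coverage, strand coverage).
table_s1 = {
    "Col-0 1 rep1": (2.36, 34.90725459, 17.45362729),
    "Col-0 119 rep2": (1.07, 20.98878978, 10.49439489),
}


def conversion_efficiency(non_conversion_pct):
    """Percentage of unmethylated Cs converted by the bisulfite reaction."""
    return 100.0 - non_conversion_pct


def strand_matches_half_genome(genome_cov, strand_cov, tol=1e-6):
    """Check the observed pattern: strand coverage equals genome coverage / 2."""
    return abs(genome_cov / 2.0 - strand_cov) < tol


for sample, (non_conv, genome_cov, strand_cov) in table_s1.items():
    print(
        sample,
        f"conversion efficiency = {conversion_efficiency(non_conv):.2f}%",
        f"strand = genome / 2: {strand_matches_half_genome(genome_cov, strand_cov)}",
    )
```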

Table S2. Variation in CG methylation within the descendant lines. The number of variable CGs was calculated by looking for sites with coverage in all 5 descendant samples that did not completely agree (i.e., were not all methylated or all unmethylated). This number was then divided by the number of CGs present in the Arabidopsis genome on both strands (6,269,413).

            Variable CGs  Percentage of CGs that vary out of all CGs  Difference
Omit 69_2   77270         1.23%                                       0.38%
Omit 119_1  88356         1.41%                                       0.21%
Omit 59_2   86519         1.38%                                       0.24%
Omit 29     87171         1.39%                                       0.22%
Omit 49_1   87502         1.40%                                       0.22%
All         101256        1.62%                                       0.00%
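The percentages in Table S2 follow from dividing each count of variable CGs by the 6,269,413 CGs given in the caption. A minimal sketch of that arithmetic is shown here; it assumes the "Difference" column is the "All" percentage minus the row's percentage, computed before rounding, which is consistent with the tabulated values. Variable names are illustrative.

```python
TOTAL_CGS = 6_269_413  # CGs in the Arabidopsis genome on both strands (Table S2 caption)

# Variable CG counts copied from Table S2.
variable_cgs = {
    "Omit 69_2": 77270,
    "Omit 119_1": 88356,
    "Omit 59_2": 86519,
    "Omit 29": 87171,
    "Omit 49_1": 87502,
    "All": 101256,
}


def percent_variable(count):
    """Variable CGs as a percentage of all CGs in the genome."""
    return 100.0 * count / TOTAL_CGS


all_pct = percent_variable(variable_cgs["All"])

for label, count in variable_cgs.items():
    pct = percent_variable(count)
    # Assumed definition: difference relative to the "All" row.
    print(f"{label:<11}{count:>7}  {pct:.2f}%  {all_pct - pct:.2f}%")
```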

Table S3. Number of dissimilar CG-SMPs used in the heatmap construction in Figure 1D. This table summarizes the number of dissimilar sites of all pairwise comparisons of the samples. A site is considered dissimilar if one sample has a methylated CG at a particular position and the other has an unmethylated CG. Only positions that agreed within replicates and that were covered in all 8 samples were considered.

     19     1      12     49     29     59     69     119
19   0      17894  15289  30418  30871  31971  42567  29263
1    17894  0      16349  31756  31949  33053  43879  30879
12   15289  16349  0      29379  29702  30278  41148  27592
49   30418  31756  29379  0      40749  41755  50797  39535
29   30871  31949  29702  40749  0      41520  51866  39698
59   31971  33053  30278  41755  41520  0      52174  40232
69   42567  43879  41148  50797  51866  52174  0      50286
119  29263  30879  27592  39535  39698  40232  50286  0
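The counting rule in the Table S3 caption can be sketched as follows: keep only positions where a sample's replicates agree, restrict to positions covered in every sample, then count positions whose methylation calls differ between two samples. The toy calls, the data layout, and the function names below are hypothetical; the actual pipeline used for the table is not described here.

```python
from itertools import combinations

# Hypothetical toy data: sample -> {position: (rep1 call, rep2 call)}.
# True = methylated CG, False = unmethylated CG, None = position not covered.
calls = {
    "19": {1: (True, True), 2: (False, False), 3: (True, True), 4: (True, False), 5: (True, True)},
    "1": {1: (True, True), 2: (True, True), 3: (True, True), 4: (True, True), 5: (False, False)},
    "69": {1: (False, False), 2: (True, True), 3: (True, True), 4: (True, True), 5: (None, True)},
}


def consensus(sample_calls):
    """Keep positions covered in both replicates where the replicates agree."""
    return {
        pos: rep1
        for pos, (rep1, rep2) in sample_calls.items()
        if rep1 is not None and rep1 == rep2
    }


def dissimilarity_counts(all_calls):
    """Count dissimilar sites for every pair of samples over shared positions."""
    cons = {sample: consensus(c) for sample, c in all_calls.items()}
    shared = set.intersection(*(set(c) for c in cons.values()))
    counts = {}
    for a, b in combinations(cons, 2):
        n = sum(cons[a][pos] != cons[b][pos] for pos in shared)
        counts[(a, b)] = counts[(b, a)] = n
    return counts


counts = dissimilarity_counts(calls)
for pair in combinations(calls, 2):
    print(pair, counts[pair])
```

The result is symmetric by construction, which matches the layout of the table.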

Table S4. Pearson correlation coefficients of the methylation status of CG MPs for pairs of samples. Samples labeled 19, 1, and 12 represent the ancestral lines, while the labels 49, 29, 59, 69, and 119 represent the descendant lines. All pairwise correlations were highly statistically significant.

     19   1    12   49   29   59   69   119
19   1    0.7  0.7  0.5  0.4  0.4  0.3  0.5
1    0.7  1    0.7  0.4  0.4  0.4  0.2  0.4
12   0.7  0.7  1    0.5  0.5  0.5  0.3  0.5
49   0.5  0.4  0.5  1    0.3  0.3  0.1  0.3
29   0.4  0.4  0.5  0.3  1    0.3  0.1  0.3
59   0.4  0.4  0.5  0.3  0.3  1    0.1  0.3
69   0.3  0.2  0.3  0.1  0.1  0.1  1    0.1
119  0.5  0.4  0.5  0.3  0.3  0.3  0.1  1
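For readers who want to see how a coefficient like those in Table S4 is obtained, here is a minimal sketch of a Pearson correlation between two samples' methylation status, encoded as 1 (methylated) and 0 (unmethylated) at the same positions. The two toy vectors are hypothetical, and the encoding is an assumption; the table's own input data are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical methylation status at eight shared CG positions in two samples.
sample_a = [1, 1, 0, 0, 1, 0, 1, 0]
sample_b = [1, 1, 0, 0, 1, 0, 0, 1]

print(f"r = {pearson(sample_a, sample_b):.2f}")
```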

Table S5. Estimation of the "epimutation rate"

MA line  Number of SMPs  Excluded Sample  B1      Std Error  p-value   B0       Std Error  p-value  Adjusted R2
1        9477            1                818.4   122.7      0.00055   3340.3   3013.2     0.31     0.8613
12       6872            12               790.2   127.1      0.0008    4233.1   3119.7     0.224    0.8433
19       8417            19               806.9   125.4      0.00067   3703.6   3079.2     0.274    0.8523
29       26404           29               808.1   121.3      0.00055   4259.2   2669       0.162    0.861
49       26215           49               809.7   120.9      0.00054   4255.6   2658.6     0.161    0.8624
59       27286           59               800.5   123        0.00063   4275.9   2706.5     0.165    0.8551
69       38464           69               703.66  59.27      2.16E-05  4488.05  1303.79    0.014    0.9524
119      24820           119              821.8   116.1      0.0004    4229.1   2554.2     0.149    0.8752
All                                       788     109        0.00017   4303     2526       0.132    0.865

Table S6. CG DMRs

Chr   Start     End
chr1  255844    255926
chr1  381260    381452
chr1  441191    441410
chr1  526043    526112
chr1  532230    532422
chr1  587342    587438
chr1  660152    660264
chr1  732525    732611
chr1  841846    841962
chr1  888548    888624
chr1  975288    975354
chr1  1211824   1211853
chr1  1341709   1341834
chr1  1539757   1539833
chr1  1592112   1592281
chr1  1883984   1884317
chr1  1983260   1983456
chr1  2158138   2158214
chr1  2255350   2255398
chr1  2257767   2257897
chr1  2290414   2290470
chr1  2314296   2314401
chr1  2314756   2314878
chr1  2396729   2396868
chr1  2415147   2415253
chr1  2415897   2416012
chr1  2427801   2427843
chr1  2527004   2527111
chr1  2620053   2620107
chr1  2634475   2634652
chr1  2846075   2846120
chr1  2877666   2877943
chr1  2883508   2883581
chr1  2918884   2919057
chr1  2935802   2935857
chr1  2978435   2978477
chr1  2987099   2987249
chr1  3390557   3390711
chr1  3616994   3617023
chr1  3895472   3895632
chr1  3932806   3933083
chr1  3939907   3940034
chr1  3944521   3944711
chr1  3961365   3961438
chr1  3977452   3977647
chr1  3990038   3990225
chr1  4041832   4041904
chr1  4046727   4046803
chr1  4047435   4047603
chr1  4052999   4053047
chr1  4115107   4115275
chr1  4261545   4261683
chr1  4267390   4267438
chr1  4269263   4269352
chr1  4455855   4456044
chr1  4464498   4464679
chr1  4474807   4474862
chr1  4505858   4506156
chr1  4570778   4570936
chr1  4603628   4603758
chr1  4636414   4636596
chr1  4640722   4640795
chr1  4677426   4677475
chr1  4725010   4725087
chr1  4780177   4780207
chr1  4966064   4966133
chr1  5305498   5305638
chr1  5336214   5336295
chr1  5519896   5520087
chr1  5537915   5537982
chr1  5616834   5617051
chr1  5699867   5699935
chr1  5810548   5810606
chr1  5866297   5866403
chr1  5866669   5866807
chr1  6007734   6007848
chr1  6056491   6056543
chr1  6117648   6117748



























































chr5  24141080  24141189
chr5  24160569  24160705
chr5  24232325  24232500
chr5  24294217  24294322
chr5  24308659  24308828
chr5  24329624  24329803
chr5  24410994  24411044
chr5  24452336  24452373
chr5  24509929  24510113
chr5  24535618  24535699
chr5  24567268  24567425
chr5  24603872  24604070
chr5  24631943  24631994
chr5  24637347  24637389
chr5  24750809  24750998
chr5  24927736  24927886
chr5  24930084  24930122
chr5  25115002  25115054
chr5  25121744  25121781
chr5  25143850  25144022
chr5  25166778  25166856
chr5  25734873  25735062
chr5  25855444  25855851
chr5  25860926  25861108
chr5  26060649  26060729
chr5  26480996  26481144
chr5  26774761  26774926
chr5  26866608  26866642
chr5  26944429  26944511

Table S7. nonCG-DMRs

Chromosome  Start of DMR  End of DMR
chr1        1592356       1592637
chr1        3573146       3573222
chr1        4966297       4966422
chr1        5866838       5866859
chr1        6263209       6263221
chr1        6722164       6722262
chr1        7138304       7138375
chr1        7974347       7974450
chr1        8282978       8283281
chr1        8643708       8643811
chr1        9202795       9202876
chr1        9203037       9203047
chr1        9573094       9573181
chr1        9630612       9630620
chr1        10363137      10363294
chr1        10545210      10545361
chr1        11027392      11027452
chr1        11044210      11044387
chr1        11066880      11066949
chr1        11318367      11318510
chr1        12083782      12083973
chr1        12097179      12097271
chr1        12166344      12166406
chr1        12525133      12525226
chr1        12542482      12542546
chr1        12642585      12642632
chr1        12693175      12693239
chr1        13024754      13024945
chr1        13065620      13065820
chr1        13106450      13106611
chr1        13351839      13351879
chr1        13353183      13353218
chr1        14018970      14019221
chr1        15935722      15935752
chr1        16466637      16466762
chr1        16492341      16492387
chr1        16738607      16738665
chr1        16790647      16790737
chr1        16840668      16840694
chr1        16840809      16840838
chr1        16881679      16881947
chr1        16925272      16925570
chr1        16938736      16938817
chr1        17229354      17229433
chr1        17229682      17229790
chr1        17560824      17560959
chr1        17640323      17640387
chr1        17695664      17695775
chr1        17714774      17715037
chr1        17873776      17873899
chr1        18078715      18078850
chr1        18212446      18212474
chr1        18212597      18212655
chr1        18759206      18759300
chr1        18955917      18955988
chr1        18973998      18974100
chr1        19171707      19171747
chr1        19320020      19320070
chr1        19700826      19700849
chr1        19713510      19713543
chr1        19744226      19744319
chr1        19899550      19899605
chr1        19963491      19963504
chr1        19963717      19963999
chr1        20422773      20422835
chr1        20777878      20777892
chr1        21249653      21249882
chr1        21849552      21849719
chr1        23755331      23755556
chr1        23856245      23856312
chr1        24395779      24395842
chr1        24728427      24728472
chr1        24733496      24733627
chr1        24948387      24948444
chr1        25057352      25057418
chr1        25215972      25216009
chr1        25723697      25723885
chr1        25750883      25750996
chr1        26185822      26185877
chr1        26634937      26634964
chr1        26761012      26761054
chr1        26771355      26771486
chr1        27059834      27059863
chr1        28470353      28470447
chr1        28515015      28515044
chr1        28515360      28515539
chr1        28515657      28515706
chr1        28515971      28516056
chr2        184697        184717





chr5        18216903      18216918
chr5        18781966      18782066
chr5        18929632      18929737
chr5        19198300      19198974
chr5        19830066      19830111
chr5        22013604      22013625
chr5        22703647      22703678
chr5        22804802      22804948
chr5        23592748      23592853
chr5        24019046      24019168
chr5        26481059      26481282

Table S8. C-DMRs

Chromosome  Start of DMR  End of DMR  Replicates Discordant
1           6721879       6722331     No
1           7138302       7138494     Yes
1           9573098       9573237     No
1           12083824      12083955    No
1           12097125      12097392    No
1           12642562      12642676    No
1           12693068      12693494    No
1           13024899      13025017    No
1           13065621      13065813    No
1           13106344      13106654    No
1           14018983      14019250    No
1           15935702      15935801    Yes
1           16790636      16790824    No
1           16881902      16882300    No
1           17229326      17229416    Yes
1           18078584      18078783    No
1           19744227      19744336    No
1           19963111      19964044    No
1           21249682      21249966    No
1           23755482      23755655    No
1           23856241      23856305    No
1           25215925      25215993    No
1           26185816      26186008    Yes
1           28515048      28516112    No
2           2228985       2229108     No
2           2232080       2232216     No
2           10241165      10241316    No
2           10591404      10591529    No
2           11812783      11812961    No
2           12623205      12623326    No
2           18343591      18343672    No
3           129070        129620      No
3           1848987       1849151     No
3           7253885       7254003     No
3           8048211       8048423     No
3           10782032      10782078    Yes
3           11023752      11023813    No
3           11097266      11097400    Yes
3           11101052      11101159    No
3           12460173      12460346    No
3           14095664      14095826    No
3           15080339      15080442    No
3           15796135      15796984    No
3           17520356      17520587    No
4           5323339       5323500     No
4           5965397       5965593     No
4           5987665       5988307     No
4           6551569       6551829     No
4           7089697       7089756     No
4           7843434       7843652     No
4           8088616       8088806     No
4           13536498      13536663    No
4           17413090      17413185    No
5           486678        486859      No
5           491553        491845      No
5           503109        503284      No
5           3752051       3752582     No
5           7039542       7039625     No
5           8081538       8081752     No
5           8233963       8234252     No
5           9818859       9819021     No
5           10807842      10807960    No
5           13834763      13834799    No
5           14089098      14089483    No
5           15154323      15154464    No
5           16602741      16603118    No
5           16995015      16995176    No
5           17469569      17469734    Yes
5           19198293      19199027    No
5           19830052      19830235    No
5           22804946      22805144    No
5           26480972      26481169    No

Table S9. A table of C-DMRs and corresponding 21nt smRNA RPKCMs

chr start end 1 12 19 29 49 59 69 119

1 6721879 6722331 0 0 0 0 0 0.10027538 0 0 1 7138302 7138494 0 0 0.08118849 0 0 0 0 0 1 9573098 9573237 0 0 0 0 0 0 0 0 1 12083824 12083955 0.37083371 0.92708427 0 0 0.13517849 0 1.30934008 0.19095558 1 12097125 12097392 0.18194463 0 0.05838273 0 0.19897058 0.16975457 0.12848206 0 1 12642562 12642676 0 0 0 0 0 0 0 0 1 12693068 12693494 0 0 0 0 0 0 0.16105498 0 1 13024899 13025017 0.20584413 0 0 0 0 0 0 0 1 13065621 13065813 0 0 0 0.07246398 0 0 0 0 1 13106344 13106654 0 0 0 0 0 0 0 0.16138827 1 14018983 14019250 0 0 0.05838273 0 0 0 0 0 1 15935702 15935801 0 0 0.31491293 0.14053621 0.35774509 0 0.69302445 0 1 16790636 16790824 0 0 0.0829159 0.07400577 0 0.1205438 0 0 1 16881902 16882300 0 0 0 0 0 0 0 0 1 17229326 17229416 0 0 0 0 0 0 0 0 1 18078584 18078783 0 0 0 0 0 0 0 0 1 19744227 19744336 0.89136176 0 0.57204367 0 0 0 0 0.45899415 1 19963111 19964044 0.07810163 0.2343049 0.0668304 0.02982441 0.11388027 0.02428964 0.40444996 0.08043467 1 21249682 21249966 0 0 0 0.09797947 0 0 0 0 1 23755482 23755655 0 0 0 0 0 0 0.59487937 0 1 23856241 23856305 0 0 0 0 0 0.35409743 0 0 1 25215925 25215993 0 0 0 0 0 0 0 0 1 26185816 26186008 0.5060335 1.26508375 0 0 0.09223116 0 0 0 1 28515048 28516112 0.09131432 0.04565716 0.01465055 0.01307621 0.04992965 0.04259819 0.16120634 0.02351051 2 2228985 2229108 0 0 0 0 0 0 0 0 2 2232080 2232216 0 0 0 0 0 0 0 0

2 10241165 10241316 0 0 0 0 0 0 0 0.3313269 2 10591404 10591529 0 0 0 0 0 0 0 0 2 11812783 11812961 0 0 0.3502964 0.0781634 0 0.12731593 0 0.28106945 2 12623205 12623326 0 0 0.12882802 0 0.14635026 0 0 0 2 18343591 18343672 0 0 0 0 0 0 0 0 3 129070 129620 0 0.04416292 0 0.22766867 0.45075881 0.04120406 0.4366054 0.31837503 3 1848987 1849151 0 0 0 0 0.32393382 0.27636872 1.88257555 0 3 7253885 7254003 0 0 0.1321033 0 0 0 0 0 3 8048211 8048423 0 0 0 0.39376656 0 0 0 0 3 10782032 10782078 0 0 0 0.30245837 0 0 0 0 3 11023752 11023813 0 0 0 0 0 0 0 0 3 11097266 11097400 0.36253146 0.36253146 0.23265955 1.66126389 1.85012946 2.19857506 3.58407419 1.30676319 3 11101052 11101159 0 0 0 0 0 0 0 0 3 12460173 12460346 0 0 0 0 0 0 0 0 3 14095664 14095826 0 0 0 0.08588324 0 0 0 0 3 15080339 15080442 0 0 0 0 0 0 0.66611088 0 3 15796135 15796984 0.11443867 0.05721934 0.09180324 0.13110092 0.08343172 0 0.60609028 0.05892858 3 17520356 17520587 0.73604872 0 0 0.66252786 0.30663865 0 1.18804191 0.86632662 4 5323339 5323500 0 0 0 0 0 0 0 0 4 5965397 5965593 0 0 0 0 0 0 0 0 4 5987665 5988307 0.07566856 0.03783428 0 0 0.08274945 0 0.10686826 0 4 6551569 6551829 0 0.09342157 0.17986373 0.32107119 0 0.26148733 0.13194119 0.09621223 4 7089697 7089756 0 1.23506481 0.26420661 0.707445 0.60028414 0.38410568 0.58143576 0.42398612 4 7843434 7843652 0 0.11142022 0 0.19146447 0.48738666 0.31186562 0 0 4 8088616 8088806 0 0 0 0 0 0 0 0 4 13536498 13536663 0 0 0 0.08432173 0 0 0 0.15160716 4 17413090 17413185 0 0 0 0.29290705 0.18640402 0 0 0 5 486678 486859 0.40259019 0 0.34449039 0.15373575 0.29350909 0.37561716 0.94764392 0.13820542 5 491553 491845 0.24955077 0 0.53384212 0 0.30322572 0.77610395 0.93985507 0.25700529 5 503109 503284 0.27759552 0 0.08907537 0.47702006 0.70833528 0 1.96026915 0.42883168 5 3752051 3752582 0 0.04574314 0.02935629 0 0 0 0.32301987 0.04710957 5 7039542 7039625 0 0 0 0 0 0 0 0 5 8081538 8081752 0 0 0 0 0.08274945 0 0 0

5 8233963 8234252 0.50428252 0.50428252 0.05393837 0.3851373 0.67402146 0 0.11870142 0 5 9818859 9819021 0 0 0 0 0.109311 0 0.21175747 0.1544147 5 10807842 10807960 0 0 0 0 0 0 0 0 5 13834763 13834799 0 0 0 0 0 0 0.95290861 0 5 14089098 14089483 0.06308989 0.25235956 0.04048881 0.07227577 0 0.05886295 0.26730943 0.12994899 5 15154323 15154464 0 0 0 0 0 0.16072507 0 0 5 16602741 16603118 0.06442867 0 0.16539194 0.07380947 0 0.24044812 0.27298178 0.1990598 5 16995015 16995176 0 0 0 0 0 0 0 0 5 17469569 17469734 0.58883898 0.73604872 0.18894776 0.08432173 0.96591175 2.19755008 0 0.75803579 5 19198293 19199027 0 0.19855265 0 0.37910314 0.45839136 0.37049976 0.65431327 0.57937068 5 19830052 19830235 0 0 0.34072546 0.38013894 0 0 0 0 5 22804946 22805144 0 0 0 0 0 0 0 0 5 26480972 26481169 0 0 0 0 0 0 0 0


1 6721879 6722331 0 0 0 0 0 0 0 0 1 7138302 7138494 0 0 0 0 0 0 0 0.3908622 1 9573098 9573237 0 0 0 0 0 0 0 0 1 12083824 12083955 0.92708427 1.29151564 0.11899382 0.21241351 0.13517849 0 0.78560405 0.38191116 1 12097125 12097392 0 0 0.05838273 0 0.06632353 0 0 0 1 12642562 12642676 0.21306674 0 0 0 0 0 0 0.21943141 1 12693068 12693494 0 0 0 0 0 0 0.24158247 0 1 13024899 13025017 0 0 0 0 0 0 0 0 1 13065621 13065813 0 0 0 0 0 0 0 0 1 13106344 13106654 0 0 0 0 0.05712381 0 0.11066036 0 1 14018983 14019250 0.09097231 0 0 0 0 0 0 0 1 15935702 15935801 0 0 0.15745646 0 0.17887255 0 0.34651222 0 1 16790636 16790824 0 0 0 0.07400577 0 0 0 0 1 16881902 16882300 0 0 0 0 0 0 0.17238548 0 1 17229326 17229416 0 0 0 0.30917967 0 0 0 0 1 18078584 18078783 0 0 0 0 0 0 0 0.12570443 1 19744227 19744336 0.22284044 0 0.42903275 0 0 0 0.31472211 0.45899415 1 19963111 19964044 0.02603388 0.06044607 0.083538 0.02982441 0.03796009 0.02428964 0.07353636 0.05362311

1 21249682 21249966 0 0 0 0 0 0.0797966 0 0 1 23755482 23755655 0 0 0 0.24126737 0 0 0.39658624 0 1 23856241 23856305 0 0.44059518 0 0 0.27669347 0 0 0 1 25215925 25215993 0 0 0 0 0 0 0 0 1 26185816 26186008 0 0 0.08118849 0 0.09223116 0 0 0 1 28515048 28516112 0.02282858 0.05300393 0.02930111 0.01307621 0 0.02129909 0.09672381 0.02351051 2 2228985 2229108 0 0 0 0 0 0 0 0 2 2232080 2232216 0 0 0.11461904 0.1023021 0 0.16663408 0 0 2 10241165 10241316 0 0 0 0 0 0 0 0 2 10591404 10591529 0 0 0 0 0 0 0 0 2 11812783 11812961 0.13645847 0 0.1751482 0.3126536 0 0 0 0.14053473 2 12623205 12623326 0 0 0 0.22996835 0 0 0 0.41347407 2 18343591 18343672 0 0 0 0 0 0 0 0 3 129070 129620 0.04416292 0 0.05668433 0 0.09659117 0.04120406 0.1247444 0.13644644 3 1848987 1849151 0 0.17193958 0 0.16967177 0.64786763 0.55273744 1.04587531 0 3 7253885 7254003 0 0 0 0 0 0 0 0 3 8048211 8048423 0 0 0 0 0 0 0 0 3 10782032 10782078 0 0 0 0 0 0 0 0 3 11023752 11023813 0 0.46226379 0 0 0 0 0 0 3 11097266 11097400 0.90632865 0.63130056 0.11632978 0.51914497 0.52860842 1.01472695 0.7680159 0.37336091 3 11101052 11101159 0 0 0 0 0 0.21179659 0 0 3 12460173 12460346 0 0 0 0 0 0 0 0 3 14095664 14095826 0 0 0 0 0 0 0 0 3 15080339 15080442 0 0 0 0 0 0 0.99916631 0 3 15796135 15796984 0.05721934 0.0332133 0.0367213 0.1147133 0.16686343 0 0.20203009 0.02946429 3 17520356 17520587 0.84119854 0 0 0.90344708 0.38329831 0 0.59402095 0.75803579

4 5323339 5323500 0 0 0 0 0 0 0 0 4 5965397 5965593 0 0 0 0 0 0 0 0 4 5987665 5988307 0 0 0.02428067 0.02167147 0.13791575 0.03529943 0.10686826 0 4 6551569 6551829 0 0.54227099 0.11990915 0.32107119 0 0.08716244 0.26388239 0.19242447 4 7089697 7089756 0.41168827 0 0.79261983 0.47163 1.20056827 0 0.58143576 0 4 7843434 7843652 0 0.38804713 0 0.12764298 0.40615555 0.20791041 0.15736106 0.22949707 4 8088616 8088806 0 0 0 0 0 0 0 0 4 13536498 13536663 0 0 0 0.08432173 0 0 0 0 4 17413090 17413185 0 0 0.32817242 0.14645353 0 0.23854984 0 0 5 486678 486859 0.13419673 0 0.51673558 0 0 0.6260286 0.94764392 0.27641084 5 491553 491845 0 0 0.48045791 0 0.24258058 0.38805197 0.46992754 0.42834214 5 503109 503284 0.27759552 0.96679171 0.35630148 0.79503343 0.50595377 0 0.98013457 0.14294389 5 3752051 3752582 0.04574314 0.212415 0 0.05240333 0 0 0.38762384 0 5 7039542 7039625 0 0 0 0 0.213354 0 0 0 5 8081538 8081752 0 0 0 0 0 0 0 0 5 8233963 8234252 0.42023543 0.29271375 0.16181512 0.09628433 0.24509871 0 0.11870142 0.34623088 5 9818859 9819021 0 0 0 0.08588324 0.218622 0 0.21175747 0 5 10807842 10807960 0 0 0 0 0 0 0 0 5 13834763 13834799 0 0 0 0 0 0 0 0 5 14089098 14089483 0 0 0 0 0 0.1177259 0 0.19492349 5 15154323 15154464 0 0 0 0 0 0 0 0.17741263 5 16602741 16603118 0 0 0 0 0 0.24044812 0.09099393 0 5 16995015 16995176 0 0 0 0 0 0 0 0 5 17469569 17469734 0.29441949 1.02538514 0.94473878 0.08432173 0.32197058 1.64816256 0 0.30321432 5 19198293 19199027 0 0.07683404 0 0.15164125 0.24125861 0.3087498 0.28041997 0.2385644 5 19830052 19830235 0 0.15408793 0.2555441 0.30411115 0 0 0 0

5 22804946 22805144 0 0 0 0 0 0 0 0 5 26480972 26481169 0 0 0 0 0 0 0 0


1 6721879 6722331 0.05373807 0 0 0 0.03917784 0.15041307 0.53126763 0.11068664 1 7138302 7138494 0.12650837 0.44059518 0.08118849 0 0 0 0 0 1 9573098 9573237 0 0 0 0 0 0.16303766 0 0 1 12083824 12083955 2.22500225 3.44404171 0.47597527 0.63724054 0 0.86497081 2.35681214 0.38191116 1 12097125 12097392 0.54583389 0.10561083 0.23353094 0.1563268 0.19897058 0.25463186 0 0.09368982 1 12642562 12642676 0.21306674 0.24735168 0.13673851 0 0.15533668 0.79516615 0 0 1 12693068 12693494 0 0 0 0 0 0 0.80527488 0 1 13024899 13025017 0 0 0 0.1179075 0 0.57615852 0.29071788 0 1 13065621 13065813 0 0.14686506 0.24356547 0.57971188 0 0 0 0 1 13106344 13106654 0 0.09096159 0 0 0.97110482 0 0.9959432 0.72624719 1 14018983 14019250 0 0 0 0 0 0 0.12848206 0 1 15935702 15935801 0 0 2.51930343 0.98375349 0.71549018 2.0602032 1.73256112 2.02142877 1 16790636 16790824 0.51680017 0.44996954 0.49749542 0.5180404 0.28258056 0.24108761 0 0.5322379 1 16881902 16882300 0 0 0 0 0 0 0.17238548 0 1 17229326 17229416 0 0 0 3.55556619 0 0 0 0 1 18078584 18078783 0 0 0 0 0 0 0 0 1 19744227 19744336 4.45680879 0 1.716131 0 0 0 4.72083166 2.06547367 1 19963111 19964044 0.36447429 0 0.20049119 0.23859524 0.03796009 0.14573785 0.11030453 0.21449244 1 21249682 21249966 0 0 0 0.24494868 0 0.0797966 0 0 1 23755482 23755655 0 0 0 0.88464703 0 0 2.37951746 0.14459642 1 23856241 23856305 0 0.88119036 0 0.65217586 0.27669347 0 0 0.3908622 1 25215925 25215993 0 0 0 0 0 0 0 0 1 26185816 26186008 1.26508375 1.76238072 0.81188489 0 0.09223116 0 0 0 1 28515048 28516112 0.02282858 0.29152162 0.04395166 0 0.29957789 0.27688821 0.80603172 0.47021017

2 2228985 2229108 0 0 0 0 0.14397059 0 0 0.40675091 2 2232080 2232216 0 0.20733891 0 0.1023021 0 0 0.25224052 0.36787031 2 10241165 10241316 0 0 0.10323305 0 0 0 0 1.15964416 2 10591404 10591529 0 0 1.74587727 0.33391404 0 0 0.27443768 0.4002429 2 11812783 11812961 1.9104186 1.26732995 0.4378705 0.78163399 0 0 0 0.14053473 2 12623205 12623326 0 0.46608416 0 0 0.58540106 0.3745824 1.98457 1.2404222 2 18343591 18343672 0 0 0 0 0 0 0 0 3 129070 129620 0 0.05126926 0.08502649 0.02529652 0.16098529 0.04120406 0.1247444 0.04548215 3 1848987 1849151 0 0 0 3.05409185 2.80742642 8.1528773 9.62205283 0 3 7253885 7254003 0 0 0.39630991 0.235815 0 0.19205284 0 0 3 8048211 8048423 0 0 0.0735292 0 0 0 0 0 3 10782032 10782078 0 0 0.33887369 0 0 0 0 0 3 11023752 11023813 0 0 0 0 0 0 0 0 3 11097266 11097400 2.3564545 4.62953741 2.79191462 11.2135313 7.00406154 12.3458446 16.3843391 9.52070325 3 11101052 11101159 0 0 0 0 0 0 0 0 3 12460173 12460346 0 0 0 0 0 0 0 0 3 14095664 14095826 0 0 0 0.17176648 0 0 0.21175747 0 3 15080339 15080442 0 0 0 0 0 0 1.99833263 0 3 15796135 15796984 0.51497402 0.86354579 0.51409814 0.85215598 0.29201101 0 3.6365417 0.47142862 3 17520356 17520587 4.20599271 0 0 4.21608639 1.99315122 0 6.97974621 3.68188813 4 5323339 5323500 0 0 0 0 0 0 0.21307273 0 4 5965397 5965593 0 0 0 0.07098513 0 0 0.17502403 0 4 5987665 5988307 0.03783428 0.17568904 0.12140335 0.28172914 0.46891354 0.42359318 0.74807779 0.6234313 4 6551569 6551829 0.09342157 0.1084542 0.95927323 0.74916612 0.13621832 0.87162443 1.58329431 0.76969788 4 7089697 7089756 1.64675308 3.34553628 1.58523966 4.24467003 3.60170482 2.68873977 4.65148611 3.81587508 4 7843434 7843652 0.22284044 2.58698087 0.07150546 0.57439342 0.89354221 2.49492498 2.04569372 0.45899415 4 8088616 8088806 0 0 0 0 0 0 0 0 4 13536498 13536663 0 0 0 0.50593037 0 0 0.41581467 0.60642863 4 17413090 17413185 0 0.59364403 0.49225863 0.29290705 0.37280804 0.95419938 12.9996796 0.2633177 5 486678 486859 2.41554112 0 3.10041347 0 
1.17403638 3.75617158 7.01256504 2.62590299 5 491553 491845 1.74685536 0 2.08198427 0.04764755 1.57677374 1.94025986 5.87409419 1.88470542 5 503109 503284 2.35956191 1.61131951 2.76133651 2.70311368 4.55358395 0.12949849 2.54834989 3.71654119 5 3752051 3752582 0.32020199 0.3186225 0 0 0 0 0 0.18843828

5 7039542 7039625 0 0 0 0 0.426708 0 0 0 5 8081538 8081752 0 0 0 0 0 0 0 0 5 8233963 8234252 2.01713007 3.02470877 0.53938374 1.15541191 1.53186696 0.07841604 2.37402838 0.77901948 5 9818859 9819021 0.44980755 0 0.28867018 0.42941621 0 0.83934204 0.84702988 0.1544147 5 10807842 10807960 0 0 0 0 0 0 0 0 5 13834763 13834799 0 0 0 0 0 0 0 0 5 14089098 14089483 0.12617978 0.14648359 0.04048881 0.10841365 0 0 0 0.19492349 5 15154323 15154464 0 0 0 0 0 0 0 0.17741263 5 16602741 16603118 0.193286 0.149592 0.33078387 0.33214262 0.09394367 0.42078421 0.45496963 0.06635327 5 16995015 16995176 0 0 0 0.43208339 0 0 0.21307273 0.1553738 5 17469569 17469734 2.79698515 3.24705296 2.55079472 0 3.64899993 6.45530336 0.20790733 1.8192859 5 19198293 19199027 0.03309211 2.49710619 0.10618658 1.25104035 1.39929994 2.46999839 3.78566964 1.60178952 5 19830052 19830235 0.39819029 0.92452759 3.40725463 2.50891698 0 0 0 0.13669498 5 22804946 22805144 0 0 0 0.07026811 0 0 0 0 5 26480972 26481169 0 0 0 0 0 0.11503673 0.52240675 0


1 6721879 6722331 0.05373807 0 0 0.16121421 0 0.21495228 0.10747614 0.10747614 1 7138302 7138494 0.12650837 0.5060335 0.25301675 0 0.12650837 0 0.25301675 0.37952512 1 9573098 9573237 0 0 0 0 0 0.34949076 0 0.34949076 1 12083824 12083955 7.78750788 11.1250113 5.00625507 4.63542136 0.37083371 2.78125281 5.19167192 2.22500225 1 12097125 12097392 0.72777851 0.90972314 1.2736124 1.63750166 1.09166777 1.45555703 0 0.45486157 1 12642562 12642676 0.85226694 1.06533368 0.42613347 0.42613347 1.27840042 2.3437341 0.63920021 0.85226694 1 12693068 12693494 0.05701786 0 0 0 0.05701786 0 3.36405368 0 1 13024899 13025017 0 0.20584413 0.20584413 0 0 1.64675308 0 0.20584413 1 13065621 13065813 0 0 0 0.63254187 0 0 0 0 1 13106344 13106654 0 0.3134143 0.47012144 2.19390007 1.8021322 0 1.4887179 0.86188931 1 14018983 14019250 0 0.27291694 0 0.09097231 0 0 0.09097231 0 1 15935702 15935801 0.49069915 0.24534957 14.9663241 17.6651694 9.07793427 9.32328384 15.2116736 8.34188554 1 16790636 16790824 1.80880059 1.42120046 1.16280038 0.77520025 2.19640072 1.03360034 0 0.51680017 1 16881902 16882300 0 0 0.06102917 0 0 0 0.48823332 0.24411666 1 17229326 17229416 0 0.53976906 0.26988453 0.53976906 0 0.53976906 0.8096536 0.53976906 1 18078584 18078783 0 0 0 9.52054983 0.12205833 0 0 0 1 19744227 19744336 18.0500756 0 14.0389477 0.22284044 0 0 7.79941538 7.3537345 1 19963111 19964044 0.67688082 0.15620327 1.35376164 2.31701512 0.10413551 1.11945674 0.07810163 0.83308409 1 21249682 21249966 0 0 0 0.85526788 0.17105358 0.42763394 0.7697411 0 1 23755482 23755655 0 0 0 3.6504613 0.14040236 0 3.51005895 0 1 23856241 23856305 1.51810049 4.93382661 0 1.13857537 1.89762562 1.89762562 1.51810049 1.51810049 1 25215925 25215993 0 0 0 0 0 1.07160035 0 0 1 26185816 26186008 6.83145222 11.7652788 15.0544966 0 6.9579606 0 0 0.25301675 1 28515048 28516112 0.25111437 1.14142894 0.433743 0.09131432 2.0545721 1.5066862 1.41537189 1.25557184

2 2228985 2229108 0.19747649 0 0.39495297 0 0.78990595 0.19747649 0 0.19747649 2 2232080 2232216 0.17860006 0.71440023 0.53580017 0.71440023 0 0.17860006 0 1.07160035 2 10241165 10241316 0 0 0.16085833 0.16085833 0 0 0 9.65149983 2 10591404 10591529 0 0 0.19431686 0 0 0 0.19431686 1.94316863 2 11812783 11812961 3.00208637 1.77396013 7.50521593 2.45625249 0 0.13645847 0.13645847 2.8656279 2 12623205 12623326 0 1.40518393 2.20814617 2.81036786 1.00370281 1.20444337 3.81407066 1.00370281 2 18343591 18343672 0 0 0 0 0 0 0.89961511 0 3 129070 129620 0.22081462 0.61828093 0.66244385 0.04416292 0.08832585 0.26497754 0.04416292 0 3 1848987 1849151 0 0 0 15.6993807 6.22050934 23.8452858 22.8085343 0 3 7253885 7254003 0.20584413 0 1.64675308 1.44090894 2.26428548 1.64675308 0.82337654 0.20584413 3 8048211 8048423 0.22914724 0 0.34372087 0.57286811 0 0 0.22914724 0.11457362 3 10782032 10782078 0 0 0.52803495 1.58410486 2.11213982 3.69624468 0.52803495 0.52803495 3 11023752 11023813 0 0.39819029 0 0 0 0 0 0 3 11097266 11097400 17.5827759 16.4951815 6.16303484 43.3225096 23.0207478 21.9331534 15.2263214 16.3139158 3 11101052 11101159 0 0.22700568 0.22700568 0.22700568 0 0.22700568 0 0 3 12460173 12460346 0.14040236 0 0.14040236 0.14040236 0.14040236 0.14040236 0 0 3 14095664 14095826 0 0 0 2.09910192 0 0 0.74967926 0.14993585 3 15080339 15080442 0 0 0.23582144 0.23582144 0 0 6.60300021 0 3 15796135 15796984 2.77513777 3.14706345 3.89091481 11.5296961 4.00535348 0 14.4478822 2.34599275 3 17520356 17520587 16.5085214 0 0 43.2165751 18.8218174 0 21.7660123 21.6608625 4 5323339 5323500 0 0 0 0.30173426 0 0.30173426 0.15086713 0.15086713 4 5965397 5965593 0.12392657 0 0.37177971 0.37177971 0 0 0.74355943 0 4 5987665 5988307 0.87018845 0.41617708 1.51337121 2.30789109 2.61056534 1.58903977 4.0104337 2.00521685 4 6551569 6551829 0.56052941 2.24211765 7.66056865 14.2000785 0.65395098 6.44608825 3.83028432 6.72635296 4 7089697 7089756 9.88051847 9.05714193 30.0532437 48.5792158 
46.5207745 25.1129844 33.3467498 20.9961018 4 7843434 7843652 1.1142022 4.45680879 1.55988308 4.45680879 5.57101099 7.9108356 5.45959077 2.89692571 4 8088616 8088806 0 0 0 0 0 0 0 0.38352012 4 13536498 13536663 0 0 0 3.09140464 0 0 0.88325847 1.03046821 4 17413090 17413185 0.51136017 3.068161 11.2499237 7.41472241 5.36928175 14.8294448 0 3.068161 5 486678 486859 14.2248533 0 11.5409187 0.13419673 6.30724625 8.72278737 15.4326238 9.52796774 5 491553 491845 3.6600779 0.08318359 5.24056609 0.24955077 2.99460919 6.07240198 6.40513633 3.32734355 5 503109 503284 7.21748349 5.96830366 10.1322364 10.5486297 17.0721244 0.27759552 16.7945289 4.99671934 5 3752051 3752582 1.50952366 3.84242385 0.64040398 0.7776334 0 0.22871571 4.3455984 0.86911968

5 7039542 7039625 0 0 0.29264588 0.29264588 5.56027169 0.29264588 0 0 5 8081538 8081752 0 0.22700568 0.68101704 0 0.11350284 0.5675142 0.11350284 0.34050852 5 8233963 8234252 8.15256736 10.5058858 7.22804941 13.699675 9.07708531 0 6.89186107 7.39614359 5 9818859 9819021 2.24903777 0 1.19948681 3.14865288 3.74839628 1.64929436 1.49935851 1.19948681 5 10807842 10807960 0 0 0 0 0 0.20584413 0.20584413 0 5 13834763 13834799 0 0 0 0 0 0.67471133 0 2.69884532 5 14089098 14089483 0.18926967 0.37853934 0.94634836 0.37853934 0.31544945 1.07252814 0 0.75707869 5 15154323 15154464 0.51680017 0 0.17226672 0 0 0.17226672 0 0 5 16602741 16603118 0.45100068 0.70871535 1.28857336 0.77314402 0.45100068 1.6107167 0.70871535 0.25771467 5 16995015 16995176 0 0 0 1.05606991 0.45260139 0 0.60346852 0 5 17469569 17469734 11.1879406 11.3351504 37.0968557 0.14720974 20.1677351 52.8482984 1.91372668 15.7514427 5 19198293 19199027 0.23164476 5.29473742 0.3640132 8.43848776 5.46019796 9.92763266 6.28750069 4.26888204 5 19830052 19830235 0.66365049 2.38914176 21.9004661 16.8567224 0.2654602 0.2654602 0 0 5 22804946 22805144 0 0.12267479 0 0.61337394 0 0.36802436 0 0 5 26480972 26481169 0.1232975 0 0 2.95914005 0 1.84946253 3.69892506 0.86308251

Table S13. Estimation of Epimutation Rate using Tree puzzle

Node  Generations From Founder  Branch Length From Tree Root (changes / site)  Expected Number of Changes (changes)  Rate of Change (changes per generation)
12    3                         0.01385                                        15361.4782                            5120.492733
19    3                         0.01482                                        16437.33624                           5479.11208
1     3                         0.01556                                        17258.09392                           5752.697973
49    31                        0.02217                                        24589.45644                           793.2082723
59    31                        0.02647                                        29358.72404                           947.0556142
29    31                        0.02767                                        30689.68244                           989.9897561
119   31                        0.02933                                        32530.84156                           1049.381986
69    31                        0.03327                                        36900.82164                           1190.349085
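The columns of Table S13 are related by simple arithmetic: the expected number of changes is the branch length (changes per site) times the number of sites, and the rate is that number divided by the generations from the founder. The sketch below reproduces this. The site count of 1,109,132 is not stated in the text; it is inferred here by dividing the tabulated expected changes by the branch lengths, so treat it as an assumption.

```python
# Rows copied from Table S13: node -> (generations from founder, branch length in changes/site).
table_s13 = {
    "12": (3, 0.01385),
    "19": (3, 0.01482),
    "1": (3, 0.01556),
    "49": (31, 0.02217),
    "59": (31, 0.02647),
    "29": (31, 0.02767),
    "119": (31, 0.02933),
    "69": (31, 0.03327),
}

# Assumption: number of sites implied by the table (expected changes / branch length).
N_SITES = 1_109_132


def expected_changes(branch_length):
    """Expected number of changes along the branch from the tree root."""
    return branch_length * N_SITES


def rate_per_generation(branch_length, generations):
    """Expected changes divided by the number of generations from the founder."""
    return expected_changes(branch_length) / generations


for node, (generations, branch_length) in table_s13.items():
    print(
        f"{node:>3}  {expected_changes(branch_length):.4f}  "
        f"{rate_per_generation(branch_length, generations):.4f}"
    )
```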

Table S14. Chi-square test for enrichment of DMRs in specific genomic contexts

CG DMRs

             Basespace  Expected Percent  Expected Count  Observed Percent  Observed Count  P-value
Genes        60452741   50.80%            1262.395474     60.50%            1504            2.2E-16
Intergenic   953533     0.80%             19.91201265     36.20%            899
Transposons  57593726   48.40%            1202.692514     3.30%             82

nonCG DMRs

             Basespace  Expected Percent  Expected Count  Observed Percent  Observed Count  P-value
Genes        60452741   50.80%            145.29          20.07%            57              2.2E-16
Intergenic   953533     0.80%             2.29            49.65%            141
Transposons  57593726   48.40%            138.42          30.28%            86
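The expected counts for the CG DMR rows of Table S14 can be recovered by distributing the observed DMRs across contexts in proportion to each context's basespace. The sketch below does this and computes a Pearson chi-square statistic by hand. It is a sketch under assumptions: the text names the test but not its exact form, and no p-value is computed here, since that would need a chi-square distribution function from a statistics library.

```python
# Values copied from the CG DMR rows of Table S14.
basespace = {"Genes": 60452741, "Intergenic": 953533, "Transposons": 57593726}
observed = {"Genes": 1504, "Intergenic": 899, "Transposons": 82}

total_bp = sum(basespace.values())
total_dmrs = sum(observed.values())

# Expected count: total DMRs times the fraction of basespace in each context.
expected = {ctx: total_dmrs * bp / total_bp for ctx, bp in basespace.items()}

# Pearson chi-square statistic; degrees of freedom = number of contexts - 1.
chi_square = sum(
    (observed[ctx] - expected[ctx]) ** 2 / expected[ctx] for ctx in observed
)

for ctx in observed:
    print(f"{ctx:<12} expected = {expected[ctx]:.4f}  observed = {observed[ctx]}")
print(f"chi-square = {chi_square:.1f} with {len(observed) - 1} degrees of freedom")
```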

Table S15. Primers sets for amplicons spanning C-DMRs

Chromosome Coordinate F primer R primer

chr1 6721879 CCGCAACGATATTTTGTTTGTAATGCTTGT TCTCGTAGATAAAGTAGTCGACAT

chr1 7138302 ATCAAATATGTCTCCTTTGACCAAAGACCAAGAT ATATCACATAAGTTATAACTTATT

chr1 9573098 GAAGGTTGAAAAGTTAATATCGGTTGACGTAA ACGGGGTAGAATACTTGTATATTA

chr1 12083824 TAACAATCTCAATAAATGTCTTTTGCAACT CAAGTTTTTATGGGATTTAGATATT

chr1 12097125 TATGGTATATTCCTTACAAATATTATAGTT GCTGCAAAAATGTATATGTAGATT

chr1 12642562 CCATTGAGAAACGCATTAGAAATATCAAGCTGA TACTTAATCATCACATTTATCGA

chr1 12693068 ACTTTTAAGTTGTTCATATCTTTGAAGTT TTGGAGGTGAAATTTAGACAAAT

chr1 13024899 GAATAAAAAGGTTGACGAATATGTTTTCCAA TCCCACTACGCTCGCAAATGCAT

chr1 13065621 TCTTACAAGAGATTGATAAATAACATA CTCTGATAGCACTTCATCGGTTAA

chr1 13106344 ATGGTAGTTTGGAGTTTTTCTTAT ATCTTTATTTGTATAGATTTGTCTGA

chr1 14018983 TCATTAAAATCATTATTTCGCCAA GAGAAATTTCACAAACCGGAGATA

chr1 15935702 GTTAGTTGGGCTTTGGTCCTATTGGT TTGGAGGGGTTAAAAACGGTTGTGTT

chr1 16790636 ACAACATTAACATTTTAGTTGATA GAATTCAAATAATACGTAACGTAT

chr1 16881902 CAAATTTAAAACTTCATTAGTCCACT AACATAATTCATACTTGGCTATTGAT

chr1 17229326 ATGGAATACCAATCGGAGTAATT GATTGGCGGTTCTAATGAGCGTAGAT

chr1 18078584 GAGTATATTTCATACCTAATTAT TCATAAGTAATATCGACCTTTCCA

chr1 19744227 TCATACGATTACAAATCCAATAAGA GTGTTAAGTGTCATGACCAAGAGT

chr1 19963111 AACCGAAATTCACTGTATTTTTCCA GGTGGCTCAGTGTTAAGAAGGTACA

chr1 21249682 CGTCATTAAACACGTGTTAGCTGA AAGACTAGGATTTGGAGCGACCACGAT

chr1 23755482 AGATATTTATAGTTTTCTTATAATTATCT ATTCCAAAGTTTAAGATTGAATTCGGT

chr1 23856241 TTCAGTGTTCAGCATGTTTACACTA TCTGTTTATTCAACATAATCTTACGT

chr1 25215925 ATAGTGGTGTATATCACATAACTA CTTTGGTCTGAGACTCTGAGTCGT

chr1 26185816 GCTCGACCGTGACCCGATCATA TACCAAGCTGGTAAAAGTATATAGAT

chr1 28515048 TTCTAGCCGAATTGAACCGAACAAAAAT TAGGGTAAACTCTTCTTATTTCTCT

chr2 2228985 CGAATACTCTGAATCCGGCTTTGA TAATTTAGGGCGTGAAACAAGCTGGT

chr2 2232080 CTATATACTTGCAACAAAGCAAATTA GTTGTAATTGTAATTAAAATTATAA

chr2 10241165 CCATAGTATGTTTGCTTATATTGTGA TTAATCTTTAGGCCAAAGCTTTTAAA

chr2 10591404 AAGTAACCAATTGTGCATAAGCAAAT GCAAATACAACATGATTACTACCGGA

chr2 11812783 GTTAACGTTTGTTTATTTGTTTAT TCAACTGTAACGAAATGTGCTTTAT

chr2 12623205 CTCGAGAGAGAACCAAAGATGATTTGA TGATAAGGGAGTTCAATTCAATTTTA

chr2 18343591 CAAAGACGGATGTAAGAGATAT GTTTCATATTCACTGGGCTTTTCTATT

chr3 129070 CAGAGAAAAATCTCACATCGTTACGA TGAAAACAGACCTGGGCCTTAGAGGATA

chr3 1848987 AGGTGGGGATTTGGTGAAGTGAGGCCT TAGAGTCACATGATTACAAAGGCCGA

chr3 7253885 ATGTAATATATGAAATTATAGTGCTTTAA TATAATAGGATTGGTACAAAGGTGGT

chr3 8048211 AAAGCATCTTCAGATTCAACAAACTGA GAATTTTAGGGTTTATTTGATGAGAGTCA

chr3 10782032 ATGTGATAAAACGTACATATAATTCA AGGAAGATGGAGGCGGATCCGCCATTGAA

chr3 11023752 CCGTATTAATTTAGATATATCCTTAA TTCCTACCCATACAATGAGAGAATGCTTA

chr3 11097266 TAATGTTATCTCCGTTGAGCGTTGTAGA GAACTTATTCTAATCAACGGATTCATAT

chr3 11101052 ACAAATGGGAATCAGAAACAAAAACATGA AGATCATAATTTCCAACAAAAATTTAT

chr3 12460173 TACGACGAGGATATCATTGTGAATTA GAGTGTGTCAATCGACACTCTCGACTTGA

chr3 14095664 AGACCACCGCCGTTGTCTAGAACAACGA TAGAGTGTTAGTTACCTAATTATAATAT

chr3 15080339 ATCGGTCATAAATCTTGAAGTACTGCT CTTCTCAAAATTCAAACCTCCACTCCAA

chr3 15796135 TCACCTAACTTGTGCCCCAACTTTA TGGACTTTATCTAGAAGATTTTCAACT

chr3 17520356 ACTAACAGTACAACGGCGGCGCCAAA TAAATGAAGTTACATGTCATTTAGCT

chr4 5323339 ATGTAGGCACAACATGTACTATA GTATTTCTTAGTTGATATTTGTTTCAA

chr4 5965397 GATCAACTTATCAATTTCTATACATA ACCTGATGTCGCCACCTCCTTAGCT

chr4 5987665 TGGTATCAGAGCTATTATACCTTA TGCCGGTTTCATCAGAGACACGGAGA

chr4 6551569 AAGAAGAAGTATTGTGATATAATCTGA GACATATTATATAATACGGAATATGA

chr4 7089697 GTTTAGTTTGATGTCGACTTGGTTCA CAATTTACATAGGTTTCAGGTTTGTATG

chr4 7843434 TAAAGCGAAGGTTAGATCTGCTTTT TATGTGAACCACTGAAAGACATGA

chr4 8088616 TGGAATGGGATTGAACTTGAAGGCA CAAATGATCTTTATCATTTTCGTGT

chr4 13536498 TGGTTCAAGTTCGGAATTGAT ACTAAATAATATAGGGATAGATATCT

chr4 17413090 TGTTTCAAAAATTGTATATACGAGGT ATAAGAAGCCCATGCTACTGACATGCGT

chr5 486678 AGAAGGAGACTAATTGGGATTAAAACA ATTAATTATATCTCGGCCTTTGGGCTTT

chr5 491553 AGACTAATTGGGATTAAAACAAATA TATGGGCCAATTCGGACGGCTCTCGCTT

chr5 503109 GAGATACCCATTAAAGATCGTGTTTTCA AGATGATTAGTTTCAAAATTTGAGTGAT

chr5 3752051 AAGTTTGATAGCGGGCCATGCCATTA TATGCTTCCGAGGATGTGTTTGAGA

chr5 7039542 TCACATTTAGAGATAAATATTAAAATTGT GATGAATTCGTGACAAGGGGTCAACTAT

chr5 8081538 AGATTGACTTGAAGGGTTAATTTTGCA ACCATGAATTTAGTTACCTGACATCGGT

chr5 8233963 GATACAATTCATCATCCATGATCGGAA CATTTACACCAATTGTAAACAAAGA

chr5 9818859 AATATCTCGGTAATATTTATTAGGTTA ATACACAGTAACGTGTGTCAGAGACGA

chr5 10807842 TTGATTTTGTTTCCTGGTGAGTGCCTA AGAAGAAAGATATTAACTTGGCAG

chr5 13834763 TTTTGTGGACAGTTACGTCGCACTA TGATCAATACCGAAATTTCCCAACCAT

chr5 14089098 TTGCTGGATTCGCCAAGGGAATTCA TGAGAACGTCGACGGCACACGACGAGCT

chr5 15154323 ACCCAAACCGATATTTAATCCTGTTT TCATCAGATGGATCAATCTAGCTAAT

chr5 16602741 TGAGAAACCAAGCAGCCATGGAATCCCT ATTTCAGCCTCAATCTTAGTTGTAAT

chr5 16995015 CTTTGACGGCGGTAGGTTTGTCACCGGA ACATCATTATGTAGCGCACAATGCAGA

chr5 17469569 AGATAAGTATCCACATATATACC CTTGGATCAAAGAGATAGCAAACCTAA

chr5 19198293 ACAAATTAACAATTATCAAACTAGCCAA CATGCTCCACGACACCACTTTGGTCT

chr5 19830052 GGATAACATCCTTCATAATGATGGA ACAAATTAAATTTAGTTAAGATA

chr5 22804946 CGGCCCTGATTTTAGCCATAGT TCATCAAAGTCAATATTAGTAGCTTAAT

chr5 26480972 TCGTACCCTATCCTACATCTCTCTA TCCCTTGATAGAGTCGGTTGATCGGGT

Table S16. qRT-PCR primer sets used in this study

Gene Sequence

AT5G24250 CCGTGTGGAGAAGTTCACGGCG ACAGCGGCGAATTTTGTGGGGA

AT5G24240 ACCCGTGCAGGGACTTGGTCT TCCAGGGCTGGACTCAACGCT

AT1G53480 AGAAATTCCGGCGGCGGCTC TGCTGCCGTCGTCGAGAGTCA

AT1G53490 ATCTCCGCTCGCCGTCGACT TTGGTGGTCGGTTTCGGCCG

AT3G22770 GCGTGTCTCTCAAGGGAAATGCCT AACGGAAGAGGCAAAAGTGGTCCAA

AT3G01345 TAGAGCCGTTTGCTCGCCGG GCGAAACATGGGAACACGGGAGA

AT4G14548 TGGTCGTTCCAACCATGCCAGAG ACGGCGGTGAGACAAACCAACAC

AT2G44450 GCAAAGGATGTTCCCTGCTCCACC AGCCAATCTGATGCAGCCTTTGGAC

AT5G66300 CTGGAAAGCAACGGGCCGGG GGCCATTAGGCGCTCGACCTC

AT5G02320 TGTGGTTGCAGTCCCGGTGC CGCTGCCAAAACAACCTTTCGCC

AT5G02370 TGGTGGATTCGAAAACCCCTGCG GCCGCCGATTACGGAAACGC

AT5G42500 ACCGCCGTCAAAGTTGCGGA ACTCTCCCGCCGTAAAAGCCA

AT5G43500.1 GCCATTGCAGCGGGCTTACG CGGCCCGTCGTAGCCTTGGA

AT5G43500.2 TGCTCCAGCGAAGAACTTAGCGA TCTGAGTGGGGGCAACAGTTTTCA
