Accepted Article This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as an 'Accepted Article', doi: 10.1111/tpj.12726 This article is protected by copyright. All rights reserved. Received Date : 24-Sep-2014 Revised Date : 03-Nov-2014 Accepted Date : 06-Nov-2014 Article type : Original Article Whole-genome DNA methylation patterns and complex associations with gene structure and expression during flower development in Arabidopsis Hongxing Yang 1, 2, ‡ , Fang Chang 1, ‡, * , Chenjiang You 1, 3 , Jie Cui 1 , Genfeng Zhu 1 , Lei Wang 1, 3 , Yu Zheng 4 , Ji Qi 1, * , Hong Ma 1, 3, * 1 State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Institute of Plant Biology, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai 200433, China 2 Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai 201602, China 3 Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China 4 New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA These authors contributed equally to this work. * For correspondence: Hong Ma (Phone: 86-21-51630532; e-mail: [email protected]) Ji Qi (Phone: 86-21-51630534; e-mail: [email protected])
Whole-genome DNA methylation patterns and complex associations with gene structure and expression during flower development in Arabidopsis

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as an 'Accepted Article', doi: 10.1111/tpj.12726 This article is protected by copyright. All rights reserved.

Received Date: 24-Sep-2014
Revised Date: 03-Nov-2014
Accepted Date: 06-Nov-2014
Article type: Original Article

Whole-genome DNA methylation patterns and complex associations with gene structure and expression during flower development in Arabidopsis

Hongxing Yang1, 2, ‡, Fang Chang1, ‡, *, Chenjiang You1, 3, Jie Cui1, Genfeng Zhu1, Lei Wang1, 3,

Yu Zheng4, Ji Qi1, *, Hong Ma1, 3, *

1State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for

Genetics and Development, Institute of Plant Biology, Center for Evolutionary Biology,

School of Life Sciences, Fudan University, Shanghai 200433, China

2Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai

201602, China

3Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering,

Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China

4New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA

‡ These authors contributed equally to this work.

* For correspondence:

Hong Ma (Phone: 86-21-51630532; e-mail: [email protected])

Ji Qi (Phone: 86-21-51630534; e-mail: [email protected])

Fang Chang (Phone: 86-21-51630534; e-mail: [email protected])

Running Title: Arabidopsis floral DNA methylomes and transcriptomes

Keywords: cytosine methylation, Arabidopsis DNA methylome, MspJI, RNA-seq, flower

development

SUMMARY

Flower development is a complex process requiring proper spatiotemporal expression of

numerous genes. Accumulating evidence indicates that epigenetic mechanisms, including

DNA methylation, play essential roles in modulating gene expression. However, few studies

have examined the relationship between DNA methylation and floral gene expression on a

genomic scale. Here we present detailed analyses of DNA methylomes of single-base

resolution for three Arabidopsis floral periods: meristems, early flowers, and late flowers. We

detected 1.5 million methyl-cytosines and estimated the methylation levels for 24,035 genes.

We found that a large number of cytosine sites were methylated de novo from the meristem to the

early flower and many sites were demethylated from early to late flowers. A comparison of

the transcriptome data of the same three periods revealed that the methylation and

demethylation processes were correlated with expression changes of >3,000 genes, many of

which are important for normal flower development. We also found different methylation

patterns for three sequence contexts (mCG, mCHG, mCHH) and in different genic regions,

potentially playing different roles in gene expression.

INTRODUCTION

Flowers are angiosperm reproductive structures and develop from the floral meristem. The

cells derived from the floral meristem undergo division and differentiation to form four types

of flower organs, including the reproductive organs, stamens and the pistil. Meiosis generates

haploid spores that then develop into pollen grains and embryo sacs, containing sperm cells and the egg cell,

respectively, for fertilization and seed production. Flower development requires the normal

function of receptor-like protein kinases (RLKs) and ligands, transcription factors, enzymes,

and other molecules (Ma 2005, Ge et al., 2010, Chang et al., 2011). Epigenetic mechanisms

play essential roles by modulating the expression of numerous genes through histone

modification, chromatin remodeling, microRNA, and DNA methylation. Studies in the past

decade revealed that these epigenetic pathways are important for normal gene expression,

and that DNA methylation also contributes to genome stability (Chan et al., 2005, Law and

Jacobsen 2010, Gan et al., 2013).

Although conserved across eukaryotes, DNA methylation in plants has several unique

features regarding the pattern of methylation, the methylation machinery, and demethylation

enzymes in non-dividing cells (Chan et al., 2005). For example, mammalian DNA

methylation occurs mostly at CG sites and is catalyzed by the DNA methyltransferase Dnmt1 and its homologues. In

contrast, plant DNA methylation occurs at CG, CHG (H=A/T/C) and CHH sites; in

Arabidopsis thaliana, methylation at these three types of sites is carried out, respectively, by

DNA METHYLTRANSFERASE 1 (MET1), the plant specific CHROMOMETHYLASE 3

(CMT3), and DOMAINS REARRANGED METHYLTRANSFERASEs (DRMs) (Chan et

al., 2005). Each of the three types of DNA methylation is crucial for development and

response to environmental stresses (Bird 2002, Chan et al., 2005, Goll and Bestor 2005, He et

al., 2011, Jullien et al., 2012, Song et al., 2013).
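
The three sequence contexts defined above (CG, CHG and CHH, with H = A/T/C) can be told apart from the two bases downstream of a cytosine. The following is a minimal sketch, not the authors' pipeline; the function name `classify_context` is hypothetical, and it only handles cytosines on the forward strand of the sequence it is given.

```python
def classify_context(seq, i):
    """Classify the cytosine at 0-based index i as 'CG', 'CHG' or 'CHH'.

    Returns None when seq[i] is not a cytosine or when the sequence ends
    before the context can be decided. Any base other than G (including
    an ambiguous N) is treated as H, which is a simplifying assumption.
    """
    seq = seq.upper()
    if i < 0 or i >= len(seq) or seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]      # up to two bases 3' of the cytosine
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None                    # too close to the end to decide
    return "CHG" if downstream[1] == "G" else "CHH"
```

For a cytosine on the reverse strand, the same function could be applied to the reverse complement of the sequence.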

Consistent with the role of DNA methylation in genome stability, a substantial portion of the

methylated cytosines in Arabidopsis are found in genomic regions composed of repetitive

sequences such as transposable elements (TEs) (Arabidopsis Genome Initiative 2000, Martienssen and Colot 2001,

Lippman et al., 2004, Chan et al., 2005). Exogenous repetitive sequences such as transgenes

can also be methylated and can induce methylation and consequent transcriptional silencing

of homologous sequences in the transgenic lines (Mette et al., 2000, Soppe et al., 2000,

Zilberman et al., 2004, Chan et al., 2005). Case studies and genome-wide analysis in

Arabidopsis indicated that DNA methylation in promoter regions is often associated with

transcriptional gene silencing (Park et al., 1996, Stam et al., 1998, Jones et al., 1999, Zhang

et al., 2006, Zilberman et al., 2007). Although genome-wide DNA methylation has been

investigated for vegetative tissues, little is known about the patterns of DNA methylation and

their association with gene expression at different stages in plant development. More

specifically, the relationship between DNA methylation and gene expression during flower

development has not been reported.

Whole-genome bisulfite sequencing (BS-seq), the gold standard method, has been applied in

many studies of DNA methylomes (Feng et al., 2010, Zemach et al., 2010), including those of

Arabidopsis (Cokus et al., 2008, Lister et al., 2008) and tomato (Zhong et al., 2013). However, it

requires deep sequencing coverage for confident DNA methylation calling and is thus not cost-efficient.

The recently identified methylation-dependent endonuclease MspJI has both low

specificity in recognition sites and fixed cut distances: it recognizes 5-methylcytosine or 5-

hydroxymethylcytosine in the context of CNN(G/A) and cleaves both strands at fixed

distances (N12/N16-17, mainly 16 bp) away from the modified cytosine at the 3’-side; these

properties not only increase the number of detectable mCs, but also enable the determination

of mCs at the single-base resolution (Zheng et al., 2010, Cohen-Karni et al., 2011, Horton et

al., 2012, Huang et al., 2013). In Arabidopsis, 48.8% of all cytosines and guanines are part of

CNNR/YNNG sites, among which 90.2% of methylated sites could be detected by the MspJI-

seq procedure employed here.
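
To illustrate which cytosines fall within reach of MspJI, the sketch below scans a forward-strand sequence for CNNR sites (cytosine on the forward strand) and YNNG sites (the same motif read on the reverse strand, where the forward-strand G pairs with the candidate cytosine). The function name `find_cnnr_sites` is hypothetical, and the code only enumerates candidate sites; it does not model the cleavage offsets or read mapping used in MspJI-seq.

```python
def find_cnnr_sites(seq):
    """Return (position, strand) pairs for cytosines in a CNNR context.

    Positions are 0-based indices into the forward-strand sequence.
    '+' means the C itself is at that index with R (A/G) three bases
    downstream; '-' means a G at that index with Y (C/T) three bases
    upstream, i.e. a CNNR site on the reverse strand.
    """
    seq = seq.upper()
    sites = []
    for i, base in enumerate(seq):
        if base == "C" and i + 3 < len(seq) and seq[i + 3] in "AG":
            sites.append((i, "+"))
        if base == "G" and i - 3 >= 0 and seq[i - 3] in "CT":
            sites.append((i, "-"))
    return sites
```

For example, `find_cnnr_sites("CAAG")` reports the C at index 0 on the forward strand and the G at index 3 as a reverse-strand site, whereas `find_cnnr_sites("CAAT")` reports nothing.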

Here, we present detailed analyses of DNA methylomes during Arabidopsis flower

development, using methylation-dependent MspJI DNA digestion combined with high-

throughput sequencing (MspJI-seq). We analyzed three periods during flower development:

the meristem, early flower development (organogenesis), and late flower development

(maturation) and detected many more methylated cytosines in the second period than in either the

first or third period, suggesting de novo methylation at many sites during organogenesis,

followed by demethylation at many sites. These likely developmental stage-dependent

methylation and demethylation activities are correlated with the changes in the expression

levels of over 3000 genes, including many genes important for flower development.

Moreover, the methylation patterns and the potential influences on transcription vary

significantly across sequence contexts and genic regions. Our study provides valuable

insights into the possible functions of DNA methylation during floral development.

RESULTS

The DNA methylation landscape during Arabidopsis floral development

To survey the DNA methylation and gene expression during flower development, we sampled

wild-type early flowers of stages 1-9 (E), late flowers of stages 10-12 (L), and the meristems

(M) from ap1 cal double mutant plants, as this mutant is arrested at the inflorescence

meristem stage (Bowman et al., 1993, Ferrandiz et al., 2000), and has been used previously

as a source of meristems for transcriptomics (Gomez-Mena et al., 2005, Wellmer et al., 2006,

Kaufmann et al., 2010). Using high-throughput sequencing of the DNA fragments from

MspJI digestion (Table S1), we identified 1,565,127 cytosines that were potentially

methylated during flower development. We randomly selected 9 regions of 8 genes that

contained identified mC sites to conduct real-time PCR-based validation and the results

confirmed the methylation status detected by MspJI-seq (Figure S1, Table S2). To further

validate the technical repeatability of our MspJI-seq data, we obtained methylation data for

44 randomly selected genes, including the 8 genes mentioned above, and 39 randomly

selected genomic regions, by performing BS-seq experiments on the same tissues; we

observed good consistency between MspJI-seq and BS-seq (Figure S2).

Among the mC sites we identified, 453,066 were in the CG dinucleotide context, including

207,115 that were previously detected in Arabidopsis seedlings (Cokus et al., 2008). In

addition, we detected 425,428 mCHG and 685,100 mCHH sites (Figure 1a); these mC sites

accounted for 17.5%, 13.7% and 5.2%, respectively, of the three types of cytosine sites (in

the CNNR context) that could potentially be digested and detected by MspJI-seq. Among

detected methylation sites, ~73% (1,141,758) were in exons, 3% (46,208) within introns, 8%

(125,325) in putative promoter regions (1kb region upstream of transcription start sites), and

16% (250,303) in intergenic regions. Interestingly, we found a slightly higher percentage of

mCHH sites (17%) in intergenic regions, compared with mCG (15.2%) and mCHG (15.0%)

sites (chi-squared test P<1e-60) (Figure 1b), indicating that symmetric and non-symmetric

methylation sites were not evenly distributed between genic and intergenic genomic

sequences.
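
The chi-squared comparison of intergenic proportions can be sketched as a test of independence on a 3 x 2 table (sequence context by intergenic/non-intergenic). The per-context intergenic counts are not given in the text, so the sketch below reconstructs approximate counts from the reported totals and rounded percentages; these reconstructed counts are an assumption, and the resulting statistic is only indicative. With 2 degrees of freedom the chi-squared survival function is exactly exp(-x/2), which avoids any external library.

```python
import math

def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for a 2-D table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Totals per context are from the text; intergenic counts are approximated
# from the rounded percentages (15.2%, 15.0%, 17%), which is an assumption.
totals = {"mCG": 453_066, "mCHG": 425_428, "mCHH": 685_100}
intergenic_frac = {"mCG": 0.152, "mCHG": 0.150, "mCHH": 0.170}

table = []
for context, n in totals.items():
    intergenic = round(n * intergenic_frac[context])
    table.append([intergenic, n - intergenic])

chi2, dof = chi2_independence(table)
p_value = math.exp(-chi2 / 2)  # exact only because dof == 2 here
```

Under these approximations the P value falls far below the 1e-60 threshold quoted in the text.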

To obtain an overview of the detected DNA methylation, we compared the methylation levels

of each 100kb window throughout the genome in different sequence contexts (mCG, mCHG,

mCHH) and genic regions (exon, intron, promoter or intergenic). First we normalized the

methylation level as reads per kilo-base of cytosines of CNNR sites (each site counts as 1 bp)

per million of mapped reads (RKCM) (see Materials and methods). We found that

heterochromatic regions (centromeres and pericentromeric zones) had relatively high

methylation levels for all three sequence contexts (Figure 1c), and highly stable

methylation status between the three floral developmental periods, consistent with the high

frequency of transposable elements (TEs) in these regions (Figure 1d and Figure S3). Hence,

methylation at the heterochromatic regions is least correlated with development, reminiscent

of the discovery that DNA methylation around centromere and pericentromeric regions varies

least among different Arabidopsis populations (Schmitz et al., 2013). In contrast,

methylation levels in euchromatin zones were relatively low but more variable between

different mC sites: mCG sites were at higher levels than mCHG and mCHH, consistent with

previous findings (Cokus et al., 2008, Lister et al., 2008); lower percentages of mCG sites

were stage-specific than the mCHG and mCHH sites; furthermore, exons were most highly

methylated, particularly for mCG sites.
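
The RKCM normalization described above (reads per kilobase of CNNR cytosines, each site counted as 1 bp, per million mapped reads) can be written by analogy with RPKM. This is a sketch under that reading of the definition; the function and parameter names are hypothetical, and the authors' exact implementation is described in their methods section.

```python
def rkcm(reads_in_region, cnnr_sites_in_region, total_mapped_reads):
    """Reads per kilobase of CNNR cytosines per million mapped reads.

    reads_in_region: MspJI-seq reads supporting mC sites in the region
    cnnr_sites_in_region: number of CNNR cytosines (each counted as 1 bp)
    total_mapped_reads: all mapped reads in the library
    """
    if cnnr_sites_in_region <= 0 or total_mapped_reads <= 0:
        raise ValueError("site count and library size must be positive")
    kilobases = cnnr_sites_in_region / 1e3
    millions = total_mapped_reads / 1e6
    return reads_in_region / (kilobases * millions)
```

For instance, 50 reads over 100 CNNR sites in a library of 2 million mapped reads gives an RKCM of 250.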

Difference in methylated sites between developmental stages

The number of mC sites increased by 8% from meristems (1,000,123) to early flowers

(1,080,179), then decreased slightly in late flowers (1,074,245). The increasing trend was true

for all sequence contexts (mCG, 6.4%; mCHG, 7.2%; mCHH, 9.8%). The mCs detected

in early flowers but not in meristems (mCG, 96,708; mCHG, 95,240; mCHH, 178,087)

significantly outnumbered the mCs found in floral meristems but

not in early flowers (mCG, 77,052; mCHG, 74,861; mCHH, 138,066) (Figure 1e-g), indicating

extensive new methylation in early flower development. The newly methylated sites in early

flowers are associated with a large number of genes, among which 6,570 genes contained

mCG, 4,570 contained mCHG, and 5,602 contained mCHH sites (supported by 5 reads or more).

From early to late flowers, only the number of mCG sites increased slightly (by 2.1%),

whereas the numbers of mCHG and mCHH sites both decreased slightly (by 1.7%) (Figure

1e-g).

The proportion of tissue-specific mC sites seemed to also vary across different sequence

contexts. In particular, 144,299 (31.8%) of mCG sites, 147,328 (34.6%) of mCHG sites, and

301,267 (44.0%) of mCHH sites were detected as specific to one of the three floral tissues. On

the other hand, 32.7% of mCHH sites were detected in all three tissues, compared with 43.4%

and 46.7% of mCHG and mCG sites, respectively. Therefore, the number of

mCHH sites varied more between tissues than the number of mCHG sites, which in

turn varied slightly more than the number of mCG sites (Figure 1e-g).
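
The stage comparisons in this section (sites gained from meristem to early flower, sites specific to one tissue, sites shared by all three) amount to set operations on the mC positions called in each tissue. The sketch below assumes each tissue's calls are available as a Python set of site identifiers; the function name and the dictionary keys are hypothetical.

```python
def partition_sites(meristem, early, late):
    """Summarize overlap among mC site sets from the three floral tissues."""
    specific = (
        (meristem - early - late)
        | (early - meristem - late)
        | (late - meristem - early)
    )
    return {
        "total": len(meristem | early | late),
        "shared_all": len(meristem & early & late),
        "tissue_specific": len(specific),
        "gained_M_to_E": len(early - meristem),   # in early flowers, not meristems
        "lost_M_to_E": len(meristem - early),     # in meristems, not early flowers
    }
```

Site identifiers could be, for example, (chromosome, position, strand) tuples, so that the same position on opposite strands is counted separately.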

Distinct gene methylation patterns during floral development

To investigate the relationship between gene expression and DNA methylation during floral

development, we located the positions of mC sites relative to individual genes. In total, we

found 24,035 genes with at least one mC site supported by 5 reads or more (Table S3).

Regarding the three sequence contexts, we found 20,569, 17,409 and 18,746 genes containing

at least one mCG, mCHG and mCHH site, respectively. The overall methylation at gene level in

Arabidopsis flowers could be compared with recently published DNA methylome data

(Schmitz et al., 2013) from mixed stages of Arabidopsis flowers, within which >20,000 genes

were detected to have at least one mC site. Over 83% of these genes could also be found in

our list of methylated genes, representing a high level of consistency even when different

ecotypes have been used in these two studies (Col-0 for Schmitz et al. and Landsberg erecta

for this analysis).

We then divided genes into different classes according to their coding potential, namely

protein-coding, microRNA (miRNA), other non-coding RNA (ncRNA), pseudogenes, and

transposable element (TE) genes, and observed distinct methylation patterns for different

classes of genes. About 70% of annotated protein-coding genes (19,182 of 27,416) were

methylated in one or more of the floral tissues, forming the largest group of methylated genes

(79.81%) (Figure S4). Only 15% of these protein-coding genes were methylated specifically

in one of the three tissues (meristem, 1518, ~7.9%; early flower, 667, ~3.5%; late flower,

715, ~3.7%) (Figure 2a, b). Furthermore, protein-coding genes were more likely to be

methylated at mCG sites than at mCHG and mCHH sites (mCG, 16,187, ~59%; mCHG, 13,060,

~48%; mCHH, 14,438, ~53%; chi-squared test P<1e-100).

Consistent with the role of DNA methylation in silencing TEs (Zhang et al., 2006, Cokus et

al., 2008, Law and Jacobsen 2010), close to 92% of annotated TE genes (3,588 out of 3,903)

were methylated in one or more floral tissues, with only ~2.4% (93) being specific to one of

the tissues (Figure 2a, b). We found no significant differences in methylations at different mC

sequence contexts for TE genes (for all comparisons, chi-squared test P>0.1). Besides TE

genes, the Arabidopsis genome is annotated to contain 31,118 TEs, most of which do not carry

genes but may impair genome integrity after activation by transposases or reverse

transcriptases (Law and Jacobsen 2010). We thus examined the methylation of these TEs in

floral tissues. In contrast to the broad methylation of TE genes, we found that only 24% of TEs

(7,516 of 31,188) were methylated in meristems. Previous studies revealed strong associations

between TE methylation and actions of siRNAs (Lister et al., 2008, Ahmed et al., 2011). To

check how often siRNAs participated in the methylation of TEs in floral tissues, we obtained

the siRNA sequencing data for Arabidopsis seedlings (Chodavarapu et al., 2010), and found

~40% of TEs were potential targets of siRNAs (Figure S5). This relatively low hit ratio may

reflect tissue-specific expression of some siRNA species, siRNA-independent methylation of a substantial proportion

of TEs, or the possibility that silencing of TEs is primarily achieved

via silencing of TE genes.

Genes of different classes also showed differences in the patterns of methylation variation

during floral development. Significantly higher proportions of mCG-containing protein-

coding genes (42.1%) than mCHG-containing (28.7%) and mCHH-containing (23.2%)

protein-coding genes were differentially methylated between meristems and early flowers.

Similar patterns were observed for pseudogenes, miRNA, and ncRNA genes, though to a

lesser extent (data not shown), suggesting similar DNA methylation-related regulatory

mechanisms that might reflect common evolutionary origin and/or functional constraints.

In contrast, more mCHH-containing TE genes showed methylation variation than mCG or

mCHG TE genes. A closer look found that, for TE genes, methylations at mCHH sites were

primarily decreased, while those at mCG sites were mainly increased (Figure 2c).

Concordantly, TEs mainly showed increased methylations, which primarily occurred at mCG

sites and mCHH sites, although only 2124 TEs (6.8%) were differentially methylated between

meristem and early/late flower development stages (Figure S5). Taken together, TEs tended

to become hypermethylated as floral development proceeded, possibly because of greater

needs to protect the reproductive cells against the mutagenic activities of TEs (Slotkin et al.,

2009, Yang et al., 2011). Furthermore, CG methylation might play critical roles in the

increased methylation of TEs as well as TE genes. TEs can be classified into different

families, which may have different methylation patterns during floral development. Indeed,

the LTR/Gypsy family of retrotransposons, occupying 13.4% of all TEs in Arabidopsis,

accounted for 36.0% of methylated TEs, whereas the RC/Helitron type of DNA transposons,

representing 41.5% of all TEs, occupied only 11.7% of methylated TEs in the meristem

(Figure S5), indicating that retrotransposons were more tightly controlled by methylation than

DNA transposons. Furthermore, some TE families were more likely to be differentially

methylated than others. In brief, differentially methylated retrotransposons mainly fall into

the families of LTR/Copia, LTR/Gypsy and Line/L1, with little preference in mC context class

observed; differentially methylated DNA transposons primarily came from the families of

HAT and DNA/others (Figure S6).

Finally, we examined relative methylation levels (RKCM) across transcribed regions and the

surrounding genomic regions. We found similar profiles as previously reported, which were

also quite similar between different floral stages. For example, the methylation of protein-

coding genes mainly occurred at mCG sites and was confined to genic regions. The

methylation of TE genes showed similar patterns across sequence contexts: it primarily

occurred within genic regions, dropped sharply at the transcription start site

(TSS) and just after the transcription end site (TES), and rose again in adjacent regions further upstream

of the TSS and downstream of the TES (Figure 2d-f). Methylation in the regions of pseudogenes

showed patterns similar to those of TE genes, with levels intermediate between TE genes and protein-

coding genes, consistent with recent observations (Cokus et al., 2008). These results

suggested a primary role of MET1 (for mCG) in the methylation maintenance of protein-

coding genes, and comparable contributions between MET1, DRM1/2, and CMT2/3 (for

mCHG and mCHH) to the methylation of pseudogenes and TE genes (Chan et al., 2005,

Stroud et al., 2014).

DNA methylation at different genic regions differentially correlates with gene

expression

DNA methylation in promoter regions is often associated with transcriptional silencing

(Zhang et al., 2006), but recent studies have revealed distinct relationships between gene

expression and methylation in different genic regions (Brenet et al., 2011). To revisit this

question, we measured gene expression profiles by RNA-sequencing for the same three

tissues used for DNA methylation studies. Transcripts of 26,764 genes were detected in at

least one floral tissue, with 5,768 of them significantly differentially expressed between the

three tissues (see Experimental Procedures). To examine the relationships between expression

and methylation, genes were sorted into three equal-sized groups according to either

expression levels, or methylation intensities of a specific genic region (exon, intron, 1kb

upstream/downstream of TSS). We found a general trend of negative associations between

expression levels and methylation levels for all mC contexts and gene regions, especially gene

body regions (Figure 3a-c).
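
The grouping step described above, in which genes are sorted into three equal-sized groups by expression and then compared for methylation, can be sketched as follows. The input layout (a list of `(expression, methylation)` pairs per gene) and the function name are assumptions made for illustration; the same function works in the other direction if the pairs are swapped.

```python
from statistics import mean

def tercile_means(genes):
    """Split genes into low/medium/high thirds by the first value of each
    pair and return the mean of the second value within each third.

    genes: list of (expression, methylation) tuples; needs at least 3 genes.
    """
    if len(genes) < 3:
        raise ValueError("need at least three genes to form three groups")
    ranked = sorted(genes, key=lambda gene: gene[0])
    n = len(ranked)
    cuts = [0, n // 3, 2 * n // 3, n]
    labels = ["low", "medium", "high"]
    return {
        label: mean(meth for _, meth in ranked[cuts[k]:cuts[k + 1]])
        for k, label in enumerate(labels)
    }
```

A negative association of the kind reported here would appear as mean methylation decreasing from the low to the high expression group.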

We then compared the low and high expression genes for normalized methylation levels in

different genic regions, with exons and introns classified according to their positions relative

to TSS and TES sites. Consistent with previous findings (Zhang et al., 2006, Zemach et al.,

2010), genes in the high expression group were relatively hypomethylated at all mC contexts in

the 1kb upstream and downstream regions (Figure 3d-f) and first exons of genes in the high

expression group were methylated at lower levels than first exons of genes in the low

expression group, for all mC contexts (Brenet et al., 2011, Chuang et al., 2012). Notably,

methylation level in the first intron showed an even more significant negative correlation with

expression levels. Hence, regions near TSS could closely participate in methylation-

dependent transcriptional silencing. Furthermore, internal exons of genes with high

expression tended to be significantly highly methylated at mCG sites, but lowly methylated at

mCHG and mCHH sites; internal introns exhibited little or no association between expression

and methylation levels (Figure 3d-f). These observations suggested that the influences of

methylation on gene expression varied depending on genic regions and sequence contexts.

DNA methylations at different sequence contexts differentially associated with gene

expression levels during floral development

By comparing raw read numbers and RKCM values for each gene at each of the three

sequence contexts (Table S3, Figure S7), we identified 11,880 genes with statistically

significant methylation variations between meristem and the early flower, indicating that

DNA methylation in the gene body (body-methylation) is broadly regulated across the genome

(Figure 4a, Table S4). In addition, 2503 genes showed significant methylation variations in

their putative promoter regions during the same developmental period, consistent with the

relatively poor methylation of promoter regions (promoter-methylation) (Zhang et al., 2006).

Further comparison revealed that only 1235 genes were simultaneously differentially

methylated in both body and promoter regions during early flower development (Figure 4a),

suggesting that body-methylation and promoter-methylation were largely regulated

separately, possibly by independent mechanisms.

We identified 3,067 genes that showed significant variations in both methylation and gene

expression (termed co-differential genes hereafter) during floral development, with only 10%

(317 out of 3067) differentially methylated across all the three sequence contexts (Figure 4b),

suggesting sequence context-dependent effects of methylation on transcription. Moreover,

among the 3986 genes differentially expressed between meristems and early flowers, 2117

contained mCG sites, and 1048 (49.5%) of them were co-differential at mCG sites, compared

with 34.5% (601) of 1744 mCHG-containing genes and 26.9% (509) of 1894 mCHH-

containing genes being co-differential (Figure 4c, Table S5). We also observed that a

significantly higher proportion of transcriptionally up-regulated genes than of down-regulated genes was co-differential

between the meristem and early or late flowers,

a pattern consistent across all mC sequence contexts (Figure 4c). Hence,

variation in DNA methylation levels could likely function as a signal affecting transcription

during early flower development.
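
The co-differential percentages quoted above follow directly from the reported counts. As a quick check, the sketch below recomputes them; the dictionary layout and the helper name `percent` are illustrative only.

```python
# (co-differential genes, differentially expressed genes containing that
# context), meristem versus early flower, as reported in the text.
counts = {
    "mCG": (1048, 2117),
    "mCHG": (601, 1744),
    "mCHH": (509, 1894),
}

def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

co_differential_pct = {
    context: percent(co_diff, containing)
    for context, (co_diff, containing) in counts.items()
}
```

The recomputed values agree with the 49.5%, 34.5% and 26.9% given in the text.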

Body-methylation variations could serve as important transcriptional regulatory signals

during floral development

Our dataset provides clues to understand how the variation in methylation level can affect

development. Among the co-differential genes are several important floral developmental

regulators, including SEP1, LEUNIG, and SEEDSTICK (Figure 5a-c). SEP1 encodes a

MADS-box protein important for determining floral organ identity (Pelaz et al., 2000), and

was found to have dramatically elevated methylation and transcription levels during early

flower development. LEUNIG encodes an important repressor of AG, which is required for

the identity of stamens and carpels and for meristem determinacy (Mizukami and Ma 1992,

Conner and Liu 2000, Sridhar et al., 2004). We also found co-differential TE genes, including

AT1G64270 encoding the transposase for the Mutator-like DNA transposons. We found no

methylation in the putative promoter of AT1G64270, and its transcribed region was

dramatically demethylated during early flower development, which could have resulted in

its transcriptional suppression (Figure 5d). Hence, gene region demethylation could also

function to silence TEs whose mutagenic activities might destabilize the genome during

reproduction.

Interestingly, we found that the gain of methylation at the 3’ exons and the demethylation at the 5’

part of the gene DEMETER-LIKE 1 (DML1) were correlated with its transcriptional reduction

(RNA-seq RPKM: M, 46.3; E, 21.8; L, 21.2; Figure 5e). DML1 could function as a

transcriptional repressor and a demethylase, consistent with the genome-wide up-regulation

of important floral development regulators and the massive de novo methylation during early

flower development (Gong et al., 2002, Agius et al., 2006).

Methylated genes exhibited distinct patterns among sequence contexts/stages and were

enriched for diverse biological processes

To discover any shared methylation patterns among the 3,067 methylated genes, we

performed consensus clustering on the normalized methylation levels for the three mC

sequence contexts in three tissues (Figure 6a), resulting in 11 different clusters of distinct

methylation patterns. Genes in different clusters showed significantly different average

methylation levels and patterns across floral development stages (Figure 6b). For example,

genes in cluster I were highly methylated for mCG, mCHG and mCHH sites, with higher levels

in the early and late flower tissues than in the flower meristem, but those in cluster II were

relatively high for mCG, intermediate for mCHH, and low for mCHG, all in the same manner

for each developmental stage (Figure 6b). Clusters III to VI showed similar levels for two of

the three sequence contexts. Cluster VII had higher levels in the meristems than in the other

two tissues for all three sequence contexts, whereas cluster VIII showed a similar

developmental pattern for mCG sites only. Clusters IX to XI were similar in having increased

methylation levels from the meristem to both early and late flower development, but differed

in sequence contexts: cluster IX was similar for all three sequence contexts, whereas clusters

X and XI had increased methylation levels for only mCG and mCHH sites, respectively

(Figure 6b). These results support the hypothesis that demethylation and de novo methylation

processes by different methyltransferases during floral development largely occur

independently from each other.

We next performed gene ontology (GO) enrichment analysis to explore the associations

between methylation variation and gene functions. The enriched GO categories of methylated

genes suggested possible involvement in diverse biological processes, including meristem

development, floral organ development, mitosis, meiosis as well as plant body pattern

specification (Figure 6c, Table S6, Table S7). Several genes known for floral organ

development and reproduction processes were found among these genes, including MSH7,

VIM3, CHR42 and NUA (Figure 5f, Figure S8). Therefore, DNA methylation reprogramming

might contribute significantly to the regulation of floral development in Arabidopsis.

DISCUSSION

We used MspJI-seq to survey the genome-wide DNA methylation during Arabidopsis flower

development and uncovered a large number of potential de novo methylation sites in early

flower development and demethylation sites in late flower development, supporting the idea

that extensive de novo methylation as well as demethylation likely occurred during flower

development. Different methylation patterns for three sequence contexts (mCG, mCHG, mCHH)

and in different genic regions potentially have different effects on gene expression. The

whole-genome DNA methylation and gene expression patterns and derived hypotheses of

their interactions reveal more complex relationships than expected. Future functional studies

are needed to elucidate the control of DNA methylation and gene expression and to

understand the biological functions and mechanisms for regulating floral genes.
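
The three sequence contexts compared throughout are defined by the bases that follow a cytosine: CG, CHG and CHH, where H stands for A, C or T. As a minimal sketch only (it is not the in-house program used in this study, and the function names are hypothetical), a forward-strand classifier could look like the following. Cytosines on the reverse strand would need the reverse complement to be scanned in the same way.

```python
from collections import Counter
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at 0-based index i of a forward-strand sequence.

    Returns "CG", "CHG" or "CHH", or None when position i is not a C or
    when the downstream bases needed for the decision are missing/ambiguous.
    """
    seq = seq.upper()
    if i < 0 or i >= len(seq) or seq[i] != "C":
        return None
    nxt = seq[i + 1:i + 3]  # up to two downstream bases
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) < 2 or any(b not in "ACGT" for b in nxt):
        return None
    # At this point nxt[0] is H (A, C or T).
    return "CHG" if nxt[1] == "G" else "CHH"


def count_contexts(seq: str) -> Counter:
    """Count forward-strand cytosines per sequence context."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)
```

Calling `count_contexts` on a short sequence returns a per-context tally that can be compared between genic regions or tissues.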

The ap1 cal double mutant has been used widely to obtain relatively large amounts of

meristems (Wellmer et al., 2006, Kaufmann et al., 2010, Wuest et al., 2012), because the

mutant is arrested at an inflorescence meristem stage (Bowman et al., 1993, Ferrandiz et al.,

2000). However, we cannot rule out possible effects of the mutations on the DNA

methylation and gene expression patterns and need to be cautious in interpreting the results

on the differences between the ap1 cal meristem and the early and late flowers. Efficient

methods for separating and collecting wild-type meristems are needed for further studies to

confirm the differences relating to the meristems.

Different methylation patterns between mC sequence contexts and genic regions

The results here allow comprehensive analyses of many aspects of DNA methylation

patterns. The ~1.5 million methylation sites from three flower tissues included 453,066 mCG,

425,428 mCHG and 685,100 mCHH sites, representing a dramatic increase of 37% in the

detection of mCHH sites from the previously reported ~500 thousand mCHH sites in young

flowers (Lister et al., 2008). Previously CHH methylation was found to play important roles

in various plant developmental processes in Arabidopsis endosperm, maize and cotton fiber

(Hsieh et al., 2009, Gent et al., 2013, Jin et al., 2013). The relatively high proportion of

newly methylated mCHH sites reported here (Figure 1d) and correlation of >1000 genes with

variation in CHH methylation (Figure 4b) suggest that CHH methylation might be important

for floral gene expression.
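
The site totals and the 37% figure quoted above can be checked with a few lines of arithmetic. In this sketch the previous mCHH count is taken as exactly 500,000, which is an assumption: the text only gives "~500 thousand".

```python
# Site counts reported for the three flower tissues combined.
sites = {"mCG": 453_066, "mCHG": 425_428, "mCHH": 685_100}

total_sites = sum(sites.values())  # 1,563,594, i.e. ~1.5 million

# Assumed value for the previously reported "~500 thousand" mCHH sites.
previous_mchh = 500_000
increase_pct = 100 * (sites["mCHH"] - previous_mchh) / previous_mchh  # about 37%

# Share of each context among all detected sites.
shares = {ctx: n / total_sites for ctx, n in sites.items()}
```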

In addition, our observations that methylation patterns and developmental changes in

methylation vary depending on sequence contexts and gene classes suggest complex

relationships between sequences/genes, methylation and developmental gene functions. For

example, the number of mCs increased from the meristems to early flowers for all sequence

contexts, but only mCG sites increased in number from the early to late flowers. The period

from meristem to early flower involves mainly organ identity specification and

organogenesis, whereas the subsequent period includes much of floral organ growth, as

well as gametophyte development. Therefore, our results suggest that gene expression

associated with mCG sites might play a greater role in floral organ growth. The findings that

protein-coding genes showed preferential enrichment of mCG sites and depletion of mCHG

and mCHH sites in the transcribed regions are consistent with previous results (Zhang et al.,

2006, Cokus et al., 2008, Feng et al., 2010, Zemach et al., 2010). In contrast, TE genes and

pseudogenes showed similar frequencies of mCs of different sequence contexts across

transcribed and nearby genomic regions, with mCs enriched for transcribed regions and

depleted near the transcription start sites (Figure 2d-f). Therefore, protein-coding genes and

other types of genes are likely affected by DNA methylation in different manners. Moreover,

few genes showed simultaneous changes in methylation and expression levels in all

sequence contexts, suggesting that the influences on expression of mCs in different sequence contexts

tend to be unrelated to each other (Figure 4b).

Our findings also support the idea that the relationships between DNA methylation and gene

expression levels are more complicated than previously thought (Zhang et al., 2006, Suzuki

and Bird 2008, Ball et al., 2009). Promoter methylation is often linked with gene expression

suppression, whereas the role of gene body methylation is far more uncertain, with both

positive and negative relationships reported (Zhang et al., 2006, Zilberman et al., 2007, Li et

al., 2008, Zemach et al., 2010). These contradictory findings can be explained when

considering the observations that DNA methylation interacts with other factors, including

histone modifications and small interfering RNAs, to determine transcriptional status (Li et

al., 2008, Stroud et al., 2014). In addition, the effect of DNA methylation on expression could

also depend on genic regions or sequence contexts. In humans, methylation of the first exons

was more significantly associated with gene silencing than that of the nearby promoter (Brenet et

al., 2011). Similarly, our observation in Arabidopsis of strong negative relationships between

DNA methylation of the first exons and expression for all sequence contexts (Figure 3)

suggests that the relevant mechanisms might be conserved between animals and plants. The

even stronger negative effects of the first intron methylation presented here are consistent

with the fact that first introns of eukaryotic genes often carry regulatory elements for

transcription (Majewski and Ott 2002, Bradnam and Korf 2008, Bieberstein et al., 2012). We

also found that the methylation at mCG sites of internal exons tended to be positively

correlated with gene expression, unlike the mCHG and mCHH sites of internal exons and all

sequence contexts of internal introns (Figure 3). Therefore, our separate analyses regarding

sequence contexts and genic regions revealed that the effects of DNA methylation on gene

expression were not only position (genic region)-dependent, but also sequence context-

dependent, allowing further understanding of the relationships between methylation and gene

expression.

Implications of functional roles of DNA methylation in flower development

Changes in genome-wide DNA methylation levels have been associated with plant

development, including global demethylation in the endosperm of the developing seed

compared with the embryo (Gehring et al., 2009, Hsieh et al., 2009) and lower methylation in

the central cell of the female gametophyte than in somatic cells (Jullien and Berger 2010).

Previous studies also suggested that DNA methylation level tends to increase during

development from seedling through vegetative stages to floral stages (Ruiz-Garcia et al.,

2005). Our analysis of differential methylation between three floral tissues revealed that more

sites are methylated de novo than demethylated from meristem to early flowers, whereas

similar numbers of sites were methylated and demethylated from early to late flower

development (Figure 1e-g). These findings suggest that plant DNA methylation is under

critical control during development and that its proper repatterning is important for the

developmental program.

The importance of DNA methylation for development has been demonstrated by genetic

studies of methyltransferase genes or chromatin remodelers, including MET1 and DDM1,

with effects on embryogenesis, meristem identity and flowering time (Finnegan et al., 1996,

Kakutani et al., 1996, Ronemus et al., 1996, Xiao et al., 2006). The effects of DNA

methylation are at least in part via modulating transcription of developmental regulators

(Xiao et al., 2006, Li et al., 2008, Gehring et al., 2009, Hsieh et al., 2009). In addition, DNA

methylation regulates genes playing critical roles during flower development; for example,

hypermethylation of SUPERMAN and AG phenocopies the corresponding mutants (Jacobsen

et al., 2000). Our finding that DNA methylation is associated with changes in the expression

of over 3000 genes suggests that methylation affects genes with diverse roles during flower

development, such as regulation of early flower development (33 genes) and pollen

development (21 genes) (Table S6). In addition, the expression of regulatory genes is

probably affected by methylation, including genes for transcriptional control (201), chromatin

organization (29), and signal transduction (56).

In particular, changes in DNA methylation were linked to expression changes of several key

regulators. For instance, the expression of floral regulators SEP1 and SEEDSTICK could be

affected by DNA methylation (Figure 5a,c). An effect of DNA methylation on the C function

gene AG was previously reported when hypermethylated epi-alleles of AG were found in

plants with reduced global methylation (Jacobsen et al., 2000). Our data suggest that DNA

methylation could indirectly influence the expression of AG in wild-type plants, similar to the

case of the FLC gene (Finnegan et al., 2005). Decreased methylation in early and late flowers

could have resulted in down-regulation of LEUNIG (Figure 5b), a negative regulator of AG

(Sridhar et al., 2004). In addition, methylation might also function to activate BLH9 and/or

suppress PERIANTHIA (Figure S8a), both encoding transcription factors that are required for

proper floral expression of AG (Bao et al., 2004, Das et al., 2009). On the other hand, the

methyl-cytosine binding proteins VIM1, VIM2 and VIM3, involved in hypermethylation of the

flowering-time gene FWA and its subsequent suppression (Woo et al., 2008), were observed

to have high gene expression levels that might also be affected by methylation (Figure S8b),

suggesting deeper involvement of DNA methylation in flowering transition.

Our results also suggest that DNA methylation might impact other cellular and developmental

processes during reproduction, such as pollen tube growth, mitosis and meiosis. For example,

GO enrichment analysis revealed the overrepresentation of 21 genes associated with pollen

development in cluster VI of methylated genes (Figure 6, Table S7) and the correlated changes

in methylation and expression between the three floral tissues/stages of genes participating in

meiosis, including AtMSH7, AtMSH4, AtDMC1 and AtSMC3 (Figure 5f, Table S4),

suggesting that DNA methylation could be involved in the floral development program as well

as in embryogenesis and other aspects of plant development. Our analyses provide

insights into the possible roles of DNA methylation in gene expression and generate

important resources for further investigation of the genetic pathways that regulate flower

development.

EXPERIMENTAL PROCEDURES

Plant growth, tissues collection, and DNA isolation

Arabidopsis thaliana plants of the Landsberg ecotype homozygous for the erecta mutation (Ler)

and ap1 cal mutant plants (also in Ler) were grown in soil in a plant growth room at 22 °C

under a 16 hours light/8 hours dark cycle. The meristems of the ap1 cal mutant plants, the early

flowers (stages 1-9), and the late flowers (stages 10-12) were collected separately (Figure S9).

The material for each sample was pooled from many individuals and collected in large amounts

(2 g) to reduce the effect of biological variation. Genomic DNA was extracted using the

DNeasy Plant Mini Kit (Qiagen), precipitated with 2 volumes of EtOH and 1/10 volume of 3

M NaAc (pH 5.2), and resuspended in Tris-EDTA buffer (pH 8.0) to a final concentration

of 1 μg/μl.

MspJI digestion and DNA recovery

Five μg of each DNA sample was added to 90 μl of a well-mixed, nuclease-free digestion mix

containing NEBuffer 4 and BSA before the addition of 20 units of MspJI enzyme (New England

Biolabs). Samples were incubated at 37 °C for 16 hours, which gave effective digestion

without obvious DNA degradation (Figure S10a).

After digestion, the DNAs were separated in a 20% polyacrylamide (PAA, acr/bis: 29:1) gel

(50 mA, 2.5 hours). The PAA gel pieces with DNA around 32 bp (Figure S10a) were excised,

crushed and transferred into sterile microfuge tubes containing 300 μl

of buffer (0.3 M sodium acetate, pH 7.5, with 0.1 mM EDTA), and were shaken overnight at

37 °C. The gel debris was then pelleted by centrifugation and the supernatant was collected. DNA

was precipitated by adding 2 μl of glycogen and 2 volumes of 100% ethanol to each tube

and placing the tubes at -80 °C for 30 min, followed by centrifugation for 30 min at 14,000 rpm at

4 °C. The DNA pellets were washed twice with 70% ethanol before resuspension in Tris-EDTA

buffer (pH 8.0).

Construction of DNA methylation fragment library for sequencing

The recovered DNA samples were used to construct sequencing libraries according to the

fragment library preparation protocol of the SOLiD System Library Preparation Guide with

some modifications. Both ends of the recovered DNA were repaired with End Polishing

Enzymes 1 and 2 (reagents of the SOLiD™ Fragment Library Construction Kit, S3100102) and

purified with the QIAquick Nucleotide Removal Kit (Qiagen) using the spin columns from

the MinElute Reaction Cleanup Kit (Qiagen). After purification, the P1 and P2 Adaptors

(reagents of the SOLiD™ Fragment Library Construction Kit) were ligated to the DNA

fragments, followed by another purification step using the QIAquick Nucleotide

Removal Kit. Recovered DNA libraries were nick-translated and amplified by PCR for 10 cycles,

then purified with the QIAquick Nucleotide Removal Kit. Purified DNA libraries were

visualized by 4% agarose gel electrophoresis, and the bands of about 100 bp (Figure S10b) were

recovered using the QIAquick Gel Extraction Kit (Qiagen). The resulting libraries were

analyzed on a Bioanalyzer (Agilent, Santa Clara, USA), and 0.5 pM of each DNA library was

used to perform the emulsion PCR reactions following the Templated Bead Preparation

Guide. The 35-bp sequence reads were obtained using the SOLiD™ 3 system, and were

subsequently aligned against the Arabidopsis reference genome sequences (TAIR10,

http://www.arabidopsis.org) using the software Bioscope (version 1.3) provided by Life

Technologies (see Methods S1). The overall mapping rates were ~61% (Table S1).

DNA methylome analysis

We developed an in-house program to identify SOLiD reads supporting methyl-cytosines. In

brief, six cases of paired sequence patterns were inferred based on the MspJI digestion

properties, with each case resulting in DNA fragments that could be recovered (Figure S11,

Methods S2). Aligned reads that can be classified into one of the six patterns were identified

as methyl-reads, each supporting two potential mCs. MspJI cuts double-stranded DNA at fixed

distances downstream of the recognized sequence (mCNNR), with the cleavage on the reverse

strand wobbling by one nucleotide (16 or 17) (Zheng et al., 2010). Methyl-reads were

counted separately for each of the two cleavage patterns (N12/N16 or N12/N17) resulting in an

N16/N17 ratio of ~3:1 (Table S8). The count of reads supporting methyl-cytosines was

calculated for each genomic feature (genes, exons etc.) and sequence context separately, and

normalized as reads per kilo-base of potentially digested CNNR sites per million of mapped

reads (RKCM). Fisher’s exact test was performed on the read counts of each gene in two different

tissues, and a gene was defined as differentially methylated when the p-value was <0.001 and the

ratio of RKCM values showed a >2-fold change.
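
The normalization and the differential-methylation rule described above can be sketched as follows. This is an illustration under stated assumptions, not the in-house program: the function names are hypothetical, the "kilo-base of potentially digested CNNR sites" is interpreted as the number of CNNR sites in the feature divided by 1,000, and the 2x2 table for Fisher's exact test is assumed to contrast a gene's methyl-reads with all other mapped reads in each tissue. The source does not specify either detail.

```python
import math


def rkcm(methyl_reads: int, cnnr_sites: int, mapped_reads: int) -> float:
    """Reads per thousand CNNR sites per million mapped reads (assumed reading)."""
    if cnnr_sites <= 0 or mapped_reads <= 0:
        raise ValueError("cnnr_sites and mapped_reads must be positive")
    return methyl_reads * 1e9 / (cnnr_sites * mapped_reads)


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x: int) -> float:
        # Hypergeometric log-probability of observing x in the top-left cell.
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    observed = log_p(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    p = sum(math.exp(lp) for lp in map(log_p, range(lo, hi + 1))
            if lp <= observed + 1e-7)
    return min(1.0, p)


def is_differential(reads1: int, reads2: int, mapped1: int, mapped2: int,
                    cnnr_sites: int, p_cut: float = 0.001,
                    fold_cut: float = 2.0) -> bool:
    """Apply the p < 0.001 and > 2-fold RKCM criteria to one gene."""
    p = fisher_exact_two_sided(reads1, mapped1 - reads1, reads2, mapped2 - reads2)
    r1 = rkcm(reads1, cnnr_sites, mapped1)
    r2 = rkcm(reads2, cnnr_sites, mapped2)
    low, high = min(r1, r2), max(r1, r2)
    if low == 0:
        fold = math.inf if high > 0 else 1.0
    else:
        fold = high / low
    return p < p_cut and fold > fold_cut
```

Working in log space with `math.lgamma` keeps the test usable when the mapped-read totals run into the millions, where direct factorials would be impractical.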

Gene ontology enrichment analysis was performed using the R package topGO (Alexa and

Rahnenfuhrer 2010). The graphs in Figure 5 displaying the DNA methylation and gene

transcription profiles were produced using the R package Gviz. Consensus clustering on the

log2-transformed RKCM values of differentially methylated genes was performed by the R

package clusterCons (Simpson et al., 2010), using the partitioning around medoids (PAM)

algorithm, which partitions the data into k clusters.
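
To illustrate what partitioning around medoids does, the sketch below implements a small PAM (greedy build phase followed by swap refinement) on Euclidean distances. It is a hypothetical, simplified stand-in written in Python: it does not reproduce the clusterCons API, and it omits the resampling step that consensus clustering adds on top of PAM. Each row is assumed to be one gene's vector of log2-transformed RKCM values.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple


def pam(points: Sequence[Sequence[float]], k: int,
        max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Return (medoid indices, cluster label per point) for k clusters."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    def cost(medoids: List[int]) -> float:
        # Total distance from every point to its nearest medoid.
        return sum(min(dist[i][m] for m in medoids) for i in range(n))

    # Build phase: start with the most central point, then add greedily.
    medoids = [min(range(n), key=lambda c: sum(dist[i][c] for i in range(n)))]
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: cost(medoids + [c])))

    # Swap phase: replace a medoid with a non-medoid whenever that lowers the cost.
    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_cost = cost(trial)
            if trial_cost < best - 1e-12:
                medoids, best, improved = trial, trial_cost, True
        if not improved:
            break

    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels
```

Because medoids are actual data points, each cluster can be summarized by a representative gene rather than by an averaged profile.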

Total RNA isolation and transcriptome analysis

All three tissues were frozen in liquid nitrogen immediately after collection. Total RNAs

were extracted using the RNeasy Plant Mini Kit (Qiagen). More than 2 μg of total RNA with an

A260/280 ratio between 1.8 and 2.0 from each sample was used to make the libraries, which were

deep sequenced on an Illumina HiSeq 2000 system to obtain 100-bp paired-end

reads. Reads were mapped to the Arabidopsis genome (TAIR10) with TopHat (Trapnell et al.,

2009), allowing fewer than 2 mismatches. Expression value calculation and differential

expression analysis were conducted using GFOLD (Feng et al., 2012) with default

parameters.
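
Expression values in the Results are quoted as RPKM. As a hedged illustration, the standard reads-per-kilobase-per-million definition and a plain log2 ratio are sketched below. The function names are hypothetical, and the simple ratio is not the statistic GFOLD reports, since GFOLD computes its own generalized fold change.

```python
import math


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and mapped read total must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(rpkm_a: float, rpkm_b: float, pseudocount: float = 0.0) -> float:
    """log2(b / a); a pseudocount can be added to avoid division by zero."""
    return math.log2((rpkm_b + pseudocount) / (rpkm_a + pseudocount))


# DML1 RPKM values reported in the Results: meristem 46.3, early flower 21.8.
dml1_m_to_e = log2_fold_change(46.3, 21.8)  # negative: roughly a two-fold reduction
```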

ACKNOWLEDGEMENTS

We thank Dr. Zhiyi Sun from New England Biolabs for discussion about computational

identification of the MspJI digested methylation sites. We appreciate the data kindly provided

by Dr. Steven Jacobsen and Matteo Pellegrini. This work was supported by the Ministry of

Science and Technology of the People’s Republic of China (MOST 2012CB910503), the

National Natural Science Foundation of China (31130006 and 31371330), and the start-up

funds from Fudan University to FC.

SUPPORTING INFORMATION

SUPPORTING FIGURES

Figure S1. Validation of selected methyl-cytosine sites by PCR experiments.

Figure S2. Methylation profiles determined by MspJI-seq and BS-seq were consistent for

most of the randomly selected genes and genomic regions.

Figure S3. Density of genes and TEs across Arabidopsis genome.

Figure S4. Percentages of genes of each class among all methylated genes.

Figure S5. Methylation of TEs during Arabidopsis floral development.

Figure S6. The enrichment of TEs of different families in different mC sequence contexts.

Figure S7. The distribution of normalized methylation levels of each mC context for genes of

different classes.

Figure S8. Examples of genes with correlated variations in methylation and expression

levels.

Figure S9. Phenotypes of the three Arabidopsis floral stages used for experiments.

Figure S10. MspJI digestion and DNA library recovery.

Figure S11. Identification of mCs based on MspJI-seq.

SUPPORTING TABLES

Table S1. Summary of SOLiD reads sequenced and mapped against the Arabidopsis

reference genome and reads that were identified as arising from MspJI digestion for each

possible recognition site pattern.

Table S2. Primers used in PCR experiments for selected gene regions digested by MspJI.

Table S3. The DNA methylation and expression levels for genes in Arabidopsis flowers.

Table S4. Arabidopsis genes differentially methylated and differentially expressed during

floral development.

Table S5. Statistics of differentially expressed and methylated genes between flower

meristems and early flowers.

Table S6. Significantly enriched biological processes for each gene cluster in Figure 6.

Table S7. Number of genes annotated to enriched GO terms for each gene cluster in Figure 6.

Table S8. The relative frequencies of the wobble cut positions of MspJI.

SUPPORTING EXPERIMENTAL PROCEDURES

Methods S1. Mapping of SOLiD short sequencing reads

Methods S2. Identification of methyl-cytosines (mCs) based on MspJI-seq

REFERENCES

Agius, F., Kapoor, A. and Zhu, J.K. (2006) Role of the Arabidopsis DNA glycosylase/lyase ROS1 in active DNA demethylation. Proc. Natl Acad. Sci. USA, 103, 11796-11801.

Ahmed, I., Sarazin, A., Bowler, C. et al. (2011) Genome-wide evidence for local DNA methylation spreading from small RNA-targeted sequences in Arabidopsis. Nucleic Acids Res., 39, 6919-6931.

Alexa, A. and Rahnenfuhrer, J. (2010) topGO: Enrichment analysis for Gene Ontology.

Ball, M.P., Li, J.B., Gao, Y. et al. (2009) Targeted and genome-scale strategies reveal gene-body methylation signatures in human cells. Nat. Biotech., 27, 361-368.

Bao, X., Franks, R.G., Levin, J.Z. et al. (2004) Repression of AGAMOUS by BELLRINGER in floral and inflorescence meristems. Plant Cell, 16, 1478-1489.

Bieberstein, N.I., Carrillo Oesterreich, F., Straube, K. et al. (2012) First exon length controls active chromatin signatures and transcription. Cell Rep., 2, 62-68.

Bird, A. (2002) DNA methylation patterns and epigenetic memory. Genes Dev., 16, 6-21.

Bowman, J.L., Alvarez, J., Weigel, D. et al. (1993) Control of flower development in Arabidopsis thaliana by APETALA1 and interacting genes. Development, 119, 721-743.

Bradnam, K.R. and Korf, I. (2008) Longer first introns are a general property of eukaryotic gene structure. PLoS One, 3, e3093.

Brenet, F., Moh, M., Funk, P. et al. (2011) DNA methylation of the first exon is tightly linked to transcriptional silencing. PLoS One, 6, e14524.

Chan, S.W., Henderson, I.R. and Jacobsen, S.E. (2005) Gardening the genome: DNA methylation in Arabidopsis thaliana. Nat. Rev. Genet., 6, 351-360.

Chang, F., Wang, Y., Wang, S. et al. (2011) Molecular control of microsporogenesis in Arabidopsis. Curr. Opin. Plant Biol., 14, 66-73.

Chodavarapu, R.K., Feng, S., Bernatavichute, Y.V. et al. (2010) Relationship between nucleosome positioning and DNA methylation. Nature, 466, 388-392.

Chuang, T.J., Chen, F.C. and Chen, Y.Z. (2012) Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons. Proc. Natl Acad. Sci. USA, 109, 15841-15846.

Cohen-Karni, D., Xu, D., Apone, L. et al. (2011) The MspJI family of modification-dependent restriction endonucleases for epigenetic studies. Proc. Natl Acad. Sci. USA, 108, 11040-11045.

Cokus, S.J., Feng, S., Zhang, X. et al. (2008) Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature, 452, 215-219.

Conner, J. and Liu, Z. (2000) LEUNIG, a putative transcriptional corepressor that regulates AGAMOUS expression during flower development. Proc. Natl Acad. Sci. USA, 97, 12902-12907.

Das, P., Ito, T., Wellmer, F. et al. (2009) Floral stem cell termination involves the direct regulation of AGAMOUS by PERIANTHIA. Development, 136, 1605-1611.

Feng, J., Meyer, C.A., Wang, Q. et al. (2012) GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics, 28, 2782-2788.

Feng, S., Cokus, S.J., Zhang, X. et al. (2010) Conservation and divergence of methylation patterning in plants and animals. Proc. Natl Acad. Sci. USA, 107, 8689-8694.

Ferrandiz, C., Gu, Q., Martienssen, R. et al. (2000) Redundant regulation of meristem identity and plant architecture by FRUITFULL, APETALA1 and CAULIFLOWER. Development, 127, 725-734.

Finnegan, E.J., Kovac, K.A., Jaligot, E. et al. (2005) The downregulation of FLOWERING LOCUS C (FLC) expression in plants with low levels of DNA methylation and by vernalization occurs by distinct mechanisms. Plant J., 44, 420-432.

Finnegan, E.J., Peacock, W.J. and Dennis, E.S. (1996) Reduced DNA methylation in Arabidopsis thaliana results in abnormal plant development. Proc. Natl Acad. Sci. USA, 93, 8449-8454.

Gan, E.-S., Huang, J. and Ito, T. (2013) Functional roles of histone modification, chromatin remodeling and microRNAs in Arabidopsis flower development. In International Review of Cell and Molecular Biology (Kwang, W.J. ed.): Academic Press, pp. 115-161.

Ge, X., Chang, F. and Ma, H. (2010) Signaling and transcriptional control of reproductive development in Arabidopsis. Curr. Biol., 20, R988-R997.

Gehring, M., Bubb, K.L. and Henikoff, S. (2009) Extensive demethylation of repetitive elements during seed development underlies gene imprinting. Science, 324, 1447-1451.

Gent, J.I., Ellis, N.A., Guo, L. et al. (2013) CHH islands: de novo DNA methylation in near-gene chromatin regulation in maize. Genome Res., 23, 628-637.

Goll, M.G. and Bestor, T.H. (2005) Eukaryotic cytosine methyltransferases. Annu. Rev. Biochem., 74, 481-514.

Gomez-Mena, C., de Folter, S., Costa, M.M. et al. (2005) Transcriptional program controlled by the floral homeotic gene AGAMOUS during early organogenesis. Development, 132, 429-438.

Gong, Z., Morales-Ruiz, T., Ariza, R.R. et al. (2002) ROS1, a repressor of transcriptional gene silencing in Arabidopsis, encodes a DNA glycosylase/lyase. Cell, 111, 803-814.

He, X.J., Chen, T. and Zhu, J.K. (2011) Regulation and function of DNA methylation in plants and animals. Cell Res., 21, 442-465.

Horton, J.R., Mabuchi, M.Y., Cohen-Karni, D. et al. (2012) Structure and cleavage activity of the tetrameric MspJI DNA modification-dependent restriction endonuclease. Nucleic Acids Res., 40, 9763-9773.

Hsieh, T.F., Ibarra, C.A., Silva, P. et al. (2009) Genome-wide demethylation of Arabidopsis endosperm. Science, 324, 1451-1454.

Huang, X., Lu, H., Wang, J.W. et al. (2013) High-throughput sequencing of methylated cytosine enriched by modification-dependent restriction endonuclease MspJI. BMC Genet., 14, 56.

The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796-815.

Jacobsen, S.E., Sakai, H., Finnegan, E.J. et al. (2000) Ectopic hypermethylation of flower-specific genes in Arabidopsis. Curr. Biol., 10, 179-186.

Jin, X., Pang, Y., Jia, F. et al. (2013) A potential role for CHH DNA methylation in cotton fiber growth patterns. PLoS One, 8, e60547.

Jones, L., Hamilton, A.J., Voinnet, O. et al. (1999) RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell, 11, 2291-2301.

Jullien, P.E. and Berger, F. (2010) DNA methylation reprogramming during plant sexual reproduction? Trends Genet., 26, 394-399.

Jullien, P.E., Susaki, D., Yelagandula, R. et al. (2012) DNA methylation dynamics during sexual reproduction in Arabidopsis thaliana. Curr. Biol., 22, 1825-1830.

Kakutani, T., Jeddeloh, J.A., Flowers, S.K. et al. (1996) Developmental abnormalities and epimutations associated with DNA hypomethylation mutations. Proc. Natl Acad. Sci. USA, 93, 12406-12411.

Kaufmann, K., Wellmer, F., Muino, J.M. et al. (2010) Orchestration of floral initiation by APETALA1. Science, 328, 85-89.

Law, J.A. and Jacobsen, S.E. (2010) Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat. Rev. Genet., 11, 204-220.

Li, X., Wang, X., He, K. et al. (2008) High-resolution mapping of epigenetic modifications of the rice genome uncovers interplay between DNA methylation, histone methylation, and gene expression. Plant Cell, 20, 259-276.

Lippman, Z., Gendrel, A.V., Black, M. et al. (2004) Role of transposable elements in heterochromatin and epigenetic control. Nature, 430, 471-476.

Lister, R., O'Malley, R.C., Tonti-Filippini, J. et al. (2008) Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell, 133, 523-536.

Ma, H. (2005) Molecular genetic analyses of microsporogenesis and microgametogenesis in flowering plants. Annu. Rev. Plant Biol., 56, 393-434.

Majewski, J. and Ott, J. (2002) Distribution and characterization of regulatory elements in the human genome. Genome Res., 12, 1827-1836.

Martienssen, R.A. and Colot, V. (2001) DNA methylation and epigenetic inheritance in plants and filamentous fungi. Science, 293, 1070-1074.

Mette, M.F., Aufsatz, W., van der Winden, J. et al. (2000) Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J., 19, 5194-5201.

Mizukami, Y. and Ma, H. (1992) Ectopic expression of the floral homeotic gene AGAMOUS in transgenic Arabidopsis plants alters floral organ identity. Cell, 71, 119-131.

Park, Y.D., Papp, I., Moscone, E.A. et al. (1996) Gene silencing mediated by promoter homology occurs at the level of transcription and results in meiotically heritable alterations in methylation and gene activity. Plant J., 9, 183-194.


Pelaz, S., Ditta, G.S., Baumann, E. et al. (2000) B and C floral organ identity functions require SEPALLATA MADS-box genes. Nature, 405, 200-203.

Ronemus, M.J., Galbiati, M., Ticknor, C. et al. (1996) Demethylation-induced developmental pleiotropy in Arabidopsis. Science, 273, 654-657.

Ruiz-Garcia, L., Cervera, M.T. and Martinez-Zapater, J.M. (2005) DNA methylation increases throughout Arabidopsis development. Planta, 222, 301-306.

Schmitz, R.J., Schultz, M.D., Urich, M.A. et al. (2013) Patterns of population epigenomic diversity. Nature, 495, 193-198.

Simpson, T.I., Armstrong, J.D. and Jarman, A.P. (2010) Merged consensus clustering to assess and improve class discovery with microarray data. BMC Bioinformatics, 11, 590.

Slotkin, R.K., Vaughn, M., Borges, F. et al. (2009) Epigenetic reprogramming and small RNA silencing of transposable elements in pollen. Cell, 136, 461-472.

Song, Y., Ma, K., Ci, D. et al. (2013) Sexual dimorphic floral development in dioecious plants revealed by transcriptome, phytohormone, and DNA methylation analysis in Populus tomentosa. Plant Mol. Biol., 83, 559-576.

Soppe, W.J., Jacobsen, S.E., Alonso-Blanco, C. et al. (2000) The late flowering phenotype of fwa mutants is caused by gain-of-function epigenetic alleles of a homeodomain gene. Mol. Cell, 6, 791-802.

Sridhar, V.V., Surendrarao, A., Gonzalez, D. et al. (2004) Transcriptional repression of target genes by LEUNIG and SEUSS, two interacting regulatory proteins for Arabidopsis flower development. Proc. Natl Acad. Sci. USA, 101, 11494-11499.

Stam, M., Viterbo, A., Mol, J.N. et al. (1998) Position-dependent methylation and transcriptional silencing of transgenes in inverted T-DNA repeats: implications for posttranscriptional silencing of homologous host genes in plants. Mol. Cell Biol., 18, 6165-6177.

Stroud, H., Do, T., Du, J. et al. (2014) Non-CG methylation patterns shape the epigenetic landscape in Arabidopsis. Nat. Struct. Mol. Biol., 21, 64-72.

Suzuki, M.M. and Bird, A. (2008) DNA methylation landscapes: provocative insights from epigenomics. Nat. Rev. Genet., 9, 465-476.

Trapnell, C., Pachter, L. and Salzberg, S.L. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 25, 1105-1111.

Wellmer, F., Alves-Ferreira, M., Dubois, A. et al. (2006) Genome-wide analysis of gene expression during early Arabidopsis flower development. PLoS Genet., 2, e117.

Woo, H.R., Dittmer, T.A. and Richards, E.J. (2008) Three SRA-domain methylcytosine-binding proteins cooperate to maintain global CpG methylation and epigenetic silencing in Arabidopsis. PLoS Genet., 4, e1000156.

Wuest, S.E., O'Maoileidigh, D.S., Rae, L. et al. (2012) Molecular basis for the specification of floral organs by APETALA3 and PISTILLATA. Proc. Natl Acad. Sci. USA, 109, 13452-13457.

Xiao, W., Custard, K.D., Brown, R.C. et al. (2006) DNA methylation is critical for Arabidopsis embryogenesis and seed viability. Plant Cell, 18, 805-814.

Yang, H., Lu, P., Wang, Y. et al. (2011) The transcriptome landscape of Arabidopsis male meiocytes from high-throughput sequencing: the complexity and evolution of the meiotic process. Plant J., 65, 503-516.

Zemach, A., McDaniel, I.E., Silva, P. et al. (2010) Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science, 328, 916-919.


Zhang, X., Yazaki, J., Sundaresan, A. et al. (2006) Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. Cell, 126, 1189-1201.

Zheng, Y., Cohen-Karni, D., Xu, D. et al. (2010) A unique family of Mrr-like modification-dependent restriction endonucleases. Nucleic Acids Res., 38, 5527-5534.

Zhong, S., Fei, Z., Chen, Y.R. et al. (2013) Single-base resolution methylomes of tomato fruit development reveal epigenome modifications associated with ripening. Nat. Biotechnol., 31, 154-159.

Zilberman, D., Cao, X., Johansen, L.K. et al. (2004) Role of Arabidopsis ARGONAUTE4 in RNA-directed DNA methylation triggered by inverted repeats. Curr. Biol., 14, 1214-1220.

Zilberman, D., Gehring, M., Tran, R.K. et al. (2007) Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat. Genet., 39, 61-69.

Figure Legends

Figure 1. The DNA methylome landscape of Arabidopsis flowers.

(a) Percentage of identified mCs for each sequence context. Numbers atop each bar indicate the count of methyl-cytosines. (b) Percentages of mCs found in each genomic region. (c) Display of log2-transformed methylation levels (RKCM values; scale shown on upper right) for 100kb sliding windows (step size 50kb) for each sequence context, floral stage, and genomic region. M, meristem; E, early flower; L, late flower. Genomic coordinates on concatenated chromosomes are indicated at the bottom of the heatmaps. (d) Proportions of floral stage-specific as well as stage-common mCs across the genome (200kb-long sliding window; step size, 100kb). ME, ML, EL, methyl-cytosines shared by two tissues; MEL, common to all three tissues. (e-g) Comparison of mCs between developmental stages for mCG, mCHG, and mCHH sites, respectively.
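The sliding-window summary described in panel (c) can be illustrated with a minimal sketch. It assumes per-site methylation levels are already available as (position, level) pairs; the function names, the use of the window mean, and the pseudocount added before the log2 transform are illustrative assumptions, not details taken from the legend.

```python
import math

def window_starts(length, window=100_000, step=50_000):
    """Start coordinates of sliding windows that fit inside [0, length)."""
    return list(range(0, max(length - window, 0) + 1, step))

def log2_window_levels(sites, length, window=100_000, step=50_000, pseudocount=1.0):
    """Summarise per-site levels in sliding windows.

    sites: list of (position, level) pairs (hypothetical input format).
    Returns a list of (window_start, log2(mean level + pseudocount)).
    The mean and the pseudocount are assumptions made for this sketch.
    """
    result = []
    for start in window_starts(length, window, step):
        end = start + window
        levels = [level for pos, level in sites if start <= pos < end]
        mean = sum(levels) / len(levels) if levels else 0.0
        result.append((start, math.log2(mean + pseudocount)))
    return result

# Toy example on a 250 kb sequence with three methylated sites.
toy_sites = [(10_000, 3.0), (60_000, 1.0), (120_000, 7.0)]
toy_levels = log2_window_levels(toy_sites, 250_000)
```

With a 50kb step and a 100kb window, consecutive windows overlap by half, so each site contributes to up to two windows.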

Figure 2. Distinct methylation patterns for genes in Arabidopsis flowers.

(a) Percentage of methylated genes for each tissue. Color squares at the bottom of each vertical bar represent mC contexts. TE, transposable element; ncRNA, non-coding RNA; miRNA, microRNA. (b) Between-tissue comparison of methylated genes for each gene class. The horizontal color bar at the bottom of each vertical bar indicates the mC context, as shown in (a). (c) Percentage of differentially methylated genes for each class (ncRNA and miRNA combined) and sequence context. +/- for increased/decreased methylation between meristems and early flowers. (d-f) Normalized methylation levels at different stages (M, E, and L) for protein-coding genes (dark green, green, and tan), pseudogenes (purple, red, and orange) and transposable element genes (dark cyan, blue, and turquoise), separately. TSS, transcription start site; TES, transcription termination site.

Figure 3. Methylation at different genic regions differentially associated with gene expression.

(a-c) Comparison of gene expression and methylation levels for mCG, mCHG, and mCHH sites, and for each genic region: upstream 1 kb regions (Up1k), exons, introns, and downstream 1 kb regions (Down1k). Low and high methylation, the one third of genes with the lowest and highest methylation levels in each genic region, respectively. L and H at the bottom of the bars indicate the one third of genes with the lowest or highest expression levels. Gene percentages were calculated separately for each gene group: L or H expression groups with low (blue) or high (orange) methylation levels. (d-f) Comparison of normalized methylation levels (RKCM) between low and high expression genes for mCG, mCHG, or mCHH sites in each genic region, separately. Mann-Whitney U test: *, p>0.05; **, p>0.001; ***, p<0.001.
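The grouping used in panels (a-c) can be sketched as follows. This is a minimal illustration that assumes the percentages are taken within each expression group (L or H); the function names and the dictionary-based input format are hypothetical and not taken from the article.

```python
def tertiles(values):
    """Return (lowest_third, highest_third) of the keys in a {gene: value} dict."""
    ranked = sorted(values, key=values.get)
    k = len(ranked) // 3
    if k == 0:
        return set(), set()
    return set(ranked[:k]), set(ranked[-k:])

def methylation_share(expression, methylation):
    """Percentage of low- and high-methylation genes within each expression group.

    expression, methylation: {gene: value} dicts over the same genes
    (hypothetical input format). Using the expression group size as the
    denominator is an assumption made for this sketch.
    """
    low_expr, high_expr = tertiles(expression)
    low_meth, high_meth = tertiles(methylation)
    shares = {}
    for label, group in (("L", low_expr), ("H", high_expr)):
        n = len(group)
        shares[label] = {
            "low_meth": 100.0 * len(group & low_meth) / n if n else 0.0,
            "high_meth": 100.0 * len(group & high_meth) / n if n else 0.0,
        }
    return shares

# Toy data in which methylation falls as expression rises.
toy_expression = {"g1": 1, "g2": 2, "g3": 3, "g4": 4, "g5": 5, "g6": 6}
toy_methylation = {"g1": 6, "g2": 5, "g3": 4, "g4": 3, "g5": 2, "g6": 1}
toy_shares = methylation_share(toy_expression, toy_methylation)
```

In the toy data every low-expression gene falls in the high-methylation third, which is the kind of pattern the bar charts are designed to expose for a given genic region and sequence context.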


Figure 4. Genes with DNA methylation variations during Arabidopsis floral development.

(a) Comparison of genes differentially methylated at one or more sequence contexts and differentially expressed between meristem and early flower. ‘Gene Body’ and ‘Promoter’ represent the transcribed region and the 1kb upstream region of genes, respectively; ‘Transcription’ represents differentially expressed genes. (b) Comparison of differentially expressed genes according to methylation variations among different sequence contexts. Numbers outside each circle indicate the total gene counts in the respective group. (c) Percentage of differentially methylated genes among each group of differentially expressed genes during floral development. ME, meristem to early flower; ML, meristem to late flower; EL, early to late flower. +, up-regulation; -, down-regulation. Grey lines divide each bar into a bottom part representing genes with increased methylation, and a top part for genes with decreased methylation. Chi-square test: *, P<0.05; **, P<1e-3.

Figure 5. Representative Arabidopsis genes with variations in DNA methylation and floral expression.

Tracks of MspJI-seq and RNA-seq reads are shown for each gene, encompassing the transcribed region as well as the upstream and downstream 1kb regions. Gene structures are shown at the bottom of each graph, with blue boxes representing exons and arrows indicating introns and the transcription direction of the respective gene. In some cases, parts of the exon-intron structure for adjacent genes are shown; they do not span the central regions and are not connected with the centrally located gene models. M, meristem; E, early flower; L, late flower.


Figure 6. The methylation pattern and functional implications for differentially methylated and expressed genes.

(a) Clustering of genes with concurrent variations in expression and methylation by similar methylation patterns during floral development. The color gradient represents the log2-transformed RKCM values, as indicated by the color bar atop the heatmaps. (b) Summarized methylation patterns for each cluster. M, meristem; E, early flower; L, late flower. (c) Enriched biological processes among each gene cluster. Gene ontology (GO) terms were grouped into more general biological processes, as shown on the right side (see also Table S6). Shading colors represent the number of genes annotated to each GO term in each gene cluster, as illustrated by the color bar on top.
