Whole-genome DNA methylation patterns and complex associations with gene structure and expression during flower development in Arabidopsis
Hongxing Yang1,2,†, Fang Chang1,*,†, Chenjiang You1,3, Jie Cui1, Genfeng Zhu1, Lei Wang1,3, Yu Zheng4, Ji Qi1,* and Hong Ma1,3,*
1State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Institute of Plant Biology, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai 200433, China,
2Shanghai Chenshan Plant Science Research Center, Shanghai Institutes for Biological Sciences, Shanghai Chenshan Botanical Garden, Chinese Academy of Sciences, Shanghai 201602, China,
3Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China, and
4New England Biolabs Inc., 240 County Road, Ipswich, MA 01938, USA
Received 24 September 2014; revised 3 November 2014; accepted 6 November 2014.
*For correspondence (e-mails [email protected], [email protected] and [email protected]).
†These authors contributed equally to this work.
SUMMARY
Flower development is a complex process requiring proper spatiotemporal expression of numerous genes.
Accumulating evidence indicates that epigenetic mechanisms, including DNA methylation, play essential
roles in modulating gene expression. However, few studies have examined the relationship between DNA
methylation and floral gene expression on a genomic scale. Here we present detailed analyses of DNA meth-
ylomes at single-base resolution for three Arabidopsis floral periods: meristems, early flowers and late flow-
ers. We detected 1.5 million methylcytosines, and estimated the methylation levels for 24 035 genes. We
found that many cytosine sites were methylated de novo from the meristem to the early flower stage, and
many sites were demethylated from early to late flowers. A comparison of the transcriptome data of the
same three periods revealed that the methylation and demethylation processes were correlated with
expression changes of >3000 genes, many of which are important for normal flower development. We also
found different methylation patterns for three sequence contexts (mCG, mCHG and mCHH) and in different
genic regions, potentially with different roles in gene expression.
Keywords: cytosine methylation, Arabidopsis DNA methylome, MspJI, RNA-seq, flower development, gene
expression.
INTRODUCTION
Flowers are angiosperm reproductive structures, and
develop from the floral meristem. The cells derived from the
floral meristem undergo division and differentiation to form
four types of flower organs, including the reproductive
organs, stamens and pistil. Meiosis generates haploid
spores that then develop into pollen grains and embryo sacs
for fertilization and seed production. Flower development
requires the normal function of receptor-like protein kinases
and ligands, transcription factors, enzymes, and other mole-
cules (Ma, 2005; Ge et al., 2010; Chang et al., 2011). Epige-
netic mechanisms play essential roles by modulating the
expression of numerous genes through histone modifica-
tion, chromatin remodeling, microRNA-mediated mRNA
degradation and DNA methylation. Studies over the past
decade have revealed that these epigenetic pathways are
important for normal gene expression, while DNA methyla-
tion is also known to be involved in genome stability (Chan
et al., 2005; Law and Jacobsen, 2010; Gan et al., 2013).
© 2014 The Authors. The Plant Journal © 2014 John Wiley & Sons Ltd
The Plant Journal (2014) doi: 10.1111/tpj.12726
Although conserved across eukaryotes, DNA methylation
in plants has several unique features with regard to
the pattern of methylation, the methylation machinery, and
demethylation enzymes in non-dividing cells (Chan et al.,
2005). For example, mammalian DNA methylation occurs
mostly at CG sites and is catalyzed by the DNA methyltransferase
DNMT1 and its homologs. In contrast, plant DNA methylation occurs
at CG, CHG and CHH sites, where H = A, T or C; in Arabidopsis
thaliana, methylation at these three types of sites is
performed by DNA METHYLTRANSFERASE 1 (MET1), the
plant-specific CHROMOMETHYLASE 3 (CMT3), and
DOMAINS REARRANGED METHYLTRANSFERASEs
(DRMs), respectively (Chan et al., 2005). Each of the three
types of DNA methylation is crucial for development and
responses to environmental stresses (Bird, 2002; Chan
et al., 2005; Goll and Bestor, 2005; He et al., 2011; Jullien
et al., 2012; Song et al., 2013).
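The three sequence contexts defined above (CG, CHG and CHH, with H = A, T or C) can be illustrated with a minimal sketch. The function name `classify_context` and the forward-strand-only convention are assumptions made for illustration; this is not the authors' pipeline.

```python
# Illustrative sketch: classify the sequence context of a forward-strand cytosine.
# H stands for A, C or T, as defined in the text.
H_BASES = set("ACT")

def classify_context(seq, pos):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at seq[pos], else None."""
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    downstream = seq[pos + 1 : pos + 3]  # up to two bases 3' of the cytosine
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) == 2 and downstream[0] in H_BASES:
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in H_BASES:
            return "CHH"
    return None  # too close to the sequence end, or ambiguous bases
```

For a reverse-strand cytosine, the same function could be applied to the reverse complement of the sequence.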
Consistent with the role of DNA methylation in genome
stability, a substantial proportion of the methylated cyto-
sines in Arabidopsis are found in genomic regions compris-
ing repetitive sequences such as transposable elements
(TEs) (Arabidopsis Genome Initiative, 2000; Martienssen
and Colot, 2001; Lippman et al., 2004; Chan et al., 2005).
Exogenous repetitive sequences such as transgenes may
also be methylated and induce methylation and consequent
transcriptional silencing of homologous sequences in trans-
genic lines (Mette et al., 2000; Soppe et al., 2000; Zilberman
et al., 2004; Chan et al., 2005). Case studies and genome-
wide analysis in Arabidopsis indicated that DNA methyla-
tion in promoter regions is often associated with transcrip-
tional gene silencing (Park et al., 1996; Stam et al., 1998;
Jones et al., 1999; Zhang et al., 2006; Zilberman et al.,
2007). Although genome-wide DNA methylation has been
investigated for vegetative tissues, little is known about the
patterns of DNA methylation and their association with
gene expression at various stages in plant development.
More specifically, a relationship between DNA methylation
and gene expression during flower development has not
been reported.
Whole-genome bisulfite sequencing as a gold standard
method has been applied in many studies of DNA methylo-
mes (Feng et al., 2010; Zemach et al., 2010), including Ara-
bidopsis (Cokus et al., 2008; Lister et al., 2008) and tomato
(Solanum lycopersicum) (Zhong et al., 2013). It requires
deep sequencing coverage for confident DNA methylation
calling, and thus is not cost-efficient. The recently identified
methylation-dependent endonuclease MspJI has both low
specificity in recognition sites and fixed cut distances: it rec-
ognizes 5-methylcytosine or 5-hydroxymethylcytosine in
the context of CNN(G/A), and cleaves both strands at fixed
distances (N12/N16–17) away from the modified cytosine on
the 3′ side; these properties not only increase the number
of detectable methylcytosines (mCs), but also enable
identification of mCs at single-base resolution (Zheng et al.,
2010; Cohen-Karni et al., 2011; Horton et al., 2012; Huang
et al., 2013). In Arabidopsis, 48.8% of all cytosines and guanines
are part of CNNR/YNNG sites, and 90.2% of the
methylated sites among these were detected by methylation-dependent
MspJI DNA digestion combined with high-throughput
sequencing (MspJI-seq).
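To make the CNNR/YNNG notation concrete, the following sketch lists the cytosines of a sequence that sit in an MspJI-recognizable context on either strand. A CNNR site on the reverse strand reads as YNNG on the forward strand, so a forward-strand G preceded three bases upstream by a pyrimidine marks a reverse-strand cytosine. The function name `cnnr_cytosines` and the 0-based coordinates are illustrative assumptions.

```python
# Illustrative sketch: find cytosines in a CNNR context (R = A or G) on both strands.
def cnnr_cytosines(seq):
    """Return a list of (position, strand) for cytosines in a CNNR context."""
    seq = seq.upper()
    hits = []
    for i, base in enumerate(seq):
        # Forward strand: C at i with a purine three bases downstream (C-N-N-R).
        if base == "C" and i + 3 < len(seq) and seq[i + 3] in "AG":
            hits.append((i, "+"))
        # Reverse strand: G at i pairs with a C; CNNR there appears as Y-N-N-G here.
        if base == "G" and i - 3 >= 0 and seq[i - 3] in "CT":
            hits.append((i, "-"))
    return hits
```

Counting these hits over a genome and dividing by the total number of C and G bases would give the kind of coverage fraction quoted above.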
Here, we present detailed analyses of DNA methylomes
during Arabidopsis flower development, using MspJI-seq.
We analyzed three periods during flower development -
the meristem, early flower development (organogenesis),
and late flower development (maturation) - and detected
many more methylated cytosines in the second period
than either the first or third periods, suggesting de novo
methylation at many sites during organogenesis, followed
by demethylation at many sites. These developmental
stage-dependent methylation and demethylation activities
are correlated with changes in the expression levels of
over 3000 genes, including many genes that are important
for flower development. Moreover, the methylation pat-
terns and the potential influences on transcription vary sig-
nificantly across sequence contexts and genic regions. Our
study provides valuable insights into the possible func-
tions of DNA methylation during floral development.
RESULTS
The DNA methylation landscape during Arabidopsis floral
development
To survey the DNA methylation and gene expression dur-
ing flower development, we sampled wild-type early flow-
ers of stages 1–9 (E), wild-type late flowers of stages 10–12
(L), and meristems (M) from ap1 cal double mutant plants,
as this mutant is arrested at the inflorescence meristem
stage (Bowman et al., 1993; Ferrandiz et al., 2000), and has
been used previously as a source of meristems for tran-
scriptomic analyses (Gomez-Mena et al., 2005; Wellmer
et al., 2006; Kaufmann et al., 2010). Using high-throughput
sequencing of DNA fragments obtained by MspJI digestion
(Table S1), we identified 1 565 127 cytosines that were
potentially methylated during flower development. We ran-
domly selected nine regions of eight genes that contained
identified mC sites to perform real-time PCR-based valida-
tion, and the results confirmed the methylation status
detected by MspJI-seq (Figure S1 and Table S2). To further
validate the technical reproducibility of our MspJI-seq data,
we obtained methylation data for 44 randomly selected
genes, including the same eight genes mentioned above,
and 39 randomly selected genomic regions, by performing
bisulfite sequencing experiments on the same tissues; we
observed good consistency between MspJI-seq and bisul-
fite sequencing results (Figure S2).
Among the mC sites we identified, 453 066 were in the
CG dinucleotide context, including 207 115 that were previ-
ously detected in Arabidopsis seedlings (Cokus et al.,
2008). In addition, we detected 425 428 mCHG and 685 100 mCHH sites (Figure 1a); these mC sites accounted for
17.5%, 13.7% and 5.2%, respectively, of the three types of
cytosine sites (in the CNNR context) that are potentially
digested and detected by MspJI-seq. Among detected
methylation sites, approximately 73% (1 141 758) were in
exons, 3% (46 208) were within introns, 8% (125 325) were
in putative promoter regions (1 kb region upstream of tran-
scription start sites), and 16% (250 303) were in intergenic
regions. Interestingly, we found a slightly higher percent-
age of mCHH sites (17%) in intergenic regions, compared
with mCG (15.2%) and mCHG (15.0%) sites (P < 1e-60, χ²
test) (Figure 1b), indicating that symmetric and non-sym-
metric methylation sites were not evenly distributed
between genic and intergenic genomic sequences.
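The χ² comparison of site distributions can be illustrated with a small sketch. The function below computes the Pearson chi-square statistic and its degrees of freedom for a contingency table (for example, rows for sequence contexts and columns for intergenic versus genic sites). The function name is hypothetical and the software the authors actually used is not stated in this section.

```python
# Illustrative sketch: Pearson chi-square statistic for an r x c contingency table.
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a list of equal-length rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

The P value would then be obtained from the chi-square distribution with the returned degrees of freedom, for instance with a statistics library.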
To obtain an overview of the detected DNA methylation,
we compared the methylation levels of each 100 kb win-
dow throughout the genome in various sequence contexts
(mCG, mCHG, mCHH) and genic regions (exon, intron, pro-
moter or intergenic). First we normalized the methylation
level as reads per kilobase of cytosines of CNNR sites
(each site counts as 1 bp) per million mapped reads
(RKCM) (see Experimental procedures). We found that heterochromatic
regions (centromeres and peri-centromeric
zones) had relatively high methylation levels for all three
sequence contexts (Figure 1c), and the methylation
status was highly stable between the three floral develop-
mental periods, consistent with the high frequency of TEs
in these regions (Figure 1d and Figure S3). Hence,
methylation at the heterochromatic regions is least corre-
lated with development, reminiscent of the discovery that
DNA methylation around centromere and peri-centromeric
regions varies least among Arabidopsis populations (Sch-
mitz et al., 2013). In contrast, methylation levels in euchro-
matin were relatively low but more variable between the
various types of mC site: methylation at mCG sites was
higher than at mCHG and mCHH sites, consistent with previous
findings (Cokus et al., 2008; Lister et al., 2008), and
lower percentages of mCG sites than of mCHG and mCHH sites were stage-specific; furthermore, exons were most
highly methylated, particularly for mCG sites.
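The RKCM normalization described above (reads per kilobase of CNNR cytosines per million mapped reads) can be sketched as follows, by analogy with RPKM. The exact definition is given in the Experimental procedures, which are not part of this section, so the formula and the function name `rkcm` should be read as an assumption based on the wording above.

```python
# Illustrative sketch of an RKCM-style normalization (assumed by analogy with RPKM).
def rkcm(reads, cnnr_sites, total_mapped_reads):
    """Reads per kilobase of CNNR cytosines per million mapped reads.

    reads: reads supporting methylation in the region
    cnnr_sites: number of CNNR cytosines in the region (each counts as 1 bp)
    total_mapped_reads: mapped reads in the whole library
    """
    if cnnr_sites == 0 or total_mapped_reads == 0:
        return 0.0
    return reads / (cnnr_sites / 1_000) / (total_mapped_reads / 1_000_000)
```

For example, 50 reads over 2000 CNNR cytosines in a library of 10 million mapped reads gives 50 / 2 / 10 = 2.5.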
Figure 1. The DNA methylome of Arabidopsis flowers.
(a) Percentage of identified mCs for each sequence context. Numbers on the top of each bar indicate the number of methylcytosines.
(b) Percentages of mCs found in each genomic region.
(c) Log2-transformed methylation levels (RKCM values; scale shown on top right) for 100 kb sliding windows (step size 50 kb) for each sequence context, floral stage and genomic region. M, meristem; E, early flower; L, late flower. Genomic coordinates on concatenated chromosomes are indicated at the bottom of the heatmaps.
(d) Proportions of floral stage-specific as well as non-specific mCs across the genome (200 kb sliding window; step size, 100 kb). ME, ML, EL, methylcytosines shared between two tissues; MEL, methylcytosines common to three tissues.
(e–g) Comparison of mCs between developmental stages for mCG, mCHG and mCHH sites, respectively.
Differences in methylated sites between developmental
stages
The number of mC sites increased by 8% from meristems
(1 000 123) to early flowers (1 080 179), then decreased
slightly in late flowers (1 074 245). The increase from meristems
to early flowers was observed for all sequence contexts
(mCG, 6.4%; mCHG, 7.2%; mCHH, 9.8%). The number of mCs detected in early flowers but
not in meristems was 96 708 for mCG sites, 95 240 for
mCHG sites, and 178 087 for mCHH sites, significantly outnumbering
mCs found in floral meristems but not in early
flowers (77 052 for mCG sites, 74 861 for mCHG sites, and
138 066 for mCHH sites) (Figure 1e–g), indicating extensive
new methylation during early flower development. The
newly methylated sites in early flowers are associated with
a large number of genes, of which 6570 genes contained mCG sites, 4570 contained mCHG sites,
and 5602 contained mCHH sites (results supported by five reads or more). From
early to late flowers, only the number of mCG sites was
slightly increased (by 2.1%), whereas the numbers of mCHG
and mCHH sites were both slightly decreased by 1.7% (Fig-
ure 1e–g).
The proportion of tissue-specific mC sites also appeared
to vary across sequence contexts. In particular, 144 299 mCG sites (31.8%),
147 328 mCHG sites (34.6%), and
301 267 mCHH sites (44.0%) were specific to one of the
three floral tissues. On the other hand, 32.7% of mCHH sites
were methylated in all three tissues, compared with 43.4%
and 46.7% of mCHG and mCG sites, respectively.
Therefore, the number of mCHH sites varied more between tissues
than the number of mCHG sites, which in turn varied slightly more than
the number of mCG sites (Figure 1e–g).
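The distinction between tissue-specific and shared sites amounts to partitioning three sets of site coordinates. The sketch below shows one way to compute the seven categories used in Figure 1 (M, E and L only; ME, ML and EL; MEL); the function name and the representation of sites as hashable identifiers are assumptions for illustration.

```python
# Illustrative sketch: partition methylated sites by the tissues in which they occur.
def partition_sites(meristem, early, late):
    """Return a dict of disjoint site sets keyed by tissue combination."""
    m, e, l = set(meristem), set(early), set(late)
    return {
        "M_only": m - e - l,
        "E_only": e - m - l,
        "L_only": l - m - e,
        "ME": (m & e) - l,   # shared by meristem and early flower only
        "ML": (m & l) - e,
        "EL": (e & l) - m,
        "MEL": m & e & l,    # methylated in all three tissues
    }

def tissue_specific_fraction(parts):
    """Fraction of all detected sites that occur in exactly one tissue."""
    total = sum(len(sites) for sites in parts.values())
    specific = len(parts["M_only"]) + len(parts["E_only"]) + len(parts["L_only"])
    return specific / total if total else 0.0
```

Applied separately to mCG, mCHG and mCHH sites, the second function would yield percentages of the kind reported above.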
Distinct gene methylation patterns during floral
development
To investigate the relationship between gene expression
and DNA methylation during floral development, we deter-
mined the positions of mC sites relative to individual genes.
In total, we found 24 035 genes with at least one mC site
supported by five reads or more (Table S3). With regard to
the three sequence contexts, we found 20 569, 17 409 and
18 746 genes containing at least one mCG, mCHG and mCHH
site, respectively. The overall methylation at gene level in
Arabidopsis flowers was compared with recently published
DNA methylome data (Schmitz et al., 2013) from mixed
stages of Arabidopsis flowers, for which >20 000 genes
were found to have at least one mC site. Over 83% of these
genes were also found in our list of methylated genes, rep-
resenting a high level of consistency even though different
ecotypes were used in these two studies (Col-0 for Schmitz
et al. and Landsberg erecta for this analysis).
We then divided genes into classes according to their
coding potential, namely protein-coding, microRNA (miR-
NA), other non-coding RNA, pseudogenes and TE genes,
and observed distinct methylation patterns for the various
classes of genes. Approximately 70% of annotated protein-
coding genes (19 182 of 27 416) were methylated in one or
more of the floral tissues, forming the largest group of
methylated genes (79.81%) (Figure S4). Only 15% of these
protein-coding genes were methylated specifically in one
of the three tissues (meristem, 1518, approximately 7.9%;
early flower, 667, approximately 3.5%; late flower, 715,
approximately 3.7%) (Figure 2a,b). Furthermore, protein-
coding genes were more likely to be methylated at mCG
sites than at mCHG and mCHH sites (mCG, 16 187, approximately
59%; mCHG, 13 060, approximately 48%; mCHH,
14 438, approximately 53%; P < 1e-100, χ² test).
Consistent with the role of DNA methylation in silencing
TEs (Zhang et al., 2006; Cokus et al., 2008; Law and Jacob-
sen, 2010), almost 92% of annotated TE genes (3588 of
3903) were methylated in one or more floral tissues, with
only approximately 2.4% (93) being specific to one of the
tissues (Figure 2a,b). We found no significant differences
in methylation at the different mC sequence contexts for TE
genes (P > 0.1 for all comparisons, χ² test). In addition to
TE genes, the Arabidopsis genome is annotated as possessing
31 188 TEs, most of which do not contain genes,
but may impair genome integrity after activation by trans-
posases or reverse transposases (Law and Jacobsen,
2010). We thus examined methylation of these TEs in floral
tissues. In contrast to the broad methylation of TE genes,
we found that only 24% of TEs (7516 of 31 188) were
methylated in meristems. Previous studies revealed strong
associations between TE methylation and actions of siR-
NAs (Lister et al., 2008; Ahmed et al., 2011). To check how
often siRNAs participated in the methylation of TEs in flo-
ral tissues, we obtained siRNA sequencing data for Arabid-
opsis seedlings (Chodavarapu et al., 2010), and found that
approximately 40% of TEs were potential targets of siRNAs
(Figure S5). This relatively low hit ratio may be ascribed to
tissue-specific expression of some siRNA species, the pos-
sibility that a substantial proportion of TEs was not methy-
lated via siRNAs, or the possibility that silencing of TEs
may be primarily achieved via silencing of TE genes.
Genes of different classes also showed differences in the
patterns of methylation variation during floral develop-
ment. Significantly higher proportions of mCG-containing
protein-coding genes (42.1%) than mCHG-containing pro-
tein-coding genes (28.7%) or mCHH-containing protein-cod-
ing genes (23.2%) were differentially methylated between
meristems and early flowers. Similar patterns were
observed for pseudogenes, miRNA and non-coding RNA
genes, although to a lesser extent (data not shown), sug-
gesting similar DNA methylation-related regulatory mecha-
nisms that may reflect a common evolutionary origin and/
or functional constraints.
In contrast, more mCHH-containing TE genes showed
methylation variation than mCG or mCHG TE genes. A closer
look showed that, for TE genes, methylations at mCHH sites
primarily decreased during flower development, while
those at mCG sites were mainly increased (Figure 2c). Con-
cordantly, TEs mainly showed increased methylation,
which primarily occurred at mCG sites and mCHH sites,
although only 2124 TEs (6.8%) were differentially methylat-
ed between meristem and early/late flower development
stages (Figure S5). Taken together, TEs tended to become
hypermethylated as floral development proceeded, possi-
bly because of a greater need to protect the reproductive
cells against the mutagenic activities of TEs (Slotkin et al.,
2009; Yang et al., 2011). Furthermore, CG methylation may
play critical roles in the increased methylation of TEs as well
as TE genes. TEs may be classified into various families that
may have different methylation patterns during floral devel-
opment. Indeed, the LTR/Gypsy family of retrotransposons,
comprising 13.4% of all TEs in Arabidopsis, accounted for
36.0% of methylated TEs, whereas the RC/Helitron type of
DNA transposons, representing 41.5% of all TEs, consti-
tuted only 11.7% of methylated TEs in the meristem (Figure
S5), indicating that retrotransposons are more tightly con-
trolled by methylation than DNA transposons. Furthermore,
some TE families were more likely to be differentially
methylated than others. In brief, differentially methylated
retrotransposons mainly fall into the families of LTR/Copia,
LTR/Gypsy and Line/L1, with little preference observed with
respect to mC context class; differentially methylated DNA
transposons primarily came from the families of hAT and
DNA/others (Figure S6).
Figure 2. Distinct methylation patterns for genes in Arabidopsis flowers.
(a) Percentage of methylated genes for each tissue. The colored squares at the bottom of each vertical bar represent mC contexts. TE, transposable element; ncRNA, non-coding RNA; miRNA, microRNA.
(b) Between-tissue comparison of methylated genes for each gene class. The colored bars at the bottom of each vertical bar indicate mC context, as defined in (a).
(c) Percentage of differentially methylated genes for each class (ncRNA and miRNA combined) and sequence context. The plus and minus symbols indicate increased/decreased methylation, respectively, between meristems and early flowers.
(d–f) Normalized methylation levels at various stages (M, E, and L) for protein-coding genes (shades of green), pseudogenes (purple, red and orange) and transposable element genes (shades of blue). TSS, transcription start site; TES, transcription termination site.
Finally, we examined relative methylation levels (RKCM)
across transcribed regions and the surrounding genomic
regions. We found similar profiles to those previously
reported (Cokus et al., 2008; Feng et al., 2010; Zemach et
al., 2010), and these were also quite similar between floral
stages. For example, the methylation of protein-coding
genes mainly occurred at mCG sites, and was confined
within genic regions. The methylation of TE genes showed
similar patterns across sequence contexts, i.e. primarily
occurred within genic regions, dramatically decreased at
the transcription start site (TSS) and after the transcription
end site (TES), and increased again at adjacent regions fur-
ther upstream of the TSS and downstream of the TES (Fig-
ure 2d–f). Methylations in pseudogenes showed similar
patterns to those for TE genes, with levels intermediate
between TE genes and protein-coding genes, consistent
with previous observations (Cokus et al., 2008). These
results suggested a primary role of MET1 (for mCG) in the
methylation maintenance of protein-coding genes, and
comparable contributions of MET1, DRM1/2 and CMT2/3
(for mCHG and mCHH) to methylation of pseudogenes and
TE genes (Chan et al., 2005; Stroud et al., 2014).
DNA methylation at different genic regions differentially
correlates with gene expression
DNA methylation in promoter regions is often associated
with transcriptional silencing (Zhang et al., 2006), but
recent studies have revealed distinct relationships between
gene expression and methylation in different genic regions
(Brenet et al., 2011). To assess this relationship, we mea-
sured gene expression profiles by RNA sequencing using
the same three tissues used for DNA methylation studies.
Transcripts of 26 764 genes were detected in at least one
floral tissue, with 5768 of them significantly differentially
expressed between the three tissues (see Experimental
procedures). To examine the relationships between expres-
sion and methylation, genes were sorted into three equal-
sized groups according to expression levels or methylation
intensities at a specific genic region (exon, intron, 1 kb
upstream/downstream of the TSS). We found a general
trend of negative associations between expression levels
and methylation levels for all mC contexts and gene
regions, especially gene body regions (Figure 3a–c).
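The grouping described above, in which genes are sorted into three equal-sized groups by expression or by methylation and the groups are then compared, can be sketched as follows. The function names, the 'low'/'mid'/'high' labels and the dictionary inputs are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch: split genes into three equal-sized rank groups and compare them.
def tertile_labels(values):
    """Map each gene to 'low', 'mid' or 'high' according to its rank."""
    ranked = sorted(values, key=values.get)  # genes in ascending order of value
    n = len(ranked)
    labels = {}
    for rank, gene in enumerate(ranked):
        labels[gene] = ("low", "mid", "high")[min(rank * 3 // n, 2)]
    return labels

def fraction_high_methylation(expression, methylation, expression_group):
    """Among genes in one expression group, the fraction in the top methylation third."""
    expr_labels = tertile_labels(expression)
    meth_labels = tertile_labels(methylation)
    genes = [g for g in expr_labels
             if expr_labels[g] == expression_group and g in meth_labels]
    if not genes:
        return 0.0
    return sum(meth_labels[g] == "high" for g in genes) / len(genes)
```

A negative association between expression and methylation would show up as a larger fraction for the 'low' expression group than for the 'high' group.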
We then compared the low- and high-expression genes
with respect to normalized methylation levels in different
genic regions, with exons and introns classified according
to their positions relative to TSS and TES sites. Consistent
with previous findings (Zhang et al., 2006; Zemach et al.,
2010), genes in the high-expression group were relatively
hypomethylated at all mC contexts in the 1 kb upstream and
downstream regions (Figure 3d–f), and first exons of genes
in the high-expression group were methylated at lower lev-
els than first exons of genes in the low-expression group,
for all mC contexts (Brenet et al., 2011; Chuang et al., 2012).
Notably, the methylation level in the first intron showed
even more significant negative correlation with expression
levels. Hence, regions near TSS may closely participate in
methylation-dependent transcriptional silencing. Further-
more, internal exons of genes with high expression tended
to be significantly highly methylated at mCG sites, but
methylated to low levels at mCHG and mCHH sites; internal
introns showed little or no association between expression
and methylation levels (Figure 3d–f). These observations
suggest that the influences of methylation on gene expres-
sion vary depending on genic region and sequence context.
DNA methylation in different sequence contexts is
differentially associated with gene expression levels
during floral development
By comparing raw read numbers and RKCM values for
each gene in each of the three sequence contexts (Figure
S7 and Table S3), we identified 11 880 genes with statisti-
cally significant methylation variations between meristem
and the early flower, indicating that DNA methylation in
the gene body (body methylation) is broadly regulated
across the genome (Figure 4a and Table S4). In addition,
2503 genes showed significant methylation variations in
their putative promoter regions during the same develop-
mental period, consistent with the relatively poor methyla-
tion of promoter regions (promoter methylation) (Zhang
et al., 2006). Further comparison revealed that only 1235
genes were simultaneously differentially methylated in
both body and promoter regions during early flower devel-
opment (Figure 4a), suggesting that body methylation and
promoter methylation are largely regulated separately,
possibly by independent mechanisms.
We identified 3067 genes that showed significant varia-
tions in both methylation and gene expression during floral
development (hereafter termed co-differential genes), with
only 10% (317 of 3067) being differentially methylated
across all three sequence contexts (Figure 4b), suggesting
sequence context-dependent effects of methylation on
transcription. Moreover, among the 3986 genes differen-
tially expressed between meristems and early flowers, 2117
contained mCG sites, and 1048 (49.5%) of them were co-dif-
ferential at mCG sites, compared with 601 of 1744 mCHG-
containing genes (34.5%) and 509 of 1894 mCHH-containing
genes (26.9%) (Figure 4c and Table S5). We also observed
that a significantly higher proportion of transcriptionally
up-regulated genes were co-differential between meristem
and early or late flowers in comparison with genes that
were down-regulated at the same periods; this was consis-
tent across all mC sequence contexts (Figure 4c). Hence,
variation in DNA methylation levels may function as a sig-
nal affecting transcription during early flower development.
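The notion of co-differential genes is a set intersection, and the percentages quoted above follow directly from the reported counts. The sketch below shows both steps; the function names are hypothetical, and only the counts passed to `percent` in the usage note come from the text.

```python
# Illustrative sketch: co-differential genes as the overlap of two gene lists.
def co_differential(differentially_methylated, differentially_expressed):
    """Genes that change in both methylation and expression."""
    return set(differentially_methylated) & set(differentially_expressed)

def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)
```

With the counts reported above, `percent(1048, 2117)`, `percent(601, 1744)` and `percent(509, 1894)` reproduce the 49.5%, 34.5% and 26.9% given for mCG-, mCHG- and mCHH-containing genes.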
Body methylation variations may serve as important
transcriptional regulatory signals during floral
development
Our dataset provides clues to understanding how the varia-
tion in methylation level affects development. Among the
co-differential genes are several genes encoding important
floral developmental regulators, including SEP1, LEUNIG
and SEEDSTICK (Figure 5a–c). SEP1 encodes a MADS box
protein that is important for determining floral organ iden-
tity (Pelaz et al., 2000), and was found to have dramatically
increased methylation and transcription levels during early
flower development. LEUNIG encodes an important repres-
sor of AG, whose product is required for the identity of sta-
mens and carpels, and for meristem determinacy
(Mizukami and Ma, 1992; Conner and Liu, 2000; Sridhar
et al., 2004). We also identified co-differential TE genes,
including AT1G64270, which encodes the transposase for
Mutator-like DNA transposons. We found no methylation
in the putative promoter of AT1G64270, and its transcribed
region was dramatically demethylated during early
flower development, which may have resulted in its transcriptional
suppression (Figure 5d). Hence, gene region
demethylation may also function to silence TEs whose
mutagenic activities may destabilize the genome during
reproduction.
Figure 3. Methylation at various genic regions differentially associated with gene expression.
(a–c) Comparison of gene expression and methylation levels for mCG, mCHG and mCHH sites, and for each genic region: upstream 1 kb regions (Up1k), exons, introns, and downstream 1 kb regions (Down1k). Low and high methylation represent the one-third of genes with the lowest and highest methylation levels at each genic region, respectively. L and H at the bottom of the bars indicate the third of genes with the lowest or highest expression levels. Gene percentages were calculated separately for each L or H expression group.
(d–f) Comparison of normalized methylation levels (RKCM) between low- and high-expression genes for mCG, mCHG or mCHH sites in each genic region. Asterisks indicate the statistical significance of the indicated differences (*P > 0.05; **P > 0.001; ***P < 0.001; Mann–Whitney U test).
Interestingly, we found that the increase in methylation
at the 3′ exons and demethylation at the 5′ part of the gene
DEMETER-LIKE 1 (DML1) correlated with its transcriptional
reduction (RNA-seq RPKM values: M, 46.3; E, 21.8; L, 21.2)
(Figure 5e). DML1 may function as a transcriptional repressor
and demethylase, consistent with the genome-wide
up-regulation of important floral development regulators
and the massive de novo methylation during early flower
development (Gong et al., 2002; Agius et al., 2006).
Methylated genes exhibited distinct patterns among
sequence contexts/stages, and were enriched for diverse
biological processes
To discover any shared methylation patterns among the
3067 methylated genes, we performed consensus cluster-
ing on the normalized methylation levels for the three mC
sequence contexts in the three tissues (Figure 6a), resulting
in 11 clusters of distinct methylation patterns. Genes in
different clusters showed significantly different mean
methylation levels and patterns across floral development
stages (Figure 6b). For example, genes in cluster I were
highly methylated at mCG, mCHG and mCHH sites, with
higher levels in the early and late flower tissues than in the
flower meristem, but those in cluster II showed relatively
high methylation for mCG, intermediate methylation for mCHH, and low methylation for mCHG, at all developmental
stages (Figure 6b). Clusters III–VI showed similar methyla-
tion levels for two of the three sequence contexts. Cluster
VII had higher methylation levels in the meristems than
the other two tissues, for all three sequence contexts,
whereas cluster VIII showed a similar developmental pat-
tern for mCG sites only. Clusters IX–XI were similar in that
the methylation levels increase from the meristem stage to
early and late flower development, but differed in
sequence contexts: cluster IX showed similar patterns for
all three sequence contexts, whereas clusters X and XI,
respectively, showed increased methylation levels for mCG
and mCHH sites only (Figure 6b). These results support the
hypothesis that demethylation and de novo methylation by
different methyltransferases during floral development lar-
gely occur independently of each other.
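As a rough illustration of the consensus idea behind this clustering (the analysis itself used the R package clusterCons with PAM, as described in Experimental Procedures), the sketch below computes a pairwise consensus index: the fraction of resampled clustering runs in which two genes, when both were sampled, fell into the same cluster. The function name, gene identifiers and cluster labels are hypothetical, and the runs are toy values rather than data from this study.

```python
from itertools import combinations


def consensus_index(runs):
    """Return {(gene_a, gene_b): fraction of runs that co-cluster the pair}.

    `runs` is a list of dicts, one per resampled clustering run, mapping a
    gene identifier to its cluster label. Genes left out of a subsample are
    simply absent from that run's dict.
    """
    sampled_together = {}
    clustered_together = {}
    for labels in runs:
        for a, b in combinations(sorted(labels), 2):
            sampled_together[(a, b)] = sampled_together.get((a, b), 0) + 1
            if labels[a] == labels[b]:
                clustered_together[(a, b)] = clustered_together.get((a, b), 0) + 1
    return {
        pair: clustered_together.get(pair, 0) / count
        for pair, count in sampled_together.items()
    }


# Toy input: four resampled runs over four hypothetical genes.
runs = [
    {"g1": 0, "g2": 0, "g3": 1},
    {"g1": 1, "g2": 1, "g4": 0},
    {"g1": 0, "g3": 1, "g4": 1},
    {"g2": 0, "g3": 1, "g4": 0},
]
consensus = consensus_index(runs)
for pair in sorted(consensus):
    print(pair, consensus[pair])
```

Pairs with an index near 1 are grouped together consistently across runs, pairs near 0 are consistently separated, and intermediate values flag unstable assignments.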
We next performed gene ontology (GO) enrichment
analysis to explore the associations between methylation
variation and gene functions. The enriched GO categories
for methylated genes suggested possible involvement in
diverse biological processes, including meristem develop-
ment, floral organ development, mitosis and meiosis, as
well as plant body pattern specification (Figure 6c and
Tables S6 and S7). Several genes known to be involved in
floral organ development and reproduction processes were
found among these genes, including MSH7, VIM3, CHR42
and NUA (Figure 5f and Figure S8). Therefore, DNA methylation
reprogramming may contribute significantly to the
regulation of floral development in Arabidopsis.
DISCUSSION
We used MspJI-seq to survey the genome-wide DNA meth-
ylation during Arabidopsis flower development, and
uncovered a large number of potential de novo methyla-
tion sites in early flower development and demethylation
sites in late flower development, supporting the idea that
extensive de novo methylation as well as demethylation
occur during flower development. Different methylation
patterns for the three sequence contexts (mCG, mCHG and mCHH) and in different genic regions potentially have
different effects on gene expression. The whole-genome DNA
methylation and gene expression patterns and derived
hypotheses of their interactions reveal more complex rela-
tionships than expected. Future functional studies are
required to elucidate the mechanisms of control of DNA
methylation and gene expression, and to understand the
biological functions and mechanisms for regulating floral
genes.
The ap1 cal double mutant has been widely used to
obtain a relatively large amount of meristem tissue (Wellmer
et al., 2006; Kaufmann et al., 2010; Wuest et al., 2012),
Figure 4. Genes with DNA methylation variations during Arabidopsis floral
development.
(a) Comparison of genes differentially methylated at one or more sequence
contexts and differentially expressed between meristem and early flower.
‘Gene Body’ and ‘Promoter’ represent the transcribed region and the 1 kb
upstream region of genes, respectively; ‘Transcription’ represents genes
that are differentially expressed.
(b) Comparison of differentially expressed genes according to methylation
variations among sequence contexts.
In (a) and (b), the numbers outside each circle indicate the total gene counts
in the respective group.
(c) Percentage of differentially methylated genes among each group of dif-
ferentially expressed genes during floral development. ME, meristem to
early flower; ML, meristem to late flower; EL, early to late flower. Plus sym-
bol, up-regulation; minus symbol, down-regulation. The gray lines divide
each bar into a bottom part representing genes with increased methylation,
and a top part representing genes with decreased methylation. Asterisks
indicate statistically significant differences (*P < 0.05; **P < 1e-3; χ² test).
because the mutant is arrested at the inflorescence meri-
stem stage (Bowman et al., 1993; Ferrandiz et al., 2000).
However, we cannot rule out possible effects of the muta-
tions on the DNA methylation and gene expression pat-
terns, and caution is required when interpreting the
differences in results between the ap1 cal meristem and
the early and late flowers. Efficient methods for separating
and collecting wild-type meristems are required for further
studies to confirm the differences relating to the meris-
tems.
Different methylation patterns between mC sequence
contexts and genic regions
The results here allow comprehensive analyses of many
aspects of DNA methylation patterns. The approximately
1.5 million methylation sites from three flower tissues
included 453 066 mCG sites, 425 428 mCHG sites and
685 100 mCHH sites, representing a dramatic increase of
37% for detection of mCHH sites compared with approxi-
mately 500 000 mCHH sites in young flowers as previously
reported (Lister et al., 2008). Previously CHH methylation
has been found to play important roles in various plant
developmental processes in Arabidopsis endosperm,
maize (Zea mays) and cotton fibers (Gossypium hirsutum)
(Hsieh et al., 2009; Gent et al., 2013; Jin et al., 2013). The
relatively high proportion of newly identified mCHH sites
reported here (Figure 1d) and the correlation of >1000 genes with variation in CHH methylation (Figure 4b)
suggest that CHH methylation may be important for floral
gene expression.
In addition, our observations that methylation patterns
and developmental changes in methylation vary depend-
ing on sequence contexts and gene classes suggest com-
plex relationships between sequences/genes, methylation
and developmental gene functions. For example, the num-
ber of mCs increased from the meristems to early flowers
for all sequence contexts, but only mCG sites increased in
number from early to late flowers. The period from meri-
stem to the early flower involves mainly organ identity
specification and organogenesis, whereas the subsequent
period includes much of the floral organ growth, as well as
gametophyte development. Therefore, our results suggest
that gene expression associated with mCG sites may play a
greater role in floral organ growth. The finding that
Figure 5. Representative Arabidopsis genes
with variations in DNA methylation and floral
expression.
Tracks of MspJI-seq and RNA-seq reads are
shown for each gene, encompassing the tran-
scribed region as well as the upstream and
downstream 1 kb regions. Gene structures are
shown at the bottom of each graph, with blue
boxes representing exons and arrows indicat-
ing introns and the transcription direction of
the respective gene. In some cases, parts of the
exon–intron structure for adjacent genes are
shown; they do not span the central regions
and are not connected with the centrally located
gene models. M, meristem; E, early flower; L,
late flower.
protein-coding genes showed preferential enrichment of mCG sites and depletion of mCHG and mCHH sites in the
transcribed regions is consistent with previous results
(Zhang et al., 2006; Cokus et al., 2008; Feng et al., 2010; Ze-
mach et al., 2010). In contrast, TE genes and pseudogenes
showed similar mC frequencies in the various sequence
contexts across transcribed and nearby genomic regions,
with mCs being enriched in transcribed regions and
depleted near the transcription start sites (Figure 2d–f).
Therefore, protein-coding genes and other types of genes
are probably affected by DNA methylation in different
ways. Moreover, only a few genes show simultaneous
changes in methylation levels and expression levels at all
sequence contexts, suggesting that influences on expres-
sion by mCs of different sequence contexts tended to be
unrelated to each other (Figure 4b).
Our findings also support the idea that the relationships
between DNA methylation and gene expression levels are
more complicated than previously thought (Zhang et al.,
2006; Suzuki and Bird, 2008; Ball et al., 2009). Promoter
methylation is often linked with gene expression suppres-
sion, whereas the role of gene body methylation is far
more uncertain, with both positive and negative relation-
ships reported (Zhang et al., 2006; Zilberman et al., 2007;
Li et al., 2008; Zemach et al., 2010). These contradictory
findings may be explained when considering the observa-
tions that DNA methylation interacts with other factors,
including histone modifications and siRNAs, to determine
transcriptional status (Li et al., 2008; Stroud et al., 2014). In
addition, the effect of DNA methylation on expression may
also depend on genic regions or sequence contexts. In
humans, methylation of the first exon was more signifi-
cantly associated with gene silencing than methylation of
the nearby promoter (Brenet et al., 2011). Similarly, our
observation in Arabidopsis of strong negative relationships
between DNA methylation at all sequence contexts in the
first exon and expression (Figure 3) suggests that the rele-
vant mechanisms may be conserved between animals and
plants. The even stronger negative effects of methylation
in the first intron are consistent with the fact that the first
introns of eukaryotic genes often carry regulatory elements
for transcription (Majewski and Ott, 2002; Bradnam and
Korf, 2008; Bieberstein et al., 2012). We also found that
methylation at mCG sites of internal exons tended to be
positively correlated with gene expression, unlike methyla-
tion at the mCHG and mCHH sites of internal exons and all
sequence contexts of internal introns (Figure 3). Therefore,
our separate analyses regarding sequence contexts and
genic regions revealed that the effects of DNA methylation
on gene expression are not only position-dependent (i.e.
Figure 6. Methylation pattern and functional
implications for differentially methylated and
expressed genes.
(a) Clustering of genes with concurrent varia-
tions in expression and methylation on the
basis of similar methylation patterns during flo-
ral development. The color gradient represents
the log2-transformed RKCM values, as indicated
by the color bar at the top.
(b) Summarized methylation patterns for each
cluster. M, meristem; E, early flower; L, late
flower.
(c) Enriched biological processes among each
gene cluster. Gene ontology (GO) terms were
grouped into more general biological pro-
cesses, as shown on the right (see also Table
S6). Colors represent the number of genes for
each GO term in each gene cluster, as indicated
by the color bar at the top.
with respect to genic region), but also sequence context-
dependent, providing further insight into the relationships
between methylation and gene expression.
Implications for functional roles of DNA methylation in
flower development
Changes in genome-wide DNA methylation levels have
been associated with plant development, including global
demethylation in the endosperm of the developing seed
compared with the embryo (Gehring et al., 2009; Hsieh
et al., 2009) and lower methylation in the central cell of the
female gametophyte than in somatic cells (Jullien and Ber-
ger, 2010). Previous studies also suggested that the DNA
methylation level tends to increase during development
from seedling through vegetative stages to floral stages
(Ruiz-Garcia et al., 2005). Our analysis of differential meth-
ylation between three floral tissues revealed that more
sites were methylated de novo than demethylated from
meristem to early flowers, whereas similar numbers of sites
were methylated and demethylated from early to late
flower development (Figure 1e–g). These findings suggest
that plant DNA methylation is under critical control during
development, and proper re-patterning of DNA methyla-
tion is important for the developmental program.
The importance of DNA methylation in development has
been demonstrated by genetic studies of methyltransfer-
ase genes or genes encoding chromatin remodelers,
including MET1 and DDM1, with effects on embryogenesis,
meristem identity and flowering time (Finnegan et al.,
1996; Kakutani et al., 1996; Ronemus et al., 1996; Xiao
et al., 2006). The effects of DNA methylation occur at least
in part via modulating transcription of developmental reg-
ulators (Xiao et al., 2006; Li et al., 2008; Gehring et al.,
2009; Hsieh et al., 2009). In addition, DNA methylation reg-
ulates genes that play critical roles during flower develop-
ment; for example, hypermethylation of SUPERMAN and
AG phenocopies the corresponding mutants (Jacobsen
et al., 2000). Our results showing that DNA methylation is
associated with changes in the expression of over 3000
genes suggest that methylation affects genes with diverse
roles during flower development, such as regulation of
early flower development (33 genes) and pollen develop-
ment (21 genes) (Table S6). In addition, the expression of
regulatory genes is probably affected by methylation,
including genes for transcriptional control (201), chromatin
organization (29) and signal transduction (56).
In particular, changes in DNA methylation were linked to
expression changes for several key regulators. For
instance, the expression of floral regulators SEP1 and
SEEDSTICK was affected by DNA methylation (Figure 5a,
c). An effect of DNA methylation on the C function gene
AG was previously reported: hypermethylated epi-alleles
of AG were found in plants with reduced global methyla-
tion (Jacobsen et al., 2000). Our data suggest that DNA
methylation may indirectly influence the expression of AG
in wild-type plants, similar to the case for the FLC gene
(Finnegan et al., 2005). Decreased methylation in early and
late flowers may result in down-regulation of LEUNIG (Fig-
ure 5b), a negative regulator of AG (Sridhar et al., 2004). In
addition, methylation may also function to activate BLH9
and/or suppress PERIANTHIA (Figure S8a), both genes
encoding transcription factors that are required for proper
floral expression of AG (Bao et al., 2004; Das et al., 2009).
On the other hand, the methylcytosine-binding proteins
VIM1, VIM2 and VIM3, which are involved in hypermethy-
lation of the flowering-time gene FWA and its subsequent
suppression (Woo et al., 2008), were observed to have
high gene expression levels that may also be affected by
methylation (Figure S8b), suggesting deep involvement of
DNA methylation in flowering transition.
Our results also suggest that DNA methylation may
affect other cellular and developmental processes during
reproduction, such as pollen tube growth, mitosis and mei-
osis. For example, GO enrichment analysis revealed over-
representation of 21 genes associated with pollen develop-
ment in cluster VI of methylated genes (Figure 6 and Table
S7), and correlated changes in methylation and expression
between the three floral tissues/stages for genes participat-
ing in meiosis, including AtMSH7, AtMSH4, AtDMC1 and
AtSMC3 (Figure 5f and Table S4), suggesting that DNA
methylation may be involved in the floral development
program as well as embryogenesis and other aspects of
plant development. Our analyses provide insights into the
possible roles of DNA methylation in gene expression, and
important resources for further investigation of the genetic
pathway that regulates flower development.
EXPERIMENTAL PROCEDURES
Plant growth, tissues collection, and DNA isolation
Plants of the Arabidopsis thaliana Landsberg ecotype that were homozygous for the erecta mutation (Ler) and ap1 cal mutant plants (also in the Ler background) were grown in soil in a plant growth room at 22°C under 16 h light/8 h dark cycles. The meristems of the ap1 cal mutant plants, wild-type early flowers (stages 1–9) and late flowers (stages 10–12) were collected separately (Figure S9). The materials collected for each sample were taken from many individuals and in large amounts (2 g) to control the effect of biological variations. Genomic DNA was extracted using a DNeasy plant mini kit (Qiagen, http://www.qiagen.com), precipitated using two volumes of EtOH with a one-tenth volume of 3 M NaAc (pH 5.2), and resuspended in Tris/EDTA buffer (pH 8.0) to a final concentration of 1 μg μl⁻¹.
MspJI digestion and DNA recovery
A 5 μg aliquot of each DNA sample was added to 90 μl of well-mixed NE Buffer 4 (New England Biolabs, https://www.neb.com), BSA and nuclease-free digestion mix before addition of 20 units of MspJI enzyme (New England Biolabs, https://www.neb.com) for effective digestion without obvious DNA degradation (Figure S10a), and incubated at 37°C for 16 h.
After digestion, the DNAs were separated in a 20% polyacrylamide gel (acr/bis: 29:1; 50 mA, 2.5 h). The polyacrylamide gel pieces containing DNA of approximately 32 bp (Figure S9a) were excised from the gel, crushed and transferred into sterile microfuge tubes containing 300 μl of buffer (0.3 M sodium acetate, pH 7.5, with 0.1 mM EDTA), and shaken overnight at 37°C. Then the gel was pelleted by centrifugation for 2 min at 14 000 g at room temperature, and the supernatant was collected. DNA was precipitated by adding 2 μl glycogen and two volumes of 100% ethanol to each tube, keeping the tubes at −80°C for 30 min, followed by centrifugation for 30 min at 14 000 g at 4°C. The DNA pellets were washed with 70% ethanol twice (for 2 min at 14 000 g at room temperature) before resuspension in Tris/EDTA buffer (pH 8.0).
Construction of a DNA methylation fragment library for
sequencing
The recovered DNA samples were used to construct sequencing libraries according to the fragment library preparation protocol described in the SOLiD™ system library preparation guide (Life Technologies, http://www.lifetechnologies.com/) with some modifications. Both ends of recovered DNA were repaired using END polishing enzymes 1 and 2 from the SOLiD™ fragment library construction kit, and purified using a QIAquick nucleotide removal kit (Qiagen) and spin columns from a MinElute® reaction clean-up kit (Qiagen). After purification, the P1 and P2 adaptors from the SOLiD™ fragment library construction kit were ligated to the DNA fragments, followed by another purification step using the QIAquick nucleotide removal kit. Recovered DNA libraries were nick-translated and amplified by PCR for ten cycles, then purified using a QIAquick nucleotide removal kit. Purified DNA libraries were subjected to 4% agarose gel electrophoresis, and bands of approximately 100 bp (Figure S10b) were recovered using a QIAquick gel extraction kit (Qiagen). The resulting libraries were analyzed using an Agilent bioanalyzer (http://www.agilent.com), and 0.5 pmol of each DNA library was used to perform emulsion PCR reactions according to the templated bead preparation guide from the SOLiD™ system. The 35 bp sequence reads were obtained using the SOLiD™ 3.0 system, and were subsequently aligned against the Arabidopsis reference genome sequences (TAIR10, http://www.arabidopsis.org) using BIOSCOPE software version 1.3 (Life Technologies) (see Methods S1). The overall mapping rates were approximately 61% (Table S1).
DNA methylome analysis
We developed an in-house program to identify SOLiD reads supporting methylcytosines. In brief, six cases of paired sequence patterns were inferred based on the MspJI digestion properties, with each case resulting in DNA fragments that could be recovered (Figure S11 and Methods S2). Aligned reads that could be classified into one of the six patterns were identified as methyl reads, each supporting two potential mCs. MspJI cuts double-stranded DNA at fixed distances downstream of the recognized sequence (mCNNR), with the cleavage on the reverse strand wobbling by one nucleotide (16 or 17) (Zheng et al., 2010). Methyl reads were counted separately for each of the two cleavage patterns (N12/N16 or N12/N17), resulting in an N16/N17 ratio of approximately 3:1 (Table S8). The number of reads supporting methylcytosines was calculated for each genomic feature (genes, exons, etc.) and sequence context separately, and normalized as RKCM values. Fisher’s exact test was performed on the read counts of a pair of tissues for each gene, and a gene was defined as differentially methylated when the false discovery rate (calculated by the Benjamini–Hochberg method) was <0.001 and the ratio of RKCM values was >2.
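The differential methylation call described above can be sketched as follows. This is a minimal illustration, not the in-house program: it assumes that the 2 × 2 table for each gene contrasts its methyl reads with all remaining methyl reads in each tissue, and that the RKCM ratio criterion applies in either direction. All function names, gene labels, read counts and RKCM values are hypothetical.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_prob(x):
        return log_choose(row1, x) + log_choose(row2, col1 - x) - log_choose(n, col1)

    observed = log_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(exp(lp) for lp in map(log_prob, range(lo, hi + 1))
                if lp <= observed + 1e-7)
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_differential(genes, total_a, total_b, fdr_cutoff=0.001, ratio_cutoff=2.0):
    """Return genes with FDR < fdr_cutoff and an RKCM ratio > ratio_cutoff."""
    names = list(genes)
    pvalues = [
        fisher_exact_two_sided(genes[g]["reads_a"], total_a - genes[g]["reads_a"],
                               genes[g]["reads_b"], total_b - genes[g]["reads_b"])
        for g in names
    ]
    called = []
    for g, fdr in zip(names, benjamini_hochberg(pvalues)):
        high = max(genes[g]["rkcm_a"], genes[g]["rkcm_b"])
        low = min(genes[g]["rkcm_a"], genes[g]["rkcm_b"])
        ratio = float("inf") if low == 0 else high / low  # assumed: either direction
        if fdr < fdr_cutoff and ratio > ratio_cutoff:
            called.append(g)
    return called


# Hypothetical methyl-read counts and RKCM values for two tissues (a and b).
genes = {
    "geneX": {"reads_a": 5, "reads_b": 60, "rkcm_a": 1.0, "rkcm_b": 12.0},
    "geneY": {"reads_a": 30, "reads_b": 32, "rkcm_a": 6.0, "rkcm_b": 6.4},
    "geneZ": {"reads_a": 400, "reads_b": 600, "rkcm_a": 80.0, "rkcm_b": 120.0},
}
differential = call_differential(genes, total_a=10000, total_b=10000)
print(differential)
```

In this toy input, geneZ differs significantly in read counts but its RKCM ratio is only 1.5, so it is not called; requiring both criteria filters out statistically detectable but small changes in highly covered genes.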
Gene ontology enrichment analysis was performed using the R package topGO (Alexa and Rahnenfuhrer, 2010). The graphs in Figure 5 displaying the DNA methylation and gene transcription profiles were produced using the R package Gviz (Hahne et al., 2013). Consensus clustering on the log2-transformed RKCM values of differentially methylated genes was performed using the R package clusterCons (Simpson et al., 2010) with the algorithm of partitioning (clustering) of the data into ‘k’ clusters ‘around medoids’ (PAM).
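topGO offers Fisher's exact test among its test statistics. A simplified sketch of the one-sided hypergeometric calculation that underlies such an over-representation test is given below. It ignores the GO graph structure that topGO can also take into account and is not the topGO implementation; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, cluster_annotated):
    """P(X >= cluster_annotated) under the hypergeometric distribution.

    population        -- number of genes in the background set
    annotated         -- background genes carrying the GO term
    cluster_size      -- number of genes in the cluster being tested
    cluster_annotated -- cluster genes carrying the GO term
    """
    upper = min(annotated, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(cluster_annotated, upper + 1)
    )
    return tail / comb(population, cluster_size)


# Hypothetical counts: 20 background genes, 5 with the term;
# a cluster of 4 genes contains 3 of them.
p_value = enrichment_p_value(population=20, annotated=5,
                             cluster_size=4, cluster_annotated=3)
print(round(p_value, 4))
```

A small p-value indicates that the term occurs in the cluster more often than expected from its frequency in the background; in a real analysis the p-values for all tested terms would also need multiple-testing control.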
Total RNA isolation and transcriptome analysis
All three tissues were frozen in liquid nitrogen immediately after collection. Total RNAs were extracted using a Plant RNase mini kit (Qiagen). More than 2 μg of total RNA with an absorbance at 260/280 nm between 1.8 and 2.0 from each sample were used to create libraries that were deep-sequenced using the Illumina™ HiSeq 2000 system (Illumina, http://www.illumina.com), to obtain 100 bp paired-end reads. All reads with fewer than two mismatches were mapped to the Arabidopsis genome (TAIR10) by TopHat (Trapnell et al., 2009). Calculation of expression values and differential expression analysis were performed using GFOLD (Feng et al., 2012) with default parameters.
ACKNOWLEDGEMENTS
We thank Zhiyi Sun from New England Biolabs for discus-
sion regarding computational identification of MspJI-
digested methylation sites. We appreciate the data kindly
provided by Steven Jacobsen and Matteo Pellegrini from
University of California, Los Angeles. This work was sup-
ported by the Ministry of Science and Technology of the
People’s Republic of China (MOST 2012CB910503), the
National Natural Science Foundation of China (31130006
and 31371330), and start-up funds from Fudan University
to F.C.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article.
Figure S1. Validation of selected methylcytosine sites by PCR experiments.
Figure S2. Methylation profiles determined by MspJI-seq and bisulfite sequencing were consistent for most of the randomly selected genes and genomic regions.
Figure S3. Density of genes and TEs across Arabidopsis genome.
Figure S4. Percentages of genes of each class among all methylated genes.
Figure S5. Methylation of TEs during Arabidopsis floral development.
Figure S6. Enrichment of TEs of various families in different mC sequence contexts.
Figure S7. Distribution of normalized methylation levels for each mC context for genes of various classes.
Figure S8. Examples of genes with correlated variations in methylation and expression levels.
Figure S9. Phenotypes of the three Arabidopsis floral stages used for experiments.
Figure S10. MspJI digestion and DNA library recovery.
Figure S11. Identification of mCs based on MspJI-seq.
Table S1. Summary of SOLiD reads sequenced and mapped against the Arabidopsis reference genome and reads that were identified as arising from MspJI digestion for each possible recognition site pattern.
Table S2. Primers used in PCR experiments for selected gene regions digested by MspJI.
Table S3. DNA methylation and expression levels for genes in Arabidopsis flowers.
Table S4. Arabidopsis genes that were differentially methylated and differentially expressed during floral development.
Table S5. Statistics for genes that were differentially expressed and methylated between flower meristems and early flowers.
Table S6. Significantly enriched biological processes for each gene cluster in Figure 6.
Table S7. Number of genes for enriched GO terms for each gene cluster in Figure 6.
Table S8. Relative frequencies of the wobbling cut positions of MspJI.
Methods S1. Mapping of SOLiD short sequencing reads.
Methods S2. Identification of methylcytosines based on MspJI-seq.
REFERENCES
Agius, F., Kapoor, A. and Zhu, J.K. (2006) Role of the Arabidopsis DNA glycosylase/lyase ROS1 in active DNA demethylation. Proc. Natl Acad. Sci. USA, 103, 11796–11801.
Ahmed, I., Sarazin, A., Bowler, C. et al. (2011) Genome-wide evidence for local DNA methylation spreading from small RNA-targeted sequences in Arabidopsis. Nucleic Acids Res. 39, 6919–6931.
Alexa, A. and Rahnenfuhrer, J. (2010) topGO: Enrichment analysis for Gene Ontology.
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796–815.
Ball, M.P., Li, J.B., Gao, Y. et al. (2009) Targeted and genome-scale strategies reveal gene-body methylation signatures in human cells. Nat. Biotech. 27, 361–368.
Bao, X., Franks, R.G., Levin, J.Z. et al. (2004) Repression of AGAMOUS by BELLRINGER in floral and inflorescence meristems. Plant Cell, 16, 1478–1489.
Bieberstein, N.I., Carrillo Oesterreich, F., Straube, K. et al. (2012) First exon length controls active chromatin signatures and transcription. Cell Rep. 2, 62–68.
Bird, A. (2002) DNA methylation patterns and epigenetic memory. Genes Dev. 16, 6–21.
Bowman, J.L., Alvarez, J., Weigel, D. et al. (1993) Control of flower development in Arabidopsis thaliana by APETALA1 and interacting genes. Development, 119, 721–743.
Bradnam, K.R. and Korf, I. (2008) Longer first introns are a general property of eukaryotic gene structure. PLoS One, 3, e3093.
Brenet, F., Moh, M., Funk, P. et al. (2011) DNA methylation of the first exon is tightly linked to transcriptional silencing. PLoS One, 6, e14524.
Chan, S.W., Henderson, I.R. and Jacobsen, S.E. (2005) Gardening the genome: DNA methylation in Arabidopsis thaliana. Nat. Rev. Genet. 6, 351–360.
Chang, F., Wang, Y., Wang, S. et al. (2011) Molecular control of microsporogenesis in Arabidopsis. Curr. Opin. Plant Biol. 14, 66–73.
Chodavarapu, R.K., Feng, S., Bernatavichute, Y.V. et al. (2010) Relationship between nucleosome positioning and DNA methylation. Nature, 466, 388–392.
Chuang, T.J., Chen, F.C. and Chen, Y.Z. (2012) Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons. Proc. Natl Acad. Sci. USA, 109, 15841–15846.
Cohen-Karni, D., Xu, D., Apone, L. et al. (2011) The MspJI family of modification-dependent restriction endonucleases for epigenetic studies. Proc. Natl Acad. Sci. USA, 108, 11040–11045.
Cokus, S.J., Feng, S., Zhang, X. et al. (2008) Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature, 452, 215–219.
Conner, J. and Liu, Z. (2000) LEUNIG, a putative transcriptional corepressor that regulates AGAMOUS expression during flower development. Proc. Natl Acad. Sci. USA, 97, 12902–12907.
Das, P., Ito, T., Wellmer, F. et al. (2009) Floral stem cell termination involves the direct regulation of AGAMOUS by PERIANTHIA. Development, 136, 1605–1611.
Feng, S., Cokus, S.J., Zhang, X. et al. (2010) Conservation and divergence of methylation patterning in plants and animals. Proc. Natl Acad. Sci. USA, 107, 8689–8694.
Feng, J., Meyer, C.A., Wang, Q. et al. (2012) GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics, 28, 2782–2788.
Ferrandiz, C., Gu, Q., Martienssen, R. et al. (2000) Redundant regulation of meristem identity and plant architecture by FRUITFULL, APETALA1 and CAULIFLOWER. Development, 127, 725–734.
Finnegan, E.J., Peacock, W.J. and Dennis, E.S. (1996) Reduced DNA methylation in Arabidopsis thaliana results in abnormal plant development. Proc. Natl Acad. Sci. USA, 93, 8449–8454.
Finnegan, E.J., Kovac, K.A., Jaligot, E. et al. (2005) The downregulation of FLOWERING LOCUS C (FLC) expression in plants with low levels of DNA methylation and by vernalization occurs by distinct mechanisms. Plant J. 44, 420–432.
Gan, E.-S., Huang, J. and Ito, T. (2013) Functional roles of histone modification, chromatin remodeling and microRNAs in Arabidopsis flower development. In International Review of Cell and Molecular Biology (Kwang, W.J. ed). Waltham, MA: Academic Press, pp. 115–161.
Ge, X., Chang, F. and Ma, H. (2010) Signaling and transcriptional control of reproductive development in Arabidopsis. Curr. Biol. 20, R988–R997.
Gehring, M., Bubb, K.L. and Henikoff, S. (2009) Extensive demethylation of repetitive elements during seed development underlies gene imprinting. Science, 324, 1447–1451.
Gent, J.I., Ellis, N.A., Guo, L. et al. (2013) CHH islands: de novo DNA methylation in near-gene chromatin regulation in maize. Genome Res. 23, 628–637.
Goll, M.G. and Bestor, T.H. (2005) Eukaryotic cytosine methyltransferases. Annu. Rev. Biochem. 74, 481–514.
Gomez-Mena, C., de Folter, S., Costa, M.M. et al. (2005) Transcriptional program controlled by the floral homeotic gene AGAMOUS during early organogenesis. Development, 132, 429–438.
Gong, Z., Morales-Ruiz, T., Ariza, R.R. et al. (2002) ROS1, a repressor of transcriptional gene silencing in Arabidopsis, encodes a DNA glycosylase/lyase. Cell, 111, 803–814.
Hahne, F., Durinck, S., Ivanek, R., Mueller, A., Lianoglou, S., Tan, G. and Parsons, L. (2013) Gviz: Plotting data and annotation information along genomic coordinates. R package version, 1(8), 4.
He, X.J., Chen, T. and Zhu, J.K. (2011) Regulation and function of DNA methylation in plants and animals. Cell Res. 21, 442–465.
Horton, J.R., Mabuchi, M.Y., Cohen-Karni, D. et al. (2012) Structure and cleavage activity of the tetrameric MspJI DNA modification-dependent restriction endonuclease. Nucleic Acids Res. 40, 9763–9773.
Hsieh, T.F., Ibarra, C.A., Silva, P. et al. (2009) Genome-wide demethylation of Arabidopsis endosperm. Science, 324, 1451–1454.
Huang, X., Lu, H., Wang, J.W. et al. (2013) High-throughput sequencing of methylated cytosine enriched by modification-dependent restriction endonuclease MspJI. BMC Genet. 14, 56.
Jacobsen, S.E., Sakai, H., Finnegan, E.J. et al. (2000) Ectopic hypermethylation of flower-specific genes in Arabidopsis. Curr. Biol. 10, 179–186.
Jin, X., Pang, Y., Jia, F. et al. (2013) A potential role for CHH DNA methylation in cotton fiber growth patterns. PLoS One, 8, e60547.
Jones, L., Hamilton, A.J., Voinnet, O. et al. (1999) RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell, 11, 2291–2301.
Jullien, P.E. and Berger, F. (2010) DNA methylation reprogramming during plant sexual reproduction? Trends Genet. 26, 394–399.
Jullien, P.E., Susaki, D., Yelagandula, R. et al. (2012) DNA methylation dynamics during sexual reproduction in Arabidopsis thaliana. Curr. Biol. 22, 1825–1830.
Kakutani, T., Jeddeloh, J.A., Flowers, S.K. et al. (1996) Developmental abnormalities and epimutations associated with DNA hypomethylation mutations. Proc. Natl Acad. Sci. USA, 93, 12406–12411.
Kaufmann, K., Wellmer, F., Muino, J.M. et al. (2010) Orchestration of floral initiation by APETALA1. Science, 328, 85–89.
Law, J.A. and Jacobsen, S.E. (2010) Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat. Rev. Genet. 11, 204–220.
Li, X., Wang, X., He, K. et al. (2008) High-resolution mapping of epigenetic modifications of the rice genome uncovers interplay between DNA methylation, histone methylation, and gene expression. Plant Cell, 20, 259–276.
Lippman, Z., Gendrel, A.V., Black, M. et al. (2004) Role of transposable elements in heterochromatin and epigenetic control. Nature, 430, 471–476.
Lister, R., O’Malley, R.C., Tonti-Filippini, J. et al. (2008) Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell, 133, 523–536.
Ma, H. (2005) Molecular genetic analyses of microsporogenesis and microgametogenesis in flowering plants. Annu. Rev. Plant Biol. 56, 393–434.
Majewski, J. and Ott, J. (2002) Distribution and characterization of regulatory elements in the human genome. Genome Res. 12, 1827–1836.
Martienssen, R.A. and Colot, V. (2001) DNA methylation and epigenetic inheritance in plants and filamentous fungi. Science, 293, 1070–1074.
Mette, M.F., Aufsatz, W., van der Winden, J. et al. (2000) Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J. 19, 5194–5201.
Mizukami, Y. and Ma, H. (1992) Ectopic expression of the floral homeotic gene AGAMOUS in transgenic Arabidopsis plants alters floral organ identity. Cell, 71, 119–131.
Park, Y.D., Papp, I., Moscone, E.A. et al. (1996) Gene silencing mediated by promoter homology occurs at the level of transcription and results in meiotically heritable alterations in methylation and gene activity. Plant J. 9, 183–194.
Pelaz, S., Ditta, G.S., Baumann, E. et al. (2000) B and C floral organ identity functions require SEPALLATA MADS-box genes. Nature, 405, 200–203.
Ronemus, M.J., Galbiati, M., Ticknor, C. et al. (1996) Demethylation-induced developmental pleiotropy in Arabidopsis. Science, 273, 654–657.
Ruiz-Garcia, L., Cervera, M.T. and Martinez-Zapater, J.M. (2005) DNA methylation increases throughout Arabidopsis development. Planta, 222, 301–306.
Schmitz, R.J., Schultz, M.D., Urich, M.A. et al. (2013) Patterns of population epigenomic diversity. Nature, 495, 193–198.
Simpson, T.I., Armstrong, J.D. and Jarman, A.P. (2010) Merged consensus clustering to assess and improve class discovery with microarray data. BMC Bioinformatics, 11, 590.
Slotkin, R.K., Vaughn, M., Borges, F. et al. (2009) Epigenetic reprogramming and small RNA silencing of transposable elements in pollen. Cell, 136, 461–472.
Song, Y., Ma, K., Ci, D. et al. (2013) Sexual dimorphic floral development in dioecious plants revealed by transcriptome, phytohormone, and DNA methylation analysis in Populus tomentosa. Plant Mol. Biol. 83, 559–576.
Soppe, W.J., Jacobsen, S.E., Alonso-Blanco, C. et al. (2000) The late flowering phenotype of fwa mutants is caused by gain-of-function epigenetic alleles of a homeodomain gene. Mol. Cell, 6, 791–802.
Sridhar, V.V., Surendrarao, A., Gonzalez, D. et al. (2004) Transcriptional repression of target genes by LEUNIG and SEUSS, two interacting regulatory proteins for Arabidopsis flower development. Proc. Natl Acad. Sci. USA, 101, 11494–11499.
Stam, M., Viterbo, A., Mol, J.N. et al. (1998) Position-dependent methylation and transcriptional silencing of transgenes in inverted T-DNA repeats: implications for posttranscriptional silencing of homologous host genes in plants. Mol. Cell. Biol. 18, 6165–6177.
Stroud, H., Do, T., Du, J. et al. (2014) Non-CG methylation patterns shape the epigenetic landscape in Arabidopsis. Nat. Struct. Mol. Biol. 21, 64–72.
Suzuki, M.M. and Bird, A. (2008) DNA methylation landscapes: provocative insights from epigenomics. Nat. Rev. Genet. 9, 465–476.
Trapnell, C., Pachter, L. and Salzberg, S.L. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 25, 1105–1111.
Wellmer, F., Alves-Ferreira, M., Dubois, A. et al. (2006) Genome-wide analysis of gene expression during early Arabidopsis flower development. PLoS Genet. 2, e117.
Woo, H.R., Dittmer, T.A. and Richards, E.J. (2008) Three SRA-domain methylcytosine-binding proteins cooperate to maintain global CpG methylation and epigenetic silencing in Arabidopsis. PLoS Genet. 4, e1000156.
Wuest, S.E., O’Maoileidigh, D.S., Rae, L. et al. (2012) Molecular basis for the specification of floral organs by APETALA3 and PISTILLATA. Proc. Natl Acad. Sci. USA, 109, 13452–13457.
Xiao, W., Custard, K.D., Brown, R.C. et al. (2006) DNA methylation is critical for Arabidopsis embryogenesis and seed viability. Plant Cell, 18, 805–814.
Yang, H., Lu, P., Wang, Y. et al. (2011) The transcriptome landscape of Arabidopsis male meiocytes from high-throughput sequencing: the complexity and evolution of the meiotic process. Plant J. 65, 503–516.
Zemach, A., McDaniel, I.E., Silva, P. et al. (2010) Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science, 328, 916–919.
Zhang, X., Yazaki, J., Sundaresan, A. et al. (2006) Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. Cell, 126, 1189–1201.
Zheng, Y., Cohen-Karni, D., Xu, D. et al. (2010) A unique family of Mrr-like modification-dependent restriction endonucleases. Nucleic Acids Res. 38, 5527–5534.
Zhong, S., Fei, Z., Chen, Y.R. et al. (2013) Single-base resolution methylomes of tomato fruit development reveal epigenome modifications associated with ripening. Nat. Biotechnol. 31, 154–159.
Zilberman, D., Cao, X., Johansen, L.K. et al. (2004) Role of Arabidopsis ARGONAUTE4 in RNA-directed DNA methylation triggered by inverted repeats. Curr. Biol. 14, 1214–1220.
Zilberman, D., Gehring, M., Tran, R.K. et al. (2007) Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat. Genet. 39, 61–69.