Tissue-dependent limited pleiotropy affects gene expression in barley
Elena Potokina1,†,‡, Arnis Druka2,†, Zewei Luo1,3, Matthew Moscou4, Roger Wise4,5, Robbie Waugh2 and Mike Kearsey1,*
1School of Biosciences, The University of Birmingham, Birmingham B15 2TT, UK,
2Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, UK,
3Laboratory of Population & Quantitative Genetics, School of Life Sciences, Institute of Biomedical Sciences, Fudan University, Shanghai 200433, China,
4Bioinformatics and Computational Biology Graduate Program & Department of Plant Pathology, Iowa State University, Ames, IA 50011-1020, USA, and
5Corn Insects and Crop Genetics Research, USDA-ARS, Iowa State University, Ames, IA 50011-1020, USA
Received 16 April 2008; revised 21 May 2008; accepted 30 May 2008; published online 29 July 2008.
*For correspondence (fax +44 121 414 5925; e-mail [email protected]).
†Both authors contributed equally to this work.
‡Present address: Vavilov Institute of Plant Industry (VIR), Bolshaya Morskaya, 44, 190000, St Petersburg, Russia.
Summary
Non-synonymous coding mutations in a gene change the resulting protein, no matter where it is expressed,
but the effects of cis-regulatory mutations could be spatially or temporally limited – a phenomenon termed
limited pleiotropy. Here, we report the genome-wide occurrence of limited pleiotropy of cis-regulatory
mutations in barley (Hordeum vulgare L.) using Affymetrix analysis of 22 840 genes in a population of 139
doubled haploid lines derived from a cross between the cultivars Steptoe (St) and Morex (Mx). We identified
robust cis-acting expression regulators that segregate as major genes in two successive ontogenetic stages:
germinating embryo tissues and seedling leaves from the embryonic axis. We show that these polymorphisms
may be consistent in both tissues or may cause a dramatic change in transcript abundance in one tissue but not
in another. We also show that the parental allele that increases expression can vary with the tissue, suggesting
nucleotide polymorphism in enhancer sequences. Because of the limited pleiotropy of cis-regulating
mutations, the number of cis expression quantitative trait loci (cis-eQTLs) discovered by ‘genetical genomics’
is strongly affected by the particular tissue or developmental stage studied. Given that limited pleiotropy is
a common feature of cis-regulatory mutations in barley, we predict that the phenomenon would be relevant
to developmental and/or tissue-specific interactions across wide taxonomic boundaries in both plants
and animals.
Keywords: barley, cis-eQTL, gene expression, genetical genomics, pleiotropy.
Introduction
Variation in gene expression is a heritable trait, and can be
mapped in segregating populations using the approaches of
genetical genomics (Jansen and Nap, 2001). For those
organisms with sequenced genomes, the approach provides
an unprecedented opportunity to compare the genetic
position of the gene encoding each transcript with the
position of its expression quantitative trait locus (eQTL),
thereby making it possible to discriminate between the cis-
and trans-regulatory control elements of gene expression
for thousands of genes across the genome (Brem et al.,
2002; Schadt et al., 2003; Hubner et al., 2005; Keurentjes
et al., 2007; West et al., 2007). Cis-acting variation has been
proposed as a major determinant of quantitative phenotypic
traits, and so the detection of cis-eQTLs is a matter of par-
ticular significance (Stamatoyannopoulos, 2004). Addition-
ally, co-segregation of a cis-eQTL with the gene itself
confirms its authenticity.
Whereas non-synonymous coding mutations change the
resulting protein, no matter where the gene is expressed, the
effects of cis-regulatory mutations could be spatially or
© 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd
The Plant Journal (2008) 56, 287–296 doi: 10.1111/j.1365-313X.2008.03601.x
temporally limited, e.g. to larval anatomy, without affecting
the adult, or just to a single organ or tissue, even when the
gene is much more widely expressed. This phenomenon is
known as reduced or limited pleiotropy (Stern, 2000; Wray,
2007). Similar effects can arise from tissue-specific tran-
scription factors. Limited pleiotropy has been observed in
many cases in which cis-regulatory mutations have an
ecologically significant phenotypic impact (Wray, 2007).
Analogously, Li et al. (2006) have demonstrated the exis-
tence of plasticity QTLs (pQTLs) in Caenorhabditis elegans,
which result in changing levels of expression in different
environments (temperature in their case) that were prepon-
derantly trans-acting QTLs, and which could be responsible
for responses to fluctuating environments.
Although variability in gene expression at a genome-wide
scale across tissues is well known for both sequenced (cf.
Novak et al., 2002) and unsequenced organisms (cf. Druka
et al., 2006), there are very few studies focusing on the
tissue-specific appearance and behaviour of cis-regulatory
elements. Recently, tissue specificity of eQTLs was reported
for rats (Petretto et al., 2006). The investigation of pleiotropy
of cis-regulatory mutations becomes more challenging with
unsequenced species, because the lack of precise informa-
tion about the physical location of genes makes separation
of cis- from trans- eQTLs very difficult. Rice synteny was
recently employed for physically mapping wheat expressed
sequence tags (ESTs), in order to compare the physical
position of the ESTs with their expression mapping data
obtained using wheat doubled haploid lines (Jordan et al.,
2007). Another approach employed for barley (Potokina
et al., 2007) was based on the numerous published reports
indicating that the proportion of gene expression patterns
that can be accounted for by cis-acting versus trans-acting
components depends heavily on the threshold applied for
eQTL detection (Gibson and Weir, 2005; Hubner et al., 2005;
Keurentjes et al., 2007; West et al., 2007; Yamashita et al.,
2005). As a result, the highly heritable cis-eQTLs can be
widely recognized by their extremely high log of odds (LOD)
scores relative to the rest of the genes (Potokina et al., 2007),
and confirmed by single-nucleotide polymorphisms (SNPs).
Although co-segregation technically only confirms local
regulation (Rockman and Kruglyak, 2006), we use the term
‘cis-regulation’ to imply all near-regulation that has not been
resolved by recombination. Trans-eQTLs may also have
high LOD scores, although less frequently (Luo et al., 2007),
but are less easy to reject as false positives.
In the present paper we investigate the tissue-specific
appearance of cis-eQTLs in barley, one of the most impor-
tant crop species with an unsequenced genome. To address
this issue we analysed the steady-state mRNA transcript
abundance (sometimes referred to as ‘expression’) reported
by 22 840 probe sets (from here on termed ‘genes’) on
Affymetrix Barley 1 GeneChips across 139 double-haploid
(DH) lines of the Steptoe (St)/Morex (Mx) barley mapping
population in one tissue, and in a subset of 30 DH lines for a
second tissue, with the two tissues being derived from
different temporal development stages. Working with the
restricted mapping population, we focused only on the
highly heritable cis-eQTLs that segregated like major genes
and that were reproducible with the larger set of 139 DH lines
both in expression and SNP genotype. The number of cis-
eQTLs detected with an empirically established statistical
threshold was found to be strongly dependent on the
particular tissue, demonstrating that limited pleiotropy of
cis-regulatory mutations occurs widely in barley. This sug-
gests that the question ‘how many genes are regulated in
cis?’ is context-dependent, with different outcomes for
different tissues, as well as different crosses, under analysis.
Results
The natural nucleotide polymorphism between St and Mx in
the regulatory part of genes may affect an interaction of a
gene promoter with RNA polymerase and transcription fac-
tors. As a consequence, alleles of St and Mx would result in
different levels of gene transcript abundance, and this can be
detected by eQTL analysis in the mapping population of DH
lines of the St/Mx cross. Because of the limited pleiotropy of
cis-regulatory mutations, the linkage might be discovered in
one tissue, but not in another. To test this hypothesis, we
compared the chromosomal locations of highly significant
eQTLs detected in two tissues: the embryo-derived tissue of
germinating barley grains (subsequently referred to as
‘embryo’) and leaves of 12-day-old seedlings (‘leaf’), and
compared them with SNP data for the same genes.
Overall tissue effect on transcript abundance level and
tissue-specific cis-regulators
We started by estimating the overall tissue effect on the
22 840 genes, by comparing the relative abundance of their
transcripts across 32 genotypes (two parents and 30 DH
lines) in embryo and leaf tissues. Initially, we mainly focused
on the between-group variation (expression in embryo vs
expression in leaves) caused by tissue-specific gene regula-
tion, without regard to genetic polymorphism within the DH
lines. A one-way ANOVA detected 19 958 genes (87% of the
total gene number on the array) in which the transcript
abundance level differed significantly between tissues
(FDR < 0.05; Benjamini and Hochberg, 1995) (Table S1).
Among these, approximately half (9985) were higher in the
embryo, whereas the other half (9973) were higher in the leaf.
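As a minimal sketch of the multiple-testing step (not the authors' code), the Benjamini and Hochberg (1995) step-up procedure can be written in a few lines of Python. The function name and the example P-values are hypothetical; in the analysis above there would be one P-value per probe set from the embryo versus leaf ANOVA.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is rejected at FDR alpha."""
    m = len(p_values)
    # Indices of the tests, sorted by ascending P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene P-values (illustration only, not data from the paper).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(example_p, alpha=0.05)
n_significant = sum(flags)
```

With real data, `n_significant` would correspond to the number of genes whose transcript abundance differs between tissues at FDR < 0.05.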
We observed that, for many genes, replacing the St allele with the Mx
allele significantly affected the tissue-specific regulation.
Contig6206_s_at (Figure 1a) is an example of a gene that was
completely suppressed in the embryo (all hybridization
signals were classified by the MAS 5.0 algorithm as
‘Absent’), but was activated in leaves, although only in DH
lines that carried the Mx allele. In contrast, Contig19508_at
(Figure 1b) was not detectably transcribed in leaves (all
hybridization signals were classified as ‘Absent’), and was
specifically expressed in embryo tissue, but again only in
those offspring that inherited the Mx allele for the gene. The
St allele produced no detectable transcript in either tissue for
Contig3642_at (the ‘Absent’ call was obtained for all DH lines
carrying the St allele; Figure 1c). Contig13784_at (Figure 2a) was
detectably expressed in both tissues; however, the St allele
decreased the transcript abundance only in the embryo,
whereas no allele-specific gene expression was detected in
leaves. Remarkably, the alleles of a particular parent (e.g.
Mx) may increase the transcription level of a gene in the
embryo, but decrease it in leaves (Figure 2b). To determine
the biological relevance of this phenomenon it would be
informative to estimate the frequency of occurrence for each
class of tissue-specific expression pattern. Technically, this
can be achieved via eQTL analysis of the gene expression
data available for the 30 DH lines in two tissues to compare
positions of the eQTLs mapped for the same gene, but in two
different tissues, as described below.
Tissue-specific cis-regulators and the empirical
genome-wide threshold of eQTL significance
In a previous paper (Potokina et al., 2007), we presented
evidence for 23 738 eQTLs in this population of 139 DH lines
for embryo tissue, whereas in an earlier paper (Luo et al.,
2007) we identified expression markers using a subset of 30
DH lines based on leaf tissue. In the latter case, only 30 lines
were available, but we now wish to compare the cis-eQTLs
for the two tissues in these 30 DH lines to explore limited
pleiotropy on a genome-wide scale. The use of such a small
population required several key steps to minimize false gene
discovery, including reference to the full 139-line set and to
SNP data.
We thus carried out eQTL analysis with gene expression
data available for the 30 DH lines, and compared the
positions of the eQTLs mapped in the two tissues. To avoid
false eQTLs artificially created by background noise, only
those genes expressed at a detectable level were subjected
to eQTL analysis. Thus, out of 22 840 genes on the chip, we
analysed the transcript abundance variation for the 15 967
(70%) genes for embryo tissue and the 15 247 (67%) genes
for leaf tissue that were classified by the MAS 5.0 algorithm
as ‘Present’ for at least two out of three replications of St or
Mx hybridizations. The ‘Present’ flags designate genes in
which the hybridization signal differs from the background
significantly (P < 0.05). The two sets of mapped expression
profiles for different tissues overlapped for 13 940 genes.
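The detection filter can be sketched as follows. This is an illustration under one assumption: that a gene passes if it is called ‘Present’ in at least two of the three replicates of either parent. The helper names and the toy calls are hypothetical; only the three gene counts in the last lines come from the text.

```python
def is_detectable(calls_st, calls_mx, min_present=2):
    """True if 'P' (Present) occurs at least min_present times in the
    replicate calls of either parent (St or Mx)."""
    return (calls_st.count("P") >= min_present
            or calls_mx.count("P") >= min_present)


def detectable_genes(calls_by_gene):
    """Set of genes that pass the detection filter in one tissue."""
    return {gene for gene, (st, mx) in calls_by_gene.items()
            if is_detectable(st, mx)}


# Toy MAS 5.0-style calls per replicate: 'P' = Present, 'A' = Absent.
embryo_calls = {
    "gene1": ("PPA", "AAA"),  # Present in two of three St replicates
    "gene2": ("PAA", "APA"),  # never Present twice within one parent
    "gene3": ("AAA", "PPP"),
}
leaf_calls = {
    "gene1": ("AAA", "AAP"),
    "gene2": ("PPP", "PPP"),
    "gene3": ("APP", "AAA"),
}

embryo_set = detectable_genes(embryo_calls)
leaf_set = detectable_genes(leaf_calls)
both = embryo_set & leaf_set

# Counts reported in the text: genes analysed in at least one tissue.
genes_in_either = 15967 + 15247 - 13940
```

By inclusion and exclusion, the reported counts imply that 17 274 genes were analysed in at least one of the two tissues.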
Altogether, 15 967 (embryo) and 15 247 (leaf) genes were
each tested by composite interval mapping (CIM) for linkage
between transcript level variation and a marker position
in one of the 209 recombination bins of the seven
Figure 1. Tissue-specific pattern of gene expression across 30 double-haploid
(DH) lines of the Steptoe (St)/Morex (Mx) cross.
Expression level is shown on the y-axis (log scale) for embryo tissue (on the
left) and leaf tissue (on the right). The DHs are assigned to genotypic classes
and mean values (black mark) plus SD (hashed boxes) are depicted for P1
(Morex), P2 (Steptoe) and genotypic classes DH(P1) and DH(P2).
(a) Contig6206_s_at: allele-specific expression appearing only in leaf tissue.
(b) Contig19508_at: allele-specific expression appearing only in embryo
tissue.
(c) Contig3642_at: allele-specific expression detected in both tissues
analysed.
chromosomes in this restricted population. The resulting
LOD profiles were examined to determine eQTL peaks, as
described previously (Potokina et al., 2007). To establish a
threshold for declaring eQTLs to be statistically significant,
we used the global permutation threshold (GPT) approach
(West et al., 2007) (see also Experimental procedures).
Accordingly, each eQTL peak was assigned a corresponding
P-value reflecting its genome-wide significance, as
obtained by the permutation approach.
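The idea behind a permutation-based genome-wide P-value can be illustrated with a deliberately simplified sketch. It is not the GPT procedure of West et al. (2007) and it does not use composite interval mapping: the test statistic here is simply the absolute difference between genotype-class means at each marker, and all names and data are hypothetical.

```python
import random


def marker_stat(trait, genotypes):
    """Absolute difference between the mean trait values of classes 'A' and 'B'."""
    a = [t for t, g in zip(trait, genotypes) if g == "A"]
    b = [t for t, g in zip(trait, genotypes) if g == "B"]
    if not a or not b:
        return 0.0
    return abs(sum(a) / len(a) - sum(b) / len(b))


def genome_wide_max(trait, markers):
    """Largest single-marker statistic across all markers."""
    return max(marker_stat(trait, g) for g in markers)


def permutation_p_value(trait, markers, n_perm=1000, seed=0):
    """Empirical genome-wide P-value of the best marker for one trait."""
    rng = random.Random(seed)
    observed = genome_wide_max(trait, markers)
    shuffled = list(trait)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling trait values across lines breaks any trait-marker linkage.
        rng.shuffle(shuffled)
        if genome_wide_max(shuffled, markers) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: 10 lines, 3 markers; the trait follows the first marker.
markers = ["AAAAABBBBB", "ABABABABAB", "AABBAABBAA"]
trait = [9.1, 9.0, 9.2, 8.9, 9.1, 6.0, 6.2, 5.9, 6.1, 6.0]
observed, p_value = permutation_p_value(trait, markers)
```

Taking the maximum over all markers in each permutation is what makes the resulting P-value genome-wide rather than per-marker.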
Not only do false eQTLs arise from the multiple testing of
thousands of expression traits, but they also arise from the
very small size of the mapping population (n = 30) for which
we had expression data for both tissues. The linkage
obtained might be consistent for the particular set of 30
genotypes and, therefore, it would pass through the permu-
tation test, but it is at risk of not being reproducible with a
larger set of recombinants. The smaller the subset of
genotypes, the more likely it is that a linkage can reflect a
spurious association from a biased selection of the 30
genotypes. For this reason, large mapping populations
(n > 100) are routinely used for QTL mapping.
To account for these statistical issues, we have estab-
lished an empirically estimated genome-wide threshold for
the eQTL significance. The 30 DH lines analysed for eQTLs in
the present study were a subset of the larger mapping
population of 139 DHs previously subjected to eQTL analysis
using transcript abundance in embryo tissue (Potokina et al.,
2007). We evaluated the proportion of the eQTLs detected
with the 30-lines set that was also reproducible with the 139-
DH population, at different thresholds of the eQTL signifi-
cance obtained via permutation tests. Table 1 shows that a
genome-wide significance threshold of P ≤ 0.0004 allows us
to keep the empirical false discovery rate (FDR) at the 5%
level for the panel of 30 DH lines. The significance threshold
obtained was comparable with the threshold (P < 0.001,
FDR < 0.05) detected previously for a panel of 30 rat
recombinant inbred strains (Hubner et al., 2005; Petretto
et al., 2006). Using the established significance threshold
(P ≤ 0.0004, FDR < 0.05), 1527 eQTLs were detected for the
embryo tissue, and 1158 eQTLs were detected for the leaf
Figure 2. Limited pleiotropy of cis-regulatory mutations and reciprocal
change in the parent that contributes the allele with the most abundant
transcript.
The double haploids (DHs) are assigned to genotypic classes and mean values
(black mark) plus SD (hashed boxes) are depicted for P1 (Morex), P2 (Steptoe)
and genotypic classes DH(P1) and DH(P2).
(a) Contig13784_at: a cis-regulatory mutation modifies transcript abundance
in just a single tissue, even when the gene is more widely expressed.
(b) Contig4374_s_at: the allele of parent Steptoe (P2) maintains consistent
expression in leaf and embryo, whereas the allele of parent Morex (P1)
increases transcription efficiency in embryo, but decreases efficiency in
leaves.
Table 1 Comparisons of the numbers of expression quantitative trait loci (eQTLs) controlling gene expression in the embryo detected at various LOD thresholds in the 30 double-haploid (DH) line population compared with those confirmed in the 139-line population
Columns: (1) LOD threshold for eQTLs in the 30-DH population; (2) genome-wide significance after a permutation test with the 30-line population (P-value); (3) total number of eQTLs detected with the 30-line population; (4) number of eQTLs detected with the 30-line population and supported with the 139-line population; (5) proportion.
(1) LOD    (2) P-value    (3) Total    (4) Supported    (5) Proportion
>3.08      <0.05          32 401       6755             0.208
>4.06      <0.01          19 984       5528             0.277
>8.47      <0.001         2465         2051             0.832
>9.47      <0.0006        1843         1697             0.921
>9.85      <0.0005        1673         1581             0.945
≥10        ≤0.0004        1607         1527             0.950
tissue. Multiple eQTLs were identified in each tissue for ~2%
of these genes, and were removed from the analysis. This
left 2081 unique eQTLs (one eQTL per gene), of which 1498
and 1134 eQTLs were found in the embryo and leaf tissue,
respectively; of these, 551 eQTLs were common to both
tissues (Table S2).
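The proportions in Table 1 and the count of unique eQTLs can be re-derived from the reported numbers. The short Python check below uses only values given in the text; the variable names are ours.

```python
# (LOD threshold, P-value threshold, eQTLs in 30 lines, of which supported in 139 lines)
table1 = [
    (">3.08", "<0.05", 32401, 6755),
    (">4.06", "<0.01", 19984, 5528),
    (">8.47", "<0.001", 2465, 2051),
    (">9.47", "<0.0006", 1843, 1697),
    (">9.85", "<0.0005", 1673, 1581),
    (">=10", "<=0.0004", 1607, 1527),
]

# Proportion confirmed in the larger population; 1 - proportion is the
# empirical false discovery rate at that threshold.
proportions = [round(supported / total, 3) for _, _, total, supported in table1]
empirical_fdr = [round(1 - p, 3) for p in proportions]

for (lod, p_thr, total, supported), prop in zip(table1, proportions):
    print(f"{lod:>6} {p_thr:>9} {total:>6} {supported:>5} {prop:.3f}")

# One eQTL per gene: embryo plus leaf minus those common to both tissues.
embryo, leaf, shared = 1498, 1134, 551
unique_eqtls = embryo + leaf - shared
```

Only the last row of the table brings the empirical FDR down to the 5% level, which is why P ≤ 0.0004 was adopted as the working threshold.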
The selected eQTLs represent the highest LOD scores
relative to the rest of the genes: they fall into the 95% and
96% percentile boundaries of the total distribution of all LOD
scores detected for embryo and leaf tissue (data not shown),
and, therefore, they are most likely to be cis-regulators
(Gibson and Weir, 2005; Hubner et al., 2005; West et al.,
2007; Yamashita et al., 2005). To obtain experimental sup-
port for this assumption, the positions of 98 randomly taken
genes mapped using SNPs were compared with the posi-
tions of their eQTLs. The SNPs and corresponding eQTLs co-
segregated among the set of 30 DH lines for 93 out of the 98
genes. If the eQTL and SNP genotypes were independent,
the probability of all 30 co-segregating for a given gene
would be 0.5^30, i.e. ~1 × 10^-9. Based on this, we estimate
that 95% of the selected eQTLs are cis-eQTLs; the remaining
5% map elsewhere and may be strong trans-eQTLs or
cis-eQTLs associated with paralogues.
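The two figures in this paragraph follow from simple arithmetic, sketched here in Python. The calculation assumes, as the text does, that under independence each of the 30 lines matches by chance with probability 0.5.

```python
n_lines = 30

# Chance that the eQTL genotype matches the SNP genotype in all 30 DH lines
# if the two were independent (probability 0.5 per line).
p_chance = 0.5 ** n_lines

# Fraction of the 98 SNP-mapped genes whose eQTL co-segregated with the SNP.
cis_fraction = 93 / 98

print(f"chance co-segregation: {p_chance:.2e}")
print(f"estimated cis fraction: {cis_fraction:.1%}")
```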
Co-appearance of cis-eQTLs when two tissues are compared
Figure 3 shows examples for three cases of co-appearance of
those 2081 eQTLs in the two tissues. In addition, the LOD
scores of all the 2081 eQTLs that were significant in one tis-
sue (P < 0.0004) were plotted for the other tissue (Figure 4).
For the first group of 1083 genes, an eQTL was mapped in the
same position for both embryo and leaf tissues (Figures 3a
and 4a–c). Among these, 551 eQTLs were highly significant
(P < 0.0004) in both tissues (Figure 4a). Another 532 eQTLs
were highly significant (P < 0.0004) in one tissue and sig-
nificant in the second tissue when the threshold was lowered
to P < 0.05 (Figure 4b,c). These 532 eQTLs were not consid-
ered as tissue specific because at least 20% of the eQTLs
mapped in the 30-DH line population with P < 0.05 could be
true positives (Table 1). So, if the eQTL was mapped in
the same position in both tissues, with a level of significance
in one tissue of P < 0.0004 and in another of P < 0.05, it could
not be reliably classified as tissue-specific.
For the second group of 615 genes (Figure 3b), an eQTL
was mapped in embryos at P ≤ 0.0004, but it could not be
detected even at P < 0.05 in leaf tissue (Figure 4d). The
absence of the corresponding eQTLs in leaf tissue had
two main causes: (i) the gene was not detectably
expressed in leaf tissue, making any eQTL detection
impossible (236 genes); or (ii) the gene was detectably expressed in
leaf tissue, but there was no genetical variation in expression
level, i.e. no detectable eQTL (379 genes). The 379
P-values for the corresponding LODs in leaf tissue were far
from significant, even at P < 0.05: for 80% of them, the
P-values ranged from 0.98 up to 1 (data not shown). For the
third group of 383 genes (Figures 3c and 4e) eQTLs were
[Figure 3 panels: embryo and leaf tissue shown across chromosomes 1H–7H for cases (a), (b) and (c), labelled 52%, 30% and 18%, respectively.]
Figure 3. Comparison of the tissue-specific detection of 2081 expression quantitative trait loci (eQTLs) in embryo-derived (embryo) and seedling leaf (leaf) tissues.
Examples: (a) an eQTL mapped in the same position for both embryo and leaf tissues, HVSMEa0001B10r2_x_at; (b) an eQTL mapped in embryo but not in leaf tissue,
baak13l10_s_at; (c) an eQTL mapped in leaf but not in embryo tissue, Contig86_at.
mapped in leaf tissue but not in embryos, for the same
reasons as above (126 and 257 genes, respectively).
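The three-way grouping described above can be expressed as a small decision rule. The sketch below is our reading of the text, not the authors' code: it assumes the two P-values refer to the same map position, and it uses `None` (a convention introduced here) for a gene that is not detectably expressed in a tissue.

```python
STRICT = 0.0004   # genome-wide threshold used to select eQTLs
LENIENT = 0.05    # relaxed threshold used to rule out tissue specificity


def classify(p_embryo, p_leaf):
    """Assign an eQTL to a group from its genome-wide P-value in each tissue."""
    def strict(p):
        return p is not None and p <= STRICT

    def lenient(p):
        return p is not None and p < LENIENT

    if strict(p_embryo) and strict(p_leaf):
        return "shared, highly significant in both"
    if (strict(p_embryo) and lenient(p_leaf)) or (strict(p_leaf) and lenient(p_embryo)):
        return "shared, supported at P < 0.05"
    if strict(p_embryo):
        return "embryo-specific"
    if strict(p_leaf):
        return "leaf-specific"
    return "not selected"


# Group sizes reported in the text.
shared_total = 551 + 532          # first group
all_selected = shared_total + 615 + 383
```

The reported group sizes are consistent with each other: 551 + 532 gives the 1083 shared eQTLs, and adding the 615 embryo-specific and 383 leaf-specific eQTLs gives 2081.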
We were able to check the extent to which limited
pleiotropy was being misidentified as a result of genuine
eQTLs failing to reach significance in the small set of 30 DH
lines (false negatives). We took those 383 cases where an
eQTL was clearly identified in leaf tissue (P < 0.0004; false-
positive rate < 5%), but not in embryos (P > 0.05), both
based on 30 DH lines. We then looked for evidence of significant
eQTLs for embryo tissue among these 383 genes based on
an analysis of all 139 DH lines, and found 73 such genes
(19%). This suggests that ~81% of the cases of limited
pleiotropy are correctly identified. We do not have data for
the reciprocal set because expression in leaf tissue was not
studied on the 139-DH lines set, but we assume that the
same confidence applies.
In summary, the eQTL analysis in embryos and leaf tissue
yields 1498 and 1134 detected cis-eQTLs, respectively;
combining both tissues gives 2081 cis-eQTLs. Thus, by
adding another tissue for eQTL analysis, we increased the
number of detected cis-eQTLs by 39% compared with those
detected in embryos, and by 84% compared with those
detected in leaf.
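The percentages quoted in this summary and in the preceding false-negative check can be re-derived from the reported counts. The Python lines below use only numbers from the text; the variable names are ours.

```python
embryo, leaf, combined = 1498, 1134, 2081

# Relative gain in detected cis-eQTLs from analysing a second tissue.
gain_vs_embryo = (combined - embryo) / embryo * 100
gain_vs_leaf = (combined - leaf) / leaf * 100

# False-negative check: leaf-specific eQTLs that turn out to be significant
# in embryo once all 139 DH lines are analysed.
leaf_specific, recovered_in_139 = 383, 73
false_negative_pct = recovered_in_139 / leaf_specific * 100
correctly_identified_pct = 100 - false_negative_pct

print(f"gain vs embryo: {gain_vs_embryo:.0f}%")
print(f"gain vs leaf: {gain_vs_leaf:.0f}%")
print(f"correctly identified: {correctly_identified_pct:.0f}%")
```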
Reversed effect of parental alleles in different tissues
The first group (i.e. genes with eQTLs mapping to the same
location in both tissues) was further characterized by iden-
tifying tissue-dependent cis-eQTLs that undergo a reciprocal
change in the parent that contributes the allele with the most
abundant transcript. Thirty-four of the 1083 genes were
found to invert the direction of the eQTLs between tissues
(Table S3). The example of Contig4374_s_at (Figure 2b)
shows that alleles of St maintain a consistent level of
expression in both leaves and embryos, whereas alleles of
Mx increase transcription levels in embryos, but decrease
transcription levels in leaves. The results can be explained by
assuming mutations exist in the putative enhancer sequence
for the gene. The enhancers being targeted for tissue-specific
or temporal regulation may recruit either negative or positive
regulators of gene expression, depending on the develop-
mental stage (Bilic et al., 2006; Lewin, 1997). For this partic-
ular sample, this might be the case for DH lines that inherit
the Mx allele. The mutations occurring in St might disable
the enhancer function, thereby reducing tissue-specific gene
regulation. Of the 34 genes, 19 had alleles generating higher
transcript abundance in Mx embryos and St leaves, whereas
15 had the opposite effect (i.e. higher transcript abundance
in Mx leaves and St embryos).
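One simple way to flag such reversals is to compare the sign of the allelic effect in the two tissues. The sketch below is an illustration only: the function names are ours, and the expression values are invented to mimic the pattern described for Contig4374_s_at, not measured data.

```python
from statistics import mean


def allele_effect(expr, genotype):
    """Mean expression of lines carrying the Mx allele minus the mean of
    lines carrying the St allele."""
    mx = [e for e, g in zip(expr, genotype) if g == "Mx"]
    st = [e for e, g in zip(expr, genotype) if g == "St"]
    return mean(mx) - mean(st)


def is_reversed(embryo_expr, leaf_expr, genotype):
    """True if the allele that increases expression differs between tissues."""
    return allele_effect(embryo_expr, genotype) * allele_effect(leaf_expr, genotype) < 0


# Invented log-scale expression values for six DH lines.
genotype = ["Mx", "Mx", "Mx", "St", "St", "St"]
embryo_expr = [9.5, 9.6, 9.4, 8.0, 8.1, 7.9]   # Mx allele higher in embryo
leaf_expr = [6.0, 6.1, 5.9, 8.0, 8.2, 7.8]     # Mx allele lower in leaf

reversed_flag = is_reversed(embryo_expr, leaf_expr, genotype)
```

In practice the sign comparison would be applied only to eQTLs that are significant in both tissues, as was done for the 1083 genes of the first group.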
Gene function prevalence among the genes with
tissue-specific activity of cis-regulators
The gene ontology (GO SLIMS) classification provided by
GENESPRING 7.2 was employed to predict functions for the
cis-regulated genes based on the latest annotation infor-
mation available. We used the annotations for the genes
with detected cis-eQTLs to investigate possible cellular
activity associated with revealed tissue-specific cis-regula-
tors. Focusing on the functional category ‘Biological Pro-
cesses’, 4549 genes can be annotated (20% of all the genes
on the Affymetrix Barley 1 GeneChip). We investigated
whether the tissue-specific cis-regulators represent a ran-
dom sample of the total number of annotated barley genes,
or if there is over-representation of detected tissue-specific
cis-factors from one or more GO classifications related to the
particular biological goals. To make this comparison, we
considered only those functional categories where both 615
embryo-specific and 383 leaf-specific cis-regulators were
represented sufficiently to permit a chi-squared test (i.e. ≥ 5
genes expected per category). The selected functional cate-
gories included genes with unknown biological processes
(GO:0000004), regulation of gene expression, epigenetic
(GO:0040029), electron transport (GO:0006118) and genes
involved in high-level processes such as ‘nucleobase,
nucleoside, nucleotide and nucleic acid metabolism’ and
‘cell communication’.
Figure S1 shows the frequency distribution of the func-
tional categories among (i) all 4549 annotated genes,
(ii) annotated genes with eQTLs observed both in embryo
and leaf tissues, (iii) annotated genes with eQTLs detected in
embryo tissue only and (iv) annotated genes with eQTLs
detected in leaf tissue only. For each of the latter three
groups we compared the distribution of functional catego-
ries against the distribution of those categories in the total
sample of 4549 annotated genes.
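The comparison against the reference distribution can be sketched as a chi-squared goodness-of-fit statistic. This is a minimal illustration, not the authors' analysis: the function is ours, the category counts are invented, and the handling of sparse categories (dropping those with fewer than five expected genes, then rescaling) is an assumption about how the ≥ 5 rule was applied.

```python
def chi_squared_vs_reference(observed, reference, min_expected=5.0):
    """Goodness-of-fit statistic of observed category counts against the
    category frequencies of a reference gene set.

    Returns (statistic, degrees_of_freedom, categories_used)."""
    n_obs = sum(observed.values())
    n_ref = sum(reference.values())
    # Keep only categories with enough expected genes for the test.
    kept = [c for c in reference if n_obs * reference[c] / n_ref >= min_expected]
    obs_total = sum(observed.get(c, 0) for c in kept)
    ref_total = sum(reference[c] for c in kept)
    stat = 0.0
    for c in kept:
        expected = obs_total * reference[c] / ref_total
        stat += (observed.get(c, 0) - expected) ** 2 / expected
    return stat, len(kept) - 1, kept


# Invented counts for illustration (not values from Figure S1).
reference = {"electron transport": 400, "cell communication": 600,
             "unknown": 3000, "rare category": 4}
observed = {"electron transport": 30, "cell communication": 20,
            "unknown": 150, "rare category": 0}

stat, dof, used = chi_squared_vs_reference(observed, reference)
```

The statistic would then be compared with the chi-squared distribution on `dof` degrees of freedom to obtain a P-value.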
[Figure 4 plot: the axes show eQTL P-values in embryo and leaf tissue, marked at P < 0.05, P < 0.0004 and P < 0.00001; plot regions are labelled A–E.]
Figure 4. Comparison of significance of the expression quantitative trait loci
(eQTLs) mapped for the same genes in two different tissues.
P-values of 2081 eQTLs that were significant in one tissue (P < 0.0004) are
plotted for the other tissue.
(a) The 551 eQTLs that were highly significant (P < 0.0004) in both tissues.
(b–c) The 532 eQTLs that were highly significant (P < 0.0004) in one tissue and
that were significant in the second tissue when the threshold was lowered to
P < 0.05.
(d) The 615 eQTLs mapped in embryo but not in leaf tissue.
(e) The 383 eQTLs mapped in leaf tissue but not in embryo tissue.
The distribution of functional category frequencies
among leaf-specific cis-eQTLs alone significantly differs
from the distribution of functional categories among the
total number of annotated barley genes (P = 0.010), mostly
because of over-representation of genes involved in electron
transport pathways. This might be expected given the
photosynthetic electron transport processes preferentially
activated in leaf tissue. On the other hand, 23% of the leaf-
specific cis-factors related to electron transport showed
homology to the cytochrome P450 gene family. In plants,
P450s are known to play important roles in many processes,
including the production of hormones, pigments, oils and
defensive compounds (Nguyen et al., 2001).
Discussion
In this paper, we have described the genome-wide occur-
rence of limited pleiotropy of cis-regulatory mutations in
barley. The tissues in this analysis represent two successive
ontogenetic stages of the barley plant: embryo-derived
tissues and leaves from seedlings developed from the
embryonic axis. We showed that polymorphism between
St and Mx in cis-regulatory regions may alter gene tran-
scription for only one of the two developmental stages,
supporting reports that mutations in regulatory regions may
sometimes have few or no pleiotropic consequences
(Carroll, 2005; Stern, 2000).
The limited pleiotropy of cis-regulatory mutations was
recently suggested as one of the possible reasons why
selection could generally operate more efficiently and
flexibly on cis-regulatory mutations than on coding muta-
tions (Wray, 2007). In contrast with the extensive pleiotropic
effects that may arise from mutations within protein-coding
regions, mutations in cis-regulatory regions may affect gene
transcription in just one crucial cell type out of several where
it is expressed. This optimizes the use of genetic variation
while avoiding the extremely deleterious effects on fitness
(Carroll, 2005). One of the well-established examples dis-
cussed in terms of the evolutionary significance of limited
pleiotropy (Wray, 2007) is a cis-regulatory SNP in the DARC
locus (Duffy blood group, chemokine receptor). The cis-
regulatory SNP abolishes the DARC transcription specifically
in red blood cells, whereas in several other tissues and cell
types the DARC locus is still expressed normally (Iwamoto
et al., 1996; Peiper et al., 1995). Individuals lacking DARC
expression in erythrocytes show no adverse health conse-
quences, but become completely resistant to infection with
the malarial parasite. Thus, the single cis-regulatory SNP
results in a phenotype expected to provide a substantial
fitness gain (Wray, 2007).
With the currently available methods of eQTL mapping in
a segregating population, limited pleiotropy could be traced
by tissue (developmental stage)-specific detection of cis-
eQTLs. In our study, we only analysed a fraction of all of the
cis-regulatory factors that could possibly be revealed in
barley with the St/Mx cross. This highly selected set of 2081
genes showed the highest LOD scores for eQTLs compared
with the rest of the genes, providing the opportunity to
confidently identify them as cis-regulated loci. For approx-
imately half (1083) of these genes, the cis-regulatory muta-
tions appeared to work in the embryo of germinating grains
as efficiently as in the leaves of seedlings. For the remaining
998 genes, the cis-factor activity was tissue-specific. The
tissue-specific activity of the cis-factors appeared to be
associated with the cellular processes that were preferen-
tially activated in the corresponding tissue. For example,
in leaf tissue a higher proportion of cis-regulated genes
associated with photosynthetic electron transport processes
was detected. In as few as 35% of the 998 genes, cis-eQTL
variation could not be identified in one of the two tissues
because of the complete suppression of gene expression in
that tissue. For the remaining 65%, the genes were detec-
tably expressed in both tissues, but nucleotide polymor-
phism between St and Mx in cis-regulatory regions caused
dramatic regulatory changes (and, consequently, changes in
the appearance of eQTLs) in only one tissue. Therefore, the
data suggest that the limited pleiotropy of cis-regulatory
mutations is widely distributed in barley.
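The partition described above (cis-eQTL detected in both tissues, or in only one tissue with the gene either silent or still expressed in the other) can be illustrated with a minimal sketch. The function name, the LOD threshold and the toy values below are hypothetical and chosen only for illustration; this is not the procedure used in the study.

```python
from collections import Counter

def classify_cis_eqtl(lod_embryo, lod_leaf, expressed_embryo, expressed_leaf, threshold):
    """Label one gene by the tissue(s) in which its cis-eQTL is detected."""
    in_embryo = lod_embryo >= threshold
    in_leaf = lod_leaf >= threshold
    if in_embryo and in_leaf:
        return "both tissues"
    if not in_embryo and not in_leaf:
        return "not detected"
    # The cis-eQTL is seen in exactly one tissue: check whether the gene is
    # at least expressed (e.g. called 'present') in the other tissue.
    other_tissue_expressed = expressed_leaf if in_embryo else expressed_embryo
    if other_tissue_expressed:
        return "tissue-specific, gene expressed in both"
    return "tissue-specific, gene silent in one tissue"

THRESHOLD = 10.0  # hypothetical LOD cut-off, not the value used in the paper

# Toy genes: (LOD embryo, LOD leaf, expressed in embryo, expressed in leaf)
genes = [
    (25.0, 31.0, True, True),
    (18.0, 1.2, True, False),
    (0.8, 22.0, True, True),
]
labels = [classify_cis_eqtl(*g, threshold=THRESHOLD) for g in genes]
counts = Counter(labels)

# Counts reported in the text: 1083 shared + 998 tissue-specific = 2081 genes.
assert 1083 + 998 == 2081
```

Counting the labels over a full gene list would give the kind of proportions reported here (about 35% versus 65% of the tissue-specific class).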
Assuming that limited pleiotropy is a common feature of
cis-regulatory mutations not only in barley, but also in other
species (e.g. man), one could predict that the phenomenon
would be relevant to some age-related disorders and/or
tissue-specific syndromes in humans, with implications for
‘personalized medicine’. Indeed, the complex genetic com-
ponent of many age-related disorders is well reported (Ruse
and Parker, 2001). If the inheritance of a particular parental
allele in a cis-regulatory region causes dramatic alterations
of gene expression, some of the inherited alleles may be
associated with a significant deleterious effect on fitness.
However, transcription of many genes, particularly in higher
eukaryotes, is dependent upon multiple physiological sig-
nals (Ptashne and Gann, 1997). Thus, the deleterious effect
of the inherited parental allele can be temporarily concealed
until the signal is received at a certain developmental stage.
If negative action of the alleles appears later in life, the
deleterious alleles might be kept in a population, as the
effects of natural selection decline with age (Medawar,
1952).
Using the restricted example of two barley genotypes, we
investigated how natural nucleotide polymorphism in DNA
may alter the efficiency of gene transcription.
The most common level of transcription regulation is
interaction of a gene promoter with RNA polymerase and
other transcription factors. Consequently, nucleotide substi-
tutions in binding sites upstream of the start-point of
transcription, such as promoters, cis-acting motifs (short
sequences recognizable by transcription factors) or enhanc-
ers (DNA sequences that can activate gene transcription
from remote positions) would change the level of transcript
abundance. As a consequence, eQTL analysis detects
significant linkage between polymorphism at the genetic
locus and variation in its transcript abundance. In that
situation, the nucleotide polymorphism assigned to one
genotype (e.g. Mx) may favour transcription efficiency,
whereas alleles of the other (e.g. St) may not. Remarkably,
this Morex (+)/Steptoe (−) pattern could be tissue-specific in
cases where nucleotide polymorphism occurs in enhancer
sequences, which are targets for tissue-specific or temporal
regulation (Lewin, 1997). It was previously reported (e.g.
Bilic et al., 2006) that enhancers may recruit either negative
or positive regulators of gene expression, depending on the
developmental stage. In our study, such nucleotide poly-
morphism in enhancer sequences between Steptoe and
Morex genotypes can be hypothesized for the 34 genes in
which we detected a reciprocal change in the parent that
contributes the allele with the most abundant transcript in
the two tissues.
In summary, the number of detected cis-eQTLs is always a
function of nucleotide polymorphism content between
parents in the particular cross. For example, a cis-eQTL
mapped in the DH population from the St/Mx cross may not
be detectable in mapping populations derived from other
barley genotypes (e.g. a Barke/Mx cross), simply because
there is no polymorphism between Barke and Mx in the
particular regions affecting the transcript abundance of
that gene. Certainly, it would not
mean that the gene is not regulated ‘in cis’, because the
promoter and cis-motifs have key impacts on the transcrip-
tion of any gene. In addition, our results showed that
because of limited pleiotropy of cis-regulating mutations,
the number of cis-eQTLs discovered by the genetical
genomics approach is strongly affected by the experimental
situation (e.g. particular tissue, fixed developmental stage).
In the present study, adding a second tissue to the eQTL
mapping experiments increased the number of detected
cis-eQTLs by 39% (embryo) and 84% (leaf). Thus,
instead of the question ‘how many genes are regulated in
cis?’ we should ask more accurately ‘how many cis-regula-
tory mutations can be detected with the particular cross for a
given tissue?’.
Experimental procedures
Plant material
We used mRNA from the embryo-derived tissue of germinating grains for expression profiling of 30 recombinant lines of a St × Mx DH population (Kleinhofs et al., 1993). The same set of DH lines was investigated previously for the purpose of single feature polymorphism (SFP) detection (Luo et al., 2007).
Embryo-derived tissues [coleoptile (shoot sheath), plumule (foliage leaves), shoot apical meristem, scutellum, radicle (embryonic root), calyptra (root cap), coleorhiza (root sheath) (see also http://www.seedbiology.de)] from three grains were dissected as a single tissue piece and were flash frozen in liquid nitrogen. Germination and tissue collection were repeated for all lines with complete randomization of the Petri plates on each of three occasions. For each line, tissues from all three occasions were bulked before RNA isolation.
To obtain seedling leaf tissue, 10 sterilized seeds per line were sown in each of three replicate 13-cm2 pots. One pot of every member of the ‘trial set’ was randomized in each of three randomized blocks, and each block was placed in a separate Snijders growth cabinet with a 16-h light (400 μE m−1 sec−1)/8-h dark, 17°C/12°C, cycle. After 12 days, leaves of seven or eight seedlings from each pot were collected, bulked and flash frozen in liquid nitrogen; tissues from all three replicate pots of each line were bulked for RNA isolation.
The RNA isolated from these bulks was used to hybridize to the microarray. A single hybridization from each of 30 DH lines was performed, but three biological replicate hybridizations were used for the parental genotypes Mx and St. RNA was isolated, processed and hybridized to the Barley 1 GeneChip (Affymetrix product #900515 GeneChip® Barley Genome Array; complete description and references can be found at http://www.affymetrix.com/products/arrays/specific/barley.affx) using previously described Trizol procedures (Caldo et al., 2004). The labelling, hybridization and GeneChip data acquisition were conducted at the GeneChip facility at Iowa State University (http://www.biotech.iastate.edu/facilities/genechip/Genechip.htm).
Microarray data handling
To estimate the overall tissue effect on the expression of the 22 840 genes present on the Affymetrix Barley 1 GeneChip, 72 probe result (CEL) files combining 36 files (30 DH lines plus three replications of two parents) for the two tissues under analysis were normalized to each other using the Robust Multi-array Average (RMA) normalization routine from the Bioconductor packages (Irizarry et al., 2003). A one-way ANOVA was used to compare the expression level of each of the 22 840 genes on the chip across 36 genotypes and two tissues using a Perl script developed in-house. An estimation of FDR was achieved according to the method of Benjamini and Hochberg (1995), following the approach suggested by Benjamini et al. (2001).
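As an illustration of the FDR step, the following is a minimal sketch of the standard Benjamini and Hochberg (1995) step-up adjustment applied to a list of per-gene P values. It is not the in-house script, and it does not attempt to reproduce any refinements from Benjamini et al. (2001); the function name and the toy P values are assumptions made for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values are monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy example: four hypothetical per-gene ANOVA P values.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
significant = [qi <= 0.05 for qi in q]
```

Genes whose adjusted value falls at or below the chosen FDR level (0.05 in this toy example) would be declared to show a significant tissue effect.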
In order to compare eQTLs detected with the 30-line set with those discovered with all 139 lines, we further followed the procedure of data handling described previously (Potokina et al., 2007). CEL files were directly loaded into GENESPRING 7.2 and were submitted to the RMA file pre-processor. This converts the probe-level expression data into gene-level expression data, which are normalized to a certain extent. As an additional normalization step, the procedure ‘Per Gene: Normalize to specific samples’ was applied, where each gene signal was divided by the corresponding mean of three replications of Mx. The normalized data for both parents and each of the 30 DH lines were exported for eQTL mapping. To obtain MAS 5.0 presence calls, the ‘mas5calls’ method from the BIOCONDUCTOR package was used.
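The per-gene normalization step described above (each gene signal divided by the mean of the three Mx replicates for that gene) can be sketched as follows. This is only an illustration of the arithmetic, not the GENESPRING implementation; the function name, sample labels and signal values are hypothetical, and the signals are assumed to be on a linear scale.

```python
def normalize_to_reference(signals, reference_samples):
    """Divide each gene's signals by that gene's mean over the reference samples."""
    normalized = {}
    for gene, by_sample in signals.items():
        ref_mean = sum(by_sample[s] for s in reference_samples) / len(reference_samples)
        normalized[gene] = {s: v / ref_mean for s, v in by_sample.items()}
    return normalized

# Hypothetical gene-level signals for one gene in three Mx replicates and one DH line.
signals = {
    "gene_A": {"Mx_1": 100.0, "Mx_2": 110.0, "Mx_3": 90.0, "DH_01": 200.0},
}
norm = normalize_to_reference(signals, ["Mx_1", "Mx_2", "Mx_3"])
```

After this step a value of 1 corresponds to the mean Mx expression level, so a DH line carrying a more highly expressed allele appears with a value above 1.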
Data access
All the detailed data and protocols from these experiments have been deposited in BarleyBase/PLEXdb (http://barleybase.org; http://plexdb.org/), an MIAME-compliant expression database for plant GeneChips (Shen et al., 2005). Data files have also been deposited in ArrayExpress (http://www.ebi.ac.uk/arrayexpress) as accession numbers E-TABM-111 (leaf) and E-TABM-112 (embryo).
eQTL mapping
A recently developed transcript derived marker (TDM) map used for eQTL analysis of the St/Mx population of 139 DH lines, and congruent with the SNP map for the St/Mx cross (Rostoks et al., 2006), was taken as a standard to maintain the marker order. The TDMs included SFPs, as described and defined by Borevitz et al. (2003). To develop the genetic map for the 30 DH lines, we used the same markers as for the map based on 139 lines, but the reduced population size (n = 30) led to a smaller number of recombination bins (209 instead of 512). Thus, all of the markers linked to eQTLs on the 30-line map could be checked for significant linkage to eQTLs on the 139-line map.
The CIM analysis was implemented using WINDOWS QTL CARTOGRAPHER 2.5 (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm) with a 2-cM walk speed and a type-I error rate of 5%. Intervals of five background markers with a window width of 10 cM were analysed to control the QTL background effects. To establish a threshold for declaring statistically significant eQTLs, we used the GPT approach (West et al., 2007). A representative null distribution based on 1 000 000 maximum likelihood ratio test (LRT) statistics (1000 permutations × 1000 randomly selected expression traits) was employed for all transcripts detectable for embryo-derived and leaf tissues. The GPT was calculated as the 95% upper bound of the representative null distribution, giving 14.168. A LOD score for each 2-cM interval was compared with the representative null distribution, and was assigned to the corresponding P value (Churchill and Doerge, 1994).
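The thresholding logic can be sketched in a few lines: the maximum LRT statistics from all permutations of all sampled traits are pooled, and the 95th percentile of the pool is taken as the threshold. The sketch below is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, a simple nearest-rank quantile is used, and the LRT-to-LOD conversion assumes the usual definition LRT = 2 ln(10) × LOD.

```python
import math

def upper_quantile(values, q):
    """Nearest-rank empirical quantile: smallest value with at least a fraction q at or below it."""
    ordered = sorted(values)
    # The small offset guards against floating-point fuzz in q * n.
    rank = math.ceil(q * len(ordered) - 1e-9)
    return ordered[max(rank, 1) - 1]

def global_permutation_threshold(max_lrt_by_trait, q=0.95):
    """Pool per-trait permutation maxima into one null distribution and take its upper bound."""
    pooled = [lrt for maxima in max_lrt_by_trait for lrt in maxima]
    return upper_quantile(pooled, q)

def lrt_to_lod(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / (2.0 * math.log(10.0))

# Toy null distribution: two hypothetical traits with 50 permutation maxima each.
toy_null = [list(range(1, 51)), list(range(51, 101))]
toy_threshold = global_permutation_threshold(toy_null, q=0.95)

# Under the stated assumption, the reported threshold of 14.168 corresponds to
# a LOD score of roughly 3.08.
lod_equivalent = lrt_to_lod(14.168)
```

In the study the pooled distribution contained 1 000 000 values (1000 permutations × 1000 traits) rather than the 100 toy values used here.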
Acknowledgements
This research was supported by a research grant from the UK Biotechnology and Biological Sciences Research Council (BBSRC), and by the Scottish Executive Environment and Rural Affairs Department (SEERAD) of the United Kingdom.
Supporting Information
Additional Supporting Information may be found in the online version of this article:
Figure S1. Significance of cis expression quantitative trait loci (cis-eQTLs) when two tissues are compared.
Figure S2. Gene function frequencies in groups of genes with tissue-specific activity of cis-regulators.
Table S1. Overall tissue effect on the transcript abundance level of 22 840 genes on the Affymetrix Barley 1 GeneChip.
Table S2. Co-appearance of 2081 cis expression quantitative trait loci (cis-eQTLs) when two tissues are compared.
Table S3. The 34 cis expression quantitative trait loci (cis-eQTLs) that undergo a reciprocal change in the parent that contributes the allele with more abundant transcript.
Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
References
Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B, 57, 289–300.
Benjamini, Y., Drai, D., Elmer, G., Kafkafi, N. and Golani, I. (2001) Controlling the false discovery rate in behavior genetics research. Behav. Brain Res. 125, 279–284.
Bilic, I., Koesters, C., Unger, B., Sekimata, M., Hertweck, A., Maschek, R., Wilson, C.B. and Ellmeier, W. (2006) Negative regulation of CD8 expression via Cd8 enhancer-mediated recruitment of the zinc finger protein MAZR. Nat. Immunol. 7, 392–400.
Borevitz, J.O., Liang, D., Plouffe, D., Chang, H.S., Zhu, T., Weigel, D., Berry, C.C., Winzeler, E. and Chory, J. (2003) Large-scale identification of single-feature polymorphisms in complex genomes. Genome Res. 13, 513–523.
Brem, R.B., Yvert, G., Clinton, R. and Kruglyak, L. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science, 296, 752–755.
Caldo, R.A., Nettleton, D. and Wise, R.P. (2004) Interaction-dependent gene expression in Mla-specified response to barley powdery mildew. Plant Cell, 16, 2514–2528.
Carroll, S.B. (2005) Evolution at two levels: on genes and form. PLoS Biol. 3(7), e245.
Churchill, G.A. and Doerge, R.W. (1994) Empirical threshold values for quantitative trait mapping. Genetics, 138, 963–971.
Druka, A., Muehlbauer, G., Druka, I. et al. (2006) An atlas of gene expression from seed to seed through barley development. Funct. Integr. Genomics, 6(3), 202–211.
Gibson, G. and Weir, B. (2005) The quantitative genetics of transcription. Trends Genet. 21, 616–623.
Hubner, N., Wallace, C.A., Zimdahl, H. et al. (2005) Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat. Genet. 37, 243–253.
Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U. and Speed, T.P. (2003) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 4(2), 249–264.
Iwamoto, S., Li, J., Sugimoto, N., Okuda, H. and Kajii, E. (1996) Characterization of the Duffy gene promoter: evidence for tissue-specific abolishment of expression in Fy(a-b-) of black individuals. Biochem. Biophys. Res. Commun. 222(3), 852–859.
Jansen, R.C. and Nap, J.P. (2001) Genetical genomics: the added value from segregation. Trends Genet. 17, 388–391.
Jordan, M., Somers, D.J. and Banks, T.W. (2007) Identifying regions of the wheat genome controlling seed development by mapping expression quantitative trait loci. Plant Biotechnol. J. 5, 442–453.
Keurentjes, J.J.B., Fu, J., Terpstra, I.R., Garcia, J.M., van den Ackerveken, G., Snoek, L.B., Peeters, A.J.M., Vreugdenhil, D., Koornneef, M. and Jansen, R.C. (2007) Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl Acad. Sci. USA, 104, 1708–1713.
Kleinhofs, A., Kilian, A., Saghai-Maroof, M.A. et al. (1993) A molecular, isozyme and morphological map of the barley genome. Theor. Appl. Genet. 86, 705–712.
Lewin, B. (1997) Genes. Oxford, New York, Tokyo: Oxford University Press.
Li, Y., Alvarez, O.A., Gutteling, E.W. et al. (2006) Mapping determinants of gene expression plasticity by genetical genomics in C. elegans. PLoS Genet. 2, e222.
Luo, Z.W., Potokina, E., Druka, A., Wise, R., Waugh, R. and Kearsey, M.J. (2007) Robust, high density genotyping from gene-expression data in species with un-sequenced genomes. Genetics, 176, 789–800.
Medawar, P.B. (1952) An Unsolved Problem in Biology. London: H.K. Lewis.
Nguyen, L., Delves, A., Bundock, P. and Holton, T.A. (2001) Cloning and characterisation of cytochrome P450 genes from barley (Hordeum vulgare). Proceedings of the 10th Australian Barley Technical Symposium, 16–20 September 2001, Canberra, ACT, Australia.
Novak, J.P., Sladek, R. and Hudson, T.J. (2002) Characterization of variability in large-scale gene expression data: implications for study design. Genomics, 79, 104–113.
Peiper, S.C., Wang, Z.X., Neote, K. et al. (1995) The Duffy antigen/receptor for chemokines (DARC) is expressed in endothelial cells of Duffy negative individuals who lack the erythrocyte receptor. J. Exp. Med. 181, 1311–1317.
Petretto, E., Mangion, J., Dickens, N.J. et al. (2006) Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet. 2(10), e172.
Potokina, E., Druka, A., Luo, Z., Wise, R., Waugh, R. and Kearsey, M.J. (2007) Gene expression quantitative trait locus analysis of 16 000 barley genes reveals a complex pattern of genome-wide transcriptional regulation. PMID: 17944808.
Ptashne, M. and Gann, A. (1997) Transcriptional activation by recruitment. Nature, 386, 569–577.
Rockman, M.V. and Kruglyak, L. (2006) Genetics of global gene expression. Nat. Rev. Genet. 7, 862–872.
Rostoks, N., Ramsay, L., MacKenzie, K. et al. (2006) Recent history of artificial outcrossing facilitates whole-genome association mapping in elite inbred crop varieties. Proc. Natl Acad. Sci. USA, 103, 18656–18661.
Ruse, C.E. and Parker, S.G. (2001) Molecular genetics and age-related disease. Age Ageing, 30, 449–454.
Schadt, E.E., Monks, S.A., Drake, T.A., Lusis, A.J., Che, N., Colinayo, V., Ruff, T.G., Milligan, S.B., Lamb, J.R., Cavet, G., Linsley, P.S., Mao, M., Stoughton, R.B. and Friend, S.H. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature, 422, 297–302.
Shen, L., Gong, J., Caldo, R.A., Nettleton, D., Cook, D., Wise, R.P. and Dickerson, J.A. (2005) BarleyBase-an expression profiling database for plant genomics. Nucleic Acids Res. 33, 614–618.
Stamatoyannopoulos, J.A. (2004) The genomics of gene expression. Genomics, 84, 449–457.
Stern, D.L. (2000) Evolutionary developmental biology and the problem of variation. Evolution, 54, 1079–1091.
West, M.A.L., Kim, K., Kliebenstein, D.J., van Leeuwen, H., Michelmore, R.W., Doerge, R.W. and St Clair, D.A. (2007) Global eQTL mapping reveals the complex genetic architecture of transcript level variation in Arabidopsis. Genetics, 175, 1441–1450.
Wray, G.A. (2007) The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet. 8(3), 206–216.
Yamashita, S., Wakazono, K., Nomoto, T., Tsujino, Y., Kuramoto, T. and Ushijima, T. (2005) Expression quantitative trait loci analysis of 13 genes in the rat prostate. Genetics, 171, 1231–1238.