Article

Inheritance of Gene Expression Level and Selective Constraints on Trans- and Cis-Regulatory Changes in Yeast

Bernhard Schaefke,†,1,2,3 J.J. Emerson,†,4,5 Tzi-Yuan Wang,2 Mei-Yeh Jade Lu,2 Li-Ching Hsieh,6,7 and Wen-Hsiung Li*,2,8

1 National Yang-Ming University, Taipei, Taiwan
2 Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
3 Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
4 Department of Ecology & Evolutionary Biology, University of California, Irvine
5 Center for Complex Biological Systems, University of California, Irvine
6 Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung, Taiwan
7 Biotechnology Center, National Chung Hsing University, Taichung, Taiwan
8 Department of Ecology and Evolution, University of Chicago
† These authors contributed equally to this work.
* Corresponding author: E-mail: whli@uchicago.edu.

Associate Editor: Jianzhi Zhang

© The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Mol. Biol. Evol. 30(9):2121–2133 doi:10.1093/molbev/mst114 Advance Access publication June 22, 2013

Downloaded from http://mbe.oxfordjournals.org/ at Academia Sinica, Life Science Library on August 22, 2013

Abstract

Gene expression evolution can be caused by changes in cis- or trans-regulatory elements or both. As cis and trans regulation operate through different molecular mechanisms, cis and trans mutations may show different inheritance patterns and may be subjected to different selective constraints. To investigate these issues, we obtained and analyzed gene expression data from two Saccharomyces cerevisiae strains and their hybrid, using high-throughput sequencing. Our data indicate that compared with other types of genes, those with antagonistic cis–trans interactions are more likely to exhibit over- or underdominant inheritance of expression level. Moreover, in accordance with previous studies, genes with trans variants tend to have a dominant inheritance pattern, whereas cis variants are enriched for additive inheritance. In addition, cis regulatory differences contribute more to expression differences between species than within species, whereas trans regulatory differences show a stronger association between divergence and polymorphism. Our data indicate that in the trans component of gene expression differences, genes subjected to weaker selective constraints tend to have an excess of polymorphism over divergence compared with those subjected to stronger selective constraints. In contrast, in the cis component, this difference between genes under stronger and weaker selective constraint is mostly absent. To explain these observations, we propose that purifying selection more strongly shapes trans changes than cis changes and that positive selection may have significantly contributed to cis regulatory divergence.

Key words: gene regulation, expression evolution, cis effect, trans effect, functional constraint, natural selection.

Introduction

Phenotypic variation within or between species can be caused by differences either in protein sequences or in the abundance level or timing of gene expression. Divergence in gene expression has been proposed to be a major factor in the evolution of phenotypic differences between closely related species (Ohno 1972; King and Wilson 1975). Identifying the genetic changes underlying such expression differences is of great importance for understanding the evolution of gene regulation and its role in phenotypic evolution and speciation.

The genetic causes of gene expression changes can be classified into two categories: changes in cis-acting elements (e.g., promoters and enhancers), which are on the same chromosome as the gene they affect, and changes in trans-acting factors (e.g., transcription factors and chromatin modifiers), which are diffusible and can influence the expression of genes on other chromosomes. The way in which gene expression is changed can affect its inheritance pattern and evolution (Ronald and Akey 2007). Thus, it is important to distinguish between these two types of change to understand the causes of intraspecific variation and interspecific divergence in gene expression.

Two complementary methods exist for uncovering the regulatory basis of gene expression differences: expression quantitative trait loci (eQTL) mapping and hybrid experiments. Although eQTL mapping can be used to locate the elements responsible for expression variation and to differentiate between local and distant regulators, it cannot reliably distinguish between cis- and trans-acting elements, as a trans regulator can also be located on the same chromosome in close proximity to the target gene, and a cis regulatory element (e.g., an enhancer) can be distantly located from it (Rockman and Kruglyak 2006; Emerson and Li 2010). The first eQTL study on a genomic scale was conducted in Saccharomyces cerevisiae (Brem et al. 2002), and the method was also successfully applied to investigate the roles of distant-acting variation (Yvert et al. 2003) and of epistatic interactions (Brem and Kruglyak 2005; Brem et al. 2005) in determining different gene expression phenotypes. In contrast, hybrid experiments can be used to distinguish between cis- and trans-regulatory changes contributing to differences in gene expression. An additional advantage of the hybrid approach used in this study is that only the two parental strains and their F1 hybrid have to be assayed. Hybrid experiments require the measurement of mRNA levels in homo- or hemizygous parental strains and of allele-specific expression (ASE) in their cross. As the same set of diffusible elements acts on both parental alleles in the hybrid, trans variation produces no differential effect on the two alleles, so that ASE differences in the F1 hybrid can be interpreted as a direct representation of cis-regulatory variation (Cowles et al. 2002; Wittkopp et al. 2004). Several studies have used this approach to investigate the relative importance of cis and trans variation for the expression response in yeast to different environmental conditions (Tirosh et al. 2009; Li et al. 2012) or in the evolution of specific pathways (Chang et al. 2008). Furthermore, it has been successfully applied to investigate the role of cis and trans effects in nucleosome positioning (Tirosh et al. 2010), in protein expression (Khan et al. 2012), and in the regulation of DNA replication timing in yeast (Muller and Nieduszynski 2012).

Past research has unraveled the genetic causes of gene expression differences between various Saccharomyces strains and species and their relevance for phenotypic evolution. In a recent study, several transcription factors were identified whose expression profiles differ between wine strains, resulting in the production of volatile aroma compounds in a predictable manner (Rossouw et al. 2012). Another study identified a subset of genes for which the expression divergence between two yeast strains correlates well with changes in predicted transcription factor binding sites (TFBSs) (Chen et al. 2010). The Barkai lab compared the expression of genes induced during mating among three different Saccharomyces species and found that divergence in TFBSs of the transcription factor STE12 could explain about half of the expression differences, although they found no general correlation between promoter sequence divergence and gene expression evolution in yeasts and mammals (Tirosh et al. 2008). The same group found a positive correlation between divergence in promoter sequence and the cis component of gene expression differences in a comparison of two Saccharomyces species and their hybrid (Tirosh et al. 2009).

Several properties of the promoter region play an important role in the evolution of gene regulation, especially in trans. Genes with a significant trans effect in a comparison between S. cerevisiae and S. paradoxus lack a pronounced nucleosome-free region, tend to contain a TATA box in their promoter region, and consistently display larger expression differences between different yeast strains or species (Tirosh et al. 2009). In addition, the sensitivity of expression levels to genetic perturbations in mutation accumulation lines is enhanced for TATA box-containing genes (Landry et al. 2007). The relationship between nucleosome occupancy of the promoter region and gene expression evolution is complex. Yeast genes can be roughly classified into two groups: those with a promoter containing a well-defined nucleosome-free region close to the transcription start site, referred to as DPN (depleted proximal-nucleosome) genes, and those with a promoter lacking such a region, referred to as OPN (occupied proximal-nucleosome) genes (Tirosh and Barkai 2008). The latter group exhibits more expression plasticity between different environmental conditions and more cell-to-cell variability (Tirosh and Barkai 2008). Although some studies found that changes in nucleosome occupancy are related to divergence in gene expression (Field et al. 2009; Tsankov et al. 2010), a comparison of S. cerevisiae and its closest relative S. paradoxus found no relationship between divergence of gene expression and divergence of nucleosome positioning (Tirosh et al. 2010). In contrast, a recent experimental evolution study selecting yeast strains for overexpression of a target gene found different evolutionary mechanisms for the two classes: DPN genes have a stronger tendency to be duplicated (which could be considered a cis-acting mutation) than OPN genes, which have predominantly undergone trans-regulatory changes (Rosin et al. 2012).

Several studies have shown a relationship between the mode of inheritance of gene expression levels and the molecular mechanisms of gene regulatory differences. Alleles conferring cis regulatory variation tend to have an additive influence on gene expression level, with the expression level in the hybrid being intermediate between those of the two parents (Lemos et al. 2008; McManus et al. 2010). This is hypothesized to contribute to positive selection on cis-regulatory elements over long evolutionary time (Lemos et al. 2008), because the expression levels of single genes can be “fine-tuned,” so that a gradual adaptation to changing selective pressures can take place. On the other hand, genes with antagonistic cis–trans interactions have been found to be enriched for over- or underdominant inheritance of gene expression level (with the mRNA level in the hybrid being either higher or lower than those in both parents) and could play a role in the development of hybrid incompatibilities (Landry et al. 2005; McManus et al. 2010). It has been proposed that even when the expression level of a gene is under stabilizing selection, its regulatory elements may undergo divergent evolution between species if mutations in cis (trans) are balanced by compensatory mutations in trans (cis) (Landry et al. 2005).

Different inheritance patterns of changes in cis and trans and the potentially pleiotropic nature of trans mutations are likely to result in different evolutionary constraints. Changes in trans regulators can impact the expression of multiple downstream genes and can thus be expected to affect multiple phenotypic traits more often than cis regulatory changes. Indeed, a study of mutation accumulation and natural isolate lines in Caenorhabditis elegans found that most trans-acting mutations that resulted in expression changes of multiple genes were quickly removed by selection in natural populations (Denver et al. 2005). Furthermore, differences in cis regulatory elements appear to play a larger role in expression differences between species than within species (Wittkopp et al. 2008a; Emerson et al. 2010). Additionally, genes which show a significant gene expression difference in trans between two different S. cerevisiae strains also tend to exhibit more gene expression divergence in trans between S. cerevisiae and S. paradoxus, whereas in cis this trend is weak or absent (Emerson et al. 2010).

These findings could be explained by stronger positive selection on cis divergence and stronger selective constraint on trans-acting factors. As selective constraint is expected to affect essential genes more strongly than nonessential genes, its impact on gene regulatory evolution in cis and trans can be evaluated by comparing the cis and trans components of within-species and between-species gene expression differences for genes of higher and lower importance.

In this study, we investigate the relationship between gene expression inheritance patterns and regulatory differences in cis and trans between two strains of Saccharomyces cerevisiae, RM11-1 (RM) and BY4741 (BY). Our results indicate that genes with antagonistic cis–trans interactions are more likely to show an under- or overdominant inheritance pattern in our within-species hybrids, whereas essential genes are less likely to exhibit an underdominant inheritance pattern.

In addition, we integrate the data from an interspecies comparison (Tirosh et al. 2009) with our data and evaluate the role of selective constraint on changes in cis and trans factors. We show that trans regulatory mutations indeed tend to be under stronger selective constraint than cis regulatory mutations and that this observation may explain the relative contributions of cis and trans changes to intra- and interspecific gene expression differences.

Results

Transcriptome Sequencing and Expression Level Estimation

We selected 4442 genes for our study (Methods; Emerson et al. 2010) and estimated their expression levels using Illumina paired-end (PE) sequencing with a read length of 151 base pairs (bp). In the hybrid sample, 4,558,258 reads were mapped specifically to the BY genome and 4,564,464 reads to the RM genome. In the coculture sample, 3,745,275 reads were mapped as BY specific and 3,776,660 reads as RM specific. This new data set enabled us to analyze ASE differences with greater power than the expression data obtained in a previous study (Emerson et al. 2010). In this study, 4237 of the 4442 selected genes have more than 10 sequence reads for both alleles in both experiments (coculture and hybrid) and were used for further analyses. Among these 4237 genes, 2268 genes (53.5%) show a significant expression polymorphism in coculture and 1207 (28.5%) show a significant ASE difference in the hybrid (binomial exact test, false discovery rate [FDR] < 5%; see Materials and Methods). These numbers are about twice the corresponding numbers in Emerson et al. (2010), in which 1294 (35.1%) and 488 (13.2%) of the 3685 genes under study showed significant ASE differences in coculture and hybrid, respectively, using the same criteria for statistical significance. Thus, the new data set allows us to perform more rigorous statistical analyses.
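The significance test described above (a binomial exact test on allele-specific read counts, followed by FDR control) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the gene names and read counts are hypothetical, the null hypothesis is taken to be equal expression of the two alleles (p = 0.5), and the Benjamini-Hochberg procedure is assumed for the FDR step because this section does not name the procedure used.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p=0.5):
    """Binomial probability of k successes in n trials, computed in log space
    so that large read counts do not overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def binom_exact_two_sided(k, n, p=0.5):
    """Two-sided exact P value: total probability of all outcomes that are
    no more likely than the observed one."""
    observed = binom_pmf(k, n, p)
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1))
                if q <= observed * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values (assumed FDR procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest to smallest P value
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (RM reads, BY reads) per gene in the hybrid sample.
counts = {"geneA": (60, 40), "geneB": (520, 300), "geneC": (12, 11)}

# Keep genes with more than 10 reads for both alleles, as in the text.
kept = {g: c for g, c in counts.items() if min(c) > 10}
pvals = [binom_exact_two_sided(rm, rm + by) for rm, by in kept.values()]
for gene, q in zip(kept, benjamini_hochberg(pvals)):
    print(gene, "significant ASE difference" if q < 0.05 else "not significant")
```

In practice the same test would be applied separately to the coculture and hybrid samples.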

Classifying Gene Expression Differences in Terms of Cis and Trans Effects

As discussed earlier, a cis-regulatory factor influences the expression level of only the allele on the same chromosome, whereas a trans-regulatory factor can affect the expression of both alleles in a cell. Therefore, it is possible to estimate the relative contributions of cis- and trans-regulatory changes to differences in gene expression between the RM and the BY strain by comparing the ASE in the hybrid to expression differences between the two parental strains (Wittkopp et al. 2004). We assume that there is no allele-specific preferential binding of the maternal or paternal transcription factor (Takahasi et al. 2011) and that the expression of an allele is independent of the other, that is, there is no transvection. Therefore, the expression differences between the two parental alleles in the hybrid are interpreted as a direct representation of cis-regulatory differences (Cowles et al. 2002) (see examples in supplementary figs. S1 and S2, Supplementary Material online), because in the same cell the trans-regulatory milieu is identical for the two alleles. The expression difference between the two parental strains in coculture is thus interpreted as a combination of cis and trans effects (see examples in supplementary figs. S1 and S2, Supplementary Material online).

In agreement with previous studies (Wittkopp et al. 2008b; Emerson et al. 2010), we found that trans effects dominate in our within-species comparison: 1577 (37.3%) of the 4237 genes under study show a significant trans effect, whereas only 1267 (30%) show a significant cis effect (significance was determined using the likelihood ratio test, FDR < 5%; see Materials and Methods). The median absolute trans effect (0.301) is significantly higher than the median cis component (0.166) (Wilcoxon rank sum test, P value < 2.2 × 10^-16). Although genes with significant expression differences in coculture or in the hybrid showed a higher single nucleotide polymorphism (SNP) density than nondifferentially expressed genes (Wilcoxon rank sum test, P value < 2.2 × 10^-16), we found no difference between cis or trans regulatory changes related to gene SNP density (supplementary table S1, Supplementary Material online). Differentially expressed genes also showed a significantly higher sequence divergence in the promoter region (defined as 500 bp upstream of the transcription start site) (supplementary table S2, Supplementary Material online). This trend is significant for both the cis and the trans effect but stronger for the cis component of expression differences, as can be expected (Wilcoxon rank sum test, cis: P value = 7.79 × 10^-12, trans: P value = 1.497 × 10^-5). Similarly, genes whose promoter region contains a TATA box (Basehoar et al. 2004) are more likely to be differentially expressed than those without a TATA box (supplementary table S3, Supplementary Material online). Genes without a well-defined nucleosome-free region close to the transcription start site (i.e., OPN genes) are more likely to be differentially expressed than those with such a region (i.e., DPN genes) (the gene sets were defined in Tirosh et al. 2008). This relationship is observed in trans and in cis (supplementary table S4, Supplementary Material online) but is stronger for the trans effect (Fisher's exact test, trans: P value = 3.993 × 10^-5, cis: P value = 0.0027).

We tested whether genes with significant and consistent cis or trans effects (in the old and the new data set) were enriched in specific biological processes or cellular components in the Gene Ontology (GO) annotation using the FunSpec analysis tool (Robinson et al. 2002). Genes with a significant trans component were enriched in mitochondrial electron transport (GO term “mitochondrial electron transport, ubiquinol to cytochrome c” [GO identifier: 0006122], P value 1.09 × 10^-7, and overlapping terms) and in the biosynthesis of ergosterol (GO term “ergosterol biosynthetic process” [GO identifier: 0006696], P value 1.8 × 10^-7, and overlapping terms), a major component of the fungal cell membrane. Genes with a significant cis effect were enriched in oxidation/reduction among biological processes (GO term “oxidation–reduction process,” GO identifier: 0055114, P value 9.45 × 10^-7, and overlapping terms) and in the cell wall among cellular components (GO term “cell wall,” GO identifier: 0005618, P value 2.19 × 10^-9, and related terms). This is consistent with the previous finding of an enrichment for cell wall related genes among those with local regulatory differences between the RM and BY strains (Chen et al. 2010).

To estimate the importance of transcription factors in trans regulatory evolution versus changes in sensory and signaling molecules or chromatin modifiers, we compared gene pairs which either share a common regulator (Teixeira et al. 2006) but belong to different expression modules (Ihmels et al. 2002) or which belong to the same module(s) but are not known to be regulated by an identical transcription factor. We did not find any significant difference between these two sets of genes regarding the probability of both genes in a pair having expression differences in trans in the same direction, that is, favoring the allele from the same strain (either BY or RM) for both genes (supplementary table S5, Supplementary Material online).
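The decomposition described at the start of this section can be illustrated with a short sketch: the log2 allelic ratio in the hybrid is taken as the cis component, and the trans component is what remains of the parental (coculture) log2 ratio after the cis component is subtracted. The function name and the read counts below are hypothetical, and the sketch assumes the counts are already comparable between alleles (no library-size normalization is shown).

```python
from math import log2


def cis_trans_components(rm_hy, by_hy, rm_co, by_co):
    """Return (cis, trans) on the log2 scale.

    rm_hy, by_hy: allele-specific expression of the RM and BY alleles in the hybrid.
    rm_co, by_co: expression of the RM and BY strains in coculture.
    """
    cis = log2(rm_hy / by_hy)        # hybrid allelic ratio reflects cis differences
    parental = log2(rm_co / by_co)   # parental ratio reflects cis plus trans
    trans = parental - cis           # remainder attributed to trans differences
    return cis, trans


# Hypothetical gene: RM allele twofold higher in the hybrid, fourfold higher in coculture.
cis, trans = cis_trans_components(rm_hy=200, by_hy=100, rm_co=400, by_co=100)
print(f"cis = {cis:.2f}, trans = {trans:.2f}")
```

A positive value favors the RM allele and a negative value the BY allele; cis and trans values of opposite sign correspond to the antagonistic case discussed in the text.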

The genes under study were classified into five categories as in McManus et al. (2010), but with some different category names, as follows:

1) Nondifferential: No significant expression difference between the RM and the BY allele in coculture or hybrid. It is the same as the “conserved” category in McManus et al. (2010).

2) Cis only: A significant cis component but no significant trans difference.

3) Trans only: A significant trans component but no significant cis difference.

4) Cis + Trans: The cis and trans components are both significant and work in the same direction (supplementary fig. S1, Supplementary Material online).

5) Cis – Trans: The cis and trans components are both significant but have opposite effects. It can be divided into three subcategories according to the relative magnitudes of the cis and trans components (supplementary fig. S2, Supplementary Material online):

   a) “Cis – Trans (t > c)” (i.e., “cis – trans” with a greater absolute trans effect): The log2 expression ratios in coculture and in the hybrid have different signs (the allele which is more highly expressed in the hybrid has lower expression levels in the parental comparison); it is equivalent to the “cis × trans” category in McManus et al. (2010).

   b) “Cis – Trans (c = t)”: The cis and the trans component work in opposite directions and have approximately the same absolute value; no significant expression difference between the two alleles in the parental strains; it is equivalent to the “compensatory” category in McManus et al. (2010).

   c) “Cis – Trans (c > t)” (i.e., “cis – trans” with a greater absolute cis effect): The cis and trans components have opposite signs, but the log2 expression ratios in hybrid and in coculture have the same sign (i.e., the same allele is favored in coculture and hybrid, but the absolute expression ratio in hybrid is greater than that in coculture); it was assigned to the “cis + trans” category by McManus et al. (2010).
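The category definitions above can be written as a small decision function. This is only a sketch: the function and argument names are hypothetical, the significance flags are assumed to come from tests performed elsewhere (for example, the likelihood ratio test mentioned earlier), and a gene with neither component significant is treated as nondifferential, which simplifies the first definition.

```python
def classify_regulatory_category(cis, trans, cis_sig, trans_sig, parental_sig):
    """Assign a gene to one of the cis/trans categories listed above.

    cis, trans: log2 cis and trans components (parental log2 ratio = cis + trans).
    cis_sig, trans_sig: whether each component is statistically significant.
    parental_sig: whether the coculture (parental) difference is significant.
    """
    if not cis_sig and not trans_sig:
        return "nondifferential"          # simplification, see lead-in
    if cis_sig and not trans_sig:
        return "cis only"
    if trans_sig and not cis_sig:
        return "trans only"
    if cis * trans > 0:
        return "cis + trans"              # both significant, same direction
    # Both significant with opposite signs: split by relative magnitude.
    if not parental_sig:
        return "cis - trans (c = t)"      # effects cancel in the parents
    if abs(trans) > abs(cis):
        return "cis - trans (t > c)"      # parental and hybrid ratios differ in sign
    return "cis - trans (c > t)"          # same sign, larger ratio in the hybrid


print(classify_regulatory_category(1.0, -2.0, True, True, True))
```

Because the parental log2 ratio equals cis + trans, comparing the absolute values of the two components is equivalent to comparing the signs of the coculture and hybrid ratios, as the subcategory definitions do.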

Among the 4237 genes under study, 2077 (49%) showed no significant expression difference between the RM and BY alleles in hybrid or in coculture and were classified as nondifferential. Among the 2160 (51%) “differentially expressed” genes, 583 genes (13.8%) were classified as “cis only,” whereas 893 (21.1%) were classified as “trans only” (fig. 1 and table 1). The group “cis + trans” comprises only 172 genes (4.1%). The total number of genes falling into the “cis – trans” category is 512 (12.1%). Among these, 234 genes (5.5%) have cis and trans effects of approximately equal magnitude and were classified as “cis – trans (c = t).” Only 71 genes (1.7%) in the “cis – trans” category have a larger cis component and were classified as “cis – trans (c > t).” In contrast, 207 genes (4.9%) fall into the “cis – trans (t > c)” category, having a stronger trans effect than cis effect. These observations show the overall prevalence of trans regulatory changes in our within-species comparison.

Inheritance Mode of Gene Expression Level Versus ASE Differences in Cis and Trans

To study the mode of inheritance, the expression levels of the hybrid and the parental strains were compared for each gene in three comparisons: 1) the expression of the gene in the parental BY strain (“BY”) versus in the parental RM strain (“RM”) in coculture, 2) the expression of the gene in BY versus the total expression level in the hybrid, and 3) the expression of the gene in RM versus the total expression level in the hybrid. A gene was classified as conserved if the expression difference in each of the three comparisons was not statistically significant or was less than 25%. This category comprised 43.7% of the genes (1852/4237). The other 2385 genes (56.3%) were nonconserved and assigned to one of the categories “additive,” “BY dominant,” “RM dominant,” “overdominant,” and “underdominant” (fig. 2). The “additive” category comprised 434 genes (10.2%). Interestingly, 1048 genes (24.7%) were classified as “RM dominant,” but only approximately half as many (445 genes, 10.5%) were classified as “BY dominant.” In total, 458 genes (10.8%) were misexpressed (overdominant or underdominant) in the hybrid: the underdominant expression pattern was found in 294 genes (6.9%) and the overdominant pattern in 164 genes (3.9%).
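One way to turn the three comparisons into an inheritance category is sketched below. Only the “conserved” rule is stated explicitly in this section; the rules for the other categories are assumptions based on the descriptions in the Introduction (hybrid intermediate between the parents for additive, hybrid above or below both parents for over- and underdominant) and on the usual reading that “RM dominant” means the hybrid resembles RM. The 25% threshold is interpreted here as a 1.25-fold ratio, and the “unresolved” label is a hypothetical catch-all that does not appear in the text.

```python
def differs(a, b, significant, min_ratio=1.25):
    """True if two positive expression levels differ significantly and by at
    least 25% (interpreted as a 1.25-fold ratio, an assumption)."""
    return significant and max(a, b) / min(a, b) >= min_ratio


def inheritance_mode(by, rm, hybrid, sig_by_rm, sig_by_hy, sig_rm_hy):
    """Classify inheritance of expression level from three pairwise comparisons."""
    parents_differ = differs(by, rm, sig_by_rm)
    hy_vs_by = differs(by, hybrid, sig_by_hy)
    hy_vs_rm = differs(rm, hybrid, sig_rm_hy)

    if not (parents_differ or hy_vs_by or hy_vs_rm):
        return "conserved"
    if hy_vs_by and hy_vs_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"                 # hybrid lies between the two parents
    if hy_vs_by:
        return "RM dominant"              # hybrid differs from BY, resembles RM
    if hy_vs_rm:
        return "BY dominant"              # hybrid differs from RM, resembles BY
    return "unresolved"                   # only the parents differ; not covered by the text


print(inheritance_mode(by=100, rm=200, hybrid=150,
                       sig_by_rm=True, sig_by_hy=True, sig_rm_hy=True))
```

The significance flags would come from per-gene tests on the read counts, and expression levels are assumed to be positive and normalized to a common scale.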

To investigate how the molecular mechanism of gene expression differences influences the inheritance mode of the expression level, we examined whether an inheritance mode is enriched for genes belonging to a specific expression divergence pattern (table 1). Consistent with previous studies (Lemos et al. 2008; McManus et al. 2010), we found a weak but significant relationship between cis regulation and additive inheritance. The median percent cis for genes with additive inheritance (39.84%) was significantly higher than for those with the other inheritance modes (37.46%) (Wilcoxon rank sum test, P value = 0.0014).

Additionally, in agreement with previous findings (Lemos et al. 2008), genes with dominant inheritance (either RM or BY dominant) showed a strong enrichment for trans regulatory variation. The median percent trans was significantly higher for genes with dominant inheritance (68.83%) than for the other genes (59.34%) (Wilcoxon rank sum test, P value < 1.4 × 10^-14).

Furthermore, we investigated whether genes in the “cis – trans” category disproportionately contributed to misexpression in our within-species cross, as previously described for between-species hybrids (Landry et al. 2005; McManus et al. 2010). Indeed, we found an enrichment for misexpressed genes in the “cis – trans” category (Fisher's exact test, P value < 5 × 10^-9; table 2). This relationship remains significant even when both “conserved” and “nondifferential” genes are removed from the analysis (table 2).
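As an illustration of this kind of enrichment test, the sketch below applies a one-sided Fisher's exact test to a 2 × 2 table built from the counts in table 1 (misexpressed versus other genes, inside versus outside the “cis – trans” category). The construction of the 2 × 2 table and the choice of a one-sided test are assumptions; table 2 of the paper may be laid out differently and may report a two-sided P value. The function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]:
    probability of observing a or more in the top-left cell under the
    hypergeometric null with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, min(row1, col1) + 1))
    return tail / comb(n, row1)


# Counts derived from table 1 (an assumed layout):
mis_in = 60 + 37            # over- plus underdominant genes in "cis - trans"
other_in = 512 - mis_in     # remaining "cis - trans" genes
mis_out = 458 - mis_in      # misexpressed genes in all other categories
other_out = 4237 - 512 - mis_out

p_value = fisher_exact_greater(mis_in, other_in, mis_out, other_out)
odds_ratio = (mis_in * other_out) / (other_in * mis_out)
print(f"odds ratio = {odds_ratio:.2f}, one-sided P value = {p_value:.2e}")
```

Exact integer arithmetic is used for the binomial coefficients, so the tail probability is not affected by floating-point underflow even for tables of this size.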

Different Constraints on Cis and Trans RegulatoryComponents

We divided genes into different classes expected to be underrelatively weak or strong selective constraint using three cri-teria 1) the ratio of the rate of nonsynonymous substitutionto the rate of synonymous substitution () genes with an higher than the median value (~009) were classified as lessconserved and those with a lower as more conserved2) connectivity in proteinndashprotein interaction (PPI) networks(Stark et al 2006 Collins et al 2007) genes with more than

[Figure 1. Panel (a): scatterplot with the trans component on the X axis and the cis component on the Y axis; legend: Nondifferential, Cis Only, Trans Only, Cis + Trans, Cis − Trans. Panel (b): bar graph of the number of genes per category: Nondifferential, 2077 (49.02%); Cis Only, 583 (13.76%); Trans Only, 893 (21.08%); Cis + Trans, 172 (4.06%); Cis − Trans, 512 (12.08%).]

FIG. 1. Classification of genes according to cis or trans effects. (a) Scatterplot. Y axis: the cis component [the log2-ratio of reads in the hybrid sample mapped to the RM and BY genomes, $\log_2(e_{cis}) = \log_2(e_{Hy}) = \log_2(RM_{Hy}/BY_{Hy})$]; X axis: the trans component [difference between parental and hybrid log2-transformed ASE ratios, $\log_2(e_{trans}) = \log_2(e_{Co}/e_{cis}) = \log_2(RM_{Co}/BY_{Co}) - \log_2(RM_{Hy}/BY_{Hy})$]. Notations: $RM_{Hy}$, expression level of the RM allele in the hybrid; $BY_{Hy}$, expression level of the BY allele in the hybrid; $e_{Hy}$, ASE ratio in the hybrid; $RM_{Co}$, expression level of the RM allele in the coculture; $BY_{Co}$, expression level of the BY allele in the coculture; and $e_{Co}$, ASE ratio in the coculture. (b) The bar graph shows the number of genes in each cis/trans category.

Table 1. Number of Genes Falling into Different Combinations of Inheritance and cis/trans Categories.

| Inheritance mode | Nondifferential | Trans Only | Cis Only | Cis + Trans | Cis − Trans | Sum |
|---|---|---|---|---|---|---|
| Conserved | 1265 | 199 | 165 | 6 | 217 | 1852 |
| RM dominant | 371 | 376 | 149 | 66 | 86 | 1048 |
| BY dominant | 147 | 116 | 85 | 24 | 73 | 445 |
| Additive | 53 | 134 | 141 | 67 | 39 | 434 |
| Overdominant | 69 | 11 | 23 | 1 | 60 | 164 |
| Underdominant | 172 | 57 | 20 | 8 | 37 | 294 |
| Sum | 2077 | 893 | 583 | 172 | 512 | 4237 |

Selective Constraints on Gene Regulatory Changes. doi:10.1093/molbev/mst114 MBE

the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners were classified as less constrained; and 3) essentiality: essential genes versus genes with a fitness > 0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio p/(p + d) [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4) (the divergence data were obtained from Tirosh et al. 2009). The ratios $p_{cis}/(p_{cis} + d_{cis})$ and $p_{trans}/(p_{trans} + d_{trans})$ were then each compared between categories with expected strong selective constraints or with expected weak selective constraints for each of the three criteria of selective constraints (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of p/(p + d) in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.37).
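The ratio p/(p + d) defined above is simple enough to state as code. The following Python sketch is illustrative only: the function name is hypothetical, and returning `None` when both components are zero is an assumption, since the text does not say how such genes were handled.

```python
def polymorphism_ratio(p_component, d_component):
    """Return |p| / (|p| + |d|) for one gene and one regulatory component.

    p_component: within-species (polymorphism) cis or trans component.
    d_component: between-species (divergence) value of the same component.
    Returns None when both are zero, because the ratio is then undefined
    (an assumption of this sketch, not a rule stated in the paper).
    """
    p, d = abs(p_component), abs(d_component)
    if p + d == 0:
        return None
    return p / (p + d)
```

Values close to 1 indicate that a gene's regulatory difference is mostly polymorphism, and values close to 0 that it is mostly divergence. For instance, a gene with a within-species trans component of −1.0 and a between-species trans component of 1.0 has a ratio of 0.5.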

Misexpressed inheritance modes are slightly under-represented among essential genes for our within-species data; this difference is not statistically significant in comparison with all other inheritance categories (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.35). However, misexpressed genes are significantly less likely to be essential if the comparison is restricted only to genes with a conserved total expression level (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.022). This tendency for misexpressed genes to be less essential appears to be largely due to an enrichment of nonessential genes among those with underdominant inheritance. Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (table 3; Fisher's exact test, P value < $9.763 \times 10^{-9}$) and also in comparison with overdominant genes only (table 3; Fisher's exact test, P value = $1.14 \times 10^{-13}$).

As genes in the "cis − trans" category are more likely to be misexpressed (table 2), we test whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the "cis − trans" category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher's exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (table 4; P value = 0.0003). In contrast, compared with "cis + trans" genes only, this enrichment for nonessential genes is not

[Figure 2. Panel (a): scatterplot with log2(Hybrid) − log2(BY) on the X axis and log2(Hybrid) − log2(RM) on the Y axis; legend: Conserved, RM dominant, BY dominant, Additive, Overdominant, Underdominant. Panel (b): bar graph of the number of genes per inheritance category: Conserved, 1852 (43.71%); RM dominant, 1048 (24.73%); BY dominant, 445 (10.5%); Additive, 434 (10.24%); Overdominant, 164 (3.87%); Underdominant, 294 (6.94%).]

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the X axis and RM on the Y axis). (b) The bar graph shows the number of genes in each inheritance category.

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the "Cis − Trans" Category.

| Comparison | Regulatory effect | Misexpressed | Other |
|---|---|---|---|
| All genes (a) | Cis − trans | 97 | 415 |
| All genes (a) | Other categories | 361 | 3364 |
| Excluding "conserved" and "nondifferential" genes (b) | Cis − trans | 97 | 198 |
| Excluding "conserved" and "nondifferential" genes (b) | Other categories | 120 | 1158 |

(a) Misexpressed genes are enriched for genes in the "cis − trans" category: Fisher's exact test, P value = $4.99 \times 10^{-9}$.
(b) Indicates that both the "conserved" and "nondifferential" genes are removed from the analysis: Fisher's exact test, P value < $2.2 \times 10^{-16}$.


significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1011/2078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1011/1293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes which were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

[Figure 3. Nine density plots; X axis: trans p/(p + d), from 0.0 to 1.0; Y axis: Density. Panels: (a) Low ω; (b) High ω; (c) All genes; (d) Essential genes; (e) Nonessential genes; (f) Essential and nonessential genes combined; (g) High number of interaction partners; (h) No known interaction partners in PPI network; (i) Both groups combined.]

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, trans polymorphism; d, trans divergence.


[Figure 4. Nine density plots; X axis: cis p/(p + d), from 0.0 to 1.0; Y axis: Density. Panels: (a) Low ω; (b) High ω; (c) All genes; (d) Essential genes; (e) Nonessential genes; (f) Essential and nonessential genes combined; (g) High number of interaction partners; (h) No known interaction partners in PPI network; (i) Both groups combined.]

FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

| Essentiality | Underdominant only | All other inheritance categories: total | All other inheritance categories: overdominant | All other inheritance categories: nonmisexpressed |
|---|---|---|---|---|
| Essential | 24 (a, b) | 831 (a) | 59 (b) | 772 |
| Nonessential (fitness > 0.85) | 243 (a, b) | 2783 (a) | 86 (b) | 2697 |

(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories: Fisher's exact test, P value < $9.763 \times 10^{-9}$.
(b) Essential genes are significantly less likely to be underdominant than overdominant: Fisher's exact test, P value = $1.14 \times 10^{-13}$.

Table 4. Enrichment of the "Cis − Trans" Category in Nonessential Genes.

| Essentiality | Within species (RM vs. BY): Cis − Trans | Within species (RM vs. BY): Other Categories (Cis + Trans) | Between species (S. cerevisiae vs. S. paradoxus): Cis − Trans | Between species (S. cerevisiae vs. S. paradoxus): Other Categories (Cis + Trans) |
|---|---|---|---|---|
| Essential | 91 | 764 (34) | 166 | 689 (182) |
| Nonessential (fitness > 0.85) | 392 | 2634 (123) | 766 | 2260 (776) |

NOTE. Cis–trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher's exact test, P value = 0.078; between species: Fisher's exact test, P value = 0.0003). But there is no significant difference between the "cis − trans" and "cis + trans" (numbers in brackets) categories in the proportion of essential genes (within species: Fisher's exact test, P value = 0.49; between species: Fisher's exact test, P value = 0.51).


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence and 52% (1174) of the 2263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, while those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of "BY dominant" genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.'s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability to be segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in "natural" S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci and in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be under-represented in the "cis − trans" category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the "cis − trans" category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either "cis − trans" or "cis + trans." The fact that the "cis + trans" category is not significantly different from the "cis − trans" category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as "BY," is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as "RM," is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned by AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions: one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned by AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid experiments) was loaded in one lane of PE sequencing on Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP ("Allele-Specific Alignment Pipeline") was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project "Saccharomyces Genome Database" (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \cdot e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ = expression level of the RM allele in the coculture, $BY_{Co}$ = expression level of the BY allele in the coculture, and $e_{Co}$ = ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1; CRAN) with the methodology described by McManus et al. (2010).
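The decomposition above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code. The parameter `gdna_rm_over_by` is a hypothetical stand-in for the gDNA-based normalization parameter, whose exact form is not given here. The rule that significant cis and trans effects in the same direction are "cis + trans" and in opposite directions are "cis − trans" is likewise an assumption based on the category names; the significance flags would come from the likelihood ratio tests, which are not implemented in this sketch.

```python
import math


def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_rm_over_by=1.0):
    """Return (log2 e_cis, log2 e_trans) from allele-specific read counts.

    rm_hy, by_hy: reads assigned to the RM and BY alleles in the hybrid.
    rm_co, by_co: reads assigned to RM and BY in the coculture.
    gdna_rm_over_by: hypothetical RM/BY cell-number ratio in the coculture,
        estimated from genomic DNA reads (1.0 means equal cell numbers).
    All counts are assumed to be positive.
    """
    log2_cis = math.log2(rm_hy / by_hy)                     # e_cis = e_Hy
    log2_co = math.log2((rm_co / by_co) / gdna_rm_over_by)  # normalized e_Co
    log2_trans = log2_co - log2_cis                         # e_trans = e_Co / e_cis
    return log2_cis, log2_trans


def regulatory_category(log2_cis, log2_trans, cis_significant, trans_significant):
    """Assign one of five categories (simplified, assumed rules)."""
    if cis_significant and trans_significant:
        same_direction = (log2_cis > 0) == (log2_trans > 0)
        return "cis + trans" if same_direction else "cis - trans"
    if cis_significant:
        return "cis only"
    if trans_significant:
        return "trans only"
    return "nondifferential"
```

As a toy example, a gene with 200 RM and 100 BY reads in the hybrid and 400 RM and 100 BY reads in the coculture has a log2 cis component of 1 and a log2 trans component of 1; if both effects were significant, the sketch would label it "cis + trans".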


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
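The hypergeometric calculation described above can be illustrated with a short Python sketch. It shows the general technique (an upper-tail hypergeometric probability followed by a Bonferroni correction) and is not FunSpec's own code; the function names and the numbers in the usage note are hypothetical. `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_enrichment_p(population, category_size, list_size, overlap):
    """P(X >= overlap) for a gene list drawn without replacement.

    population: number of annotated genes in the background.
    category_size: number of background genes in the functional category.
    list_size: number of genes in the query list.
    overlap: number of query genes that fall in the category.
    """
    upper = min(category_size, list_size)
    total = comb(population, list_size)
    tail = sum(comb(category_size, k) * comb(population - category_size, list_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

For example, `bonferroni(hypergeom_enrichment_p(6000, 120, 300, 20), 500)` would test whether 20 of 300 listed genes falling in a 120-gene category is more than expected by chance among 6,000 genes, corrected for 500 tested categories. All of these numbers are invented for illustration.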

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules as groups of coexpressed genes under several conditions (Ihmels et al. 2002) were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects present in one of the coexpression modules and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs which either share a common regulator but belong to different modules or which belong to the same module(s) but are not known to be regulated by any identical transcription factor.
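The pairing scheme described above can be sketched in Python. This is an assumed reading of the text: "different modules" is interpreted as having no module in common, the data layout (a dictionary of regulator sets, module sets, and log2 trans values) is hypothetical, and the sketch only enumerates the two kinds of pairs without performing any statistical comparison between them.

```python
from itertools import combinations


def split_gene_pairs(genes):
    """Enumerate gene pairs of the two types compared in the text.

    genes: dict mapping gene name -> (regulators, modules, log2_etrans),
        where regulators and modules are sets (hypothetical layout).
    Returns two lists of (gene1, gene2, log2_etrans1, log2_etrans2):
        pairs sharing a regulator but no module, and
        pairs sharing a module but no known regulator.
    """
    shared_regulator_only, shared_module_only = [], []
    for g1, g2 in combinations(sorted(genes), 2):
        regs1, mods1, trans1 = genes[g1]
        regs2, mods2, trans2 = genes[g2]
        common_regulators = regs1 & regs2
        common_modules = mods1 & mods2
        if common_regulators and not common_modules:
            shared_regulator_only.append((g1, g2, trans1, trans2))
        elif common_modules and not common_regulators:
            shared_module_only.append((g1, g2, trans1, trans2))
    return shared_regulator_only, shared_module_only
```

Pairs that share both a regulator and a module, or neither, are left out of both lists, since the text contrasts only the two discordant cases.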

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as "hybrid"), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as "RM"), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as "BY"). In each of these comparisons, expression levels are considered as "similar" if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1; CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as "conserved." Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

"Additive": The expression level in the hybrid lies in between the levels of the parental strains.

"BY dominant": The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

"RM dominant": The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

"Underdominant": The expression level is significantly lower in the hybrid than in both parent strains.

"Overdominant": The expression level is significantly higher in the hybrid than in both parent strains.
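The category definitions above can be sketched as a small Python decision procedure. This is an illustration, not the authors' R code. It assumes positive, already normalized expression levels and takes the outcome of each exact binomial test as a Boolean input. The "unclassified" outcome (parents differ but the hybrid resembles both) is an assumption of this sketch, because the definitions above do not cover that case explicitly.

```python
def is_different(level_a, level_b, significant, min_fold=1.25):
    """Two levels differ only if the test is significant AND the fold
    change is at least min_fold; otherwise they count as "similar"."""
    fold = max(level_a, level_b) / min(level_a, level_b)
    return significant and fold >= min_fold


def inheritance_mode(hybrid, rm, by, sig_hybrid_rm, sig_hybrid_by, sig_rm_by):
    """Classify one gene from its three expression levels and test results."""
    differs_from_rm = is_different(hybrid, rm, sig_hybrid_rm)
    differs_from_by = is_different(hybrid, by, sig_hybrid_by)
    parents_differ = is_different(rm, by, sig_rm_by)
    if not (differs_from_rm or differs_from_by or parents_differ):
        return "conserved"
    if differs_from_rm and differs_from_by:
        if hybrid > max(rm, by):
            return "overdominant"
        if hybrid < min(rm, by):
            return "underdominant"
        return "additive"          # hybrid lies between the parents
    if differs_from_rm:
        return "BY dominant"       # similar to BY, different from RM
    if differs_from_by:
        return "RM dominant"       # similar to RM, different from BY
    return "unclassified"          # assumption: not defined in the text
```

For instance, a gene with a hybrid level of 150, an RM level of 200, and a BY level of 100, with all three tests significant, would be labeled "additive" by this sketch.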

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio p/(p + d) for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting "high connectivity" (more than four known interaction partners, which is the data set's median) against no known interaction partner; 2) sequence divergence: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower vs. higher than the data set's median of 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than


0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/[p_{trans} + d_{trans}]$) was then compared between categories with strong selective constraint proxies (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
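The idea behind a bootstrap Kolmogorov–Smirnov test can be sketched in Python. The sketch follows a common scheme (resampling with replacement from the pooled sample and counting how often the resampled statistic reaches the observed one). It is not a reimplementation of the R routine used in the paper, and the function names, the default number of resamples, and the seed are assumptions.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Maximum absolute difference between two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for value in xs + ys:
        cdf_x = bisect_right(xs, value) / len(xs)
        cdf_y = bisect_right(ys, value) / len(ys)
        d = max(d, abs(cdf_x - cdf_y))
    return d


def bootstrap_ks_test(x, y, n_boot=1000, seed=0):
    """Return (observed KS statistic, bootstrap P value).

    Under the null hypothesis both samples come from one distribution,
    so pseudo-samples of the original sizes are drawn with replacement
    from the pooled data.
    """
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_boot):
        resample = [rng.choice(pooled) for _ in pooled]
        if ks_statistic(resample[:n_x], resample[n_x:]) >= observed:
            hits += 1
    return observed, hits / n_boot
```

Applied to the p/(p + d) ratios of a strongly constrained and a weakly constrained gene set, a small bootstrap P value would indicate that the two distributions differ. The resampling approach is useful here because the ratios can contain ties, for which the standard KS P value is not exact.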

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet 35(1):57–64.


eQTL study on a genomic scale was conducted in Saccharomyces cerevisiae (Brem et al. 2002), and the method was also successfully applied to investigate the roles of distant-acting variation (Yvert et al. 2003) and of epistatic interactions (Brem and Kruglyak 2005; Brem et al. 2005) in determining different gene expression phenotypes. In contrast, hybrid experiments can be used to distinguish between cis- and trans-regulatory changes contributing to differences in gene expression. An additional advantage of the hybrid approach used in this study is that only the two parental strains and their F1 hybrid have to be assayed. Hybrid experiments require the measurement of mRNA levels in homo- or hemizygous parental strains and of allele-specific expression (ASE) in their cross. As the same set of diffusible elements acts on both parental alleles in the hybrid, trans variation produces no differential effect on the two alleles, so that ASE differences in the F1 hybrid can be interpreted as a direct representation of cis-regulatory variation (Cowles et al. 2002; Wittkopp et al. 2004). Several studies have used this approach to investigate the relative importance of cis and trans variation for the expression response in yeast to different environmental conditions (Tirosh et al. 2009; Li et al. 2012) or in the evolution of specific pathways (Chang et al. 2008). Furthermore, it has been successfully applied to investigate the role of cis and trans effects in nucleosome positioning (Tirosh et al. 2010), in protein expression (Khan et al. 2012), and in the regulation of DNA replication timing in yeast (Muller and Nieduszynski 2012).

Past research has unraveled the genetic causes of gene expression differences between various Saccharomyces strains and species and their relevance for phenotypic evolution. In a recent study, several transcription factors were identified whose expression profiles differ between wine strains, resulting in the production of volatile aroma compounds in a predictable manner (Rossouw et al. 2012). Another study identified a subset of genes for which the expression divergence between two yeast strains correlates well with changes in predicted transcription factor binding sites (TFBSs) (Chen et al. 2010). The Barkai lab compared the expression of genes induced during mating among three different Saccharomyces species and found that divergence in TFBSs of the transcription factor STE12 could explain about half of the expression differences, although they found no general correlation between promoter sequence divergence and gene expression evolution in yeasts and mammals (Tirosh et al. 2008). The same group found a positive correlation between divergence in promoter sequence and the cis component of gene expression differences in a comparison of two Saccharomyces species and their hybrid (Tirosh et al. 2009).

Several properties of the promoter region play an important role in the evolution of gene regulation, especially in trans. Genes with a significant trans effect in a comparison between S. cerevisiae and S. paradoxus lack a pronounced nucleosome-free region, tend to contain a TATA box in their promoter region, and consistently display larger expression differences between different yeast strains or species (Tirosh et al. 2009). In addition, the sensitivity of expression levels to genetic perturbations in mutation accumulation lines is enhanced for TATA box-containing genes (Landry et al. 2007). The relationship between nucleosome occupancy of the promoter region and gene expression evolution is complex. Yeast genes can be roughly classified into two groups: those with a promoter containing a well-defined nucleosome-free region close to the transcription start site, referred to as DPN (depleted proximal-nucleosome) genes, and those with a promoter lacking such a region, referred to as OPN (occupied proximal-nucleosome) genes (Tirosh and Barkai 2008). The latter group exhibits more expression plasticity between different environmental conditions and more cell-to-cell variability (Tirosh and Barkai 2008). Although some studies found that changes in nucleosome occupancy are related to divergence in gene expression (Field et al. 2009; Tsankov et al. 2010), a comparison of S. cerevisiae and its closest relative S. paradoxus found no relationship between divergence of gene expression and divergence of nucleosome positioning (Tirosh et al. 2010). In contrast, a recent experimental evolution study selecting yeast strains for overexpression of a target gene found different evolutionary mechanisms for the two classes: DPN genes have a stronger tendency to be duplicated (which could be considered as a cis-acting mutation) than OPN genes, which have predominantly undergone trans regulatory changes (Rosin et al. 2012).

Several studies have shown a relationship between the mode of inheritance of gene expression levels and the molecular mechanisms of gene regulatory differences. Alleles conferring cis regulatory variation tend to have an additive influence on gene expression level, with the expression level in the hybrid being intermediate between those of the two parents (Lemos et al. 2008; McManus et al. 2010). This is hypothesized to contribute to positive selection on cis-regulatory elements over long evolutionary time (Lemos et al. 2008), because the expression levels of single genes can be “fine-tuned,” so that a gradual adaptation to changing selective pressures can take place. On the other hand, genes with antagonistic cis–trans interactions have been found to be enriched for over- or underdominant inheritance of gene expression level (with the mRNA level in the hybrid being either higher or lower than those in both parents) and could play a role in the development of hybrid incompatibilities (Landry et al. 2005; McManus et al. 2010). It has been proposed that even when the expression level of a gene is under stabilizing selection, its regulatory elements may undergo divergent evolution between species if mutations in cis (trans) are balanced by compensatory mutations in trans (cis) (Landry et al. 2005).

Different inheritance patterns of changes in cis and trans and the potentially pleiotropic nature of trans mutations are likely to result in different evolutionary constraints. Changes in trans regulators can impact the expression of multiple downstream genes and can thus be expected to affect multiple phenotypic traits more often than cis regulatory changes. Indeed, a study of mutation accumulation and natural isolate lines in Caenorhabditis elegans found that most trans-acting mutations that resulted in expression changes of multiple genes were quickly removed by selection in natural populations (Denver et al. 2005). Furthermore, differences in cis regulatory elements appear to play a larger role in expression differences between species than within species (Wittkopp et al. 2008a; Emerson et al. 2010). Additionally, genes which show a significant gene expression difference in trans between two different S. cerevisiae strains also tend to exhibit more gene expression divergence in trans between S. cerevisiae and S. paradoxus, while in cis this trend is weak or absent (Emerson et al. 2010).

These findings could be explained by stronger positive selection on cis divergence and stronger selective constraint on trans-acting factors. As selective constraint is expected to affect essential genes more strongly than nonessential genes, its impact on gene regulatory evolution in cis and trans can be evaluated by comparing the cis and trans components of within-species and between-species gene expression differences for genes of higher and lower importance.

In this study, we investigate the relationship between gene expression inheritance patterns and regulatory differences in cis and trans between two strains of Saccharomyces cerevisiae, RM11-1 (RM) and BY4741 (BY). Our results indicate that genes with antagonistic cis–trans interactions are more likely to show an under- or overdominant inheritance pattern in our within-species hybrids, whereas essential genes are less likely to exhibit an underdominant inheritance pattern.

In addition, we integrate the data from an interspecies comparison (Tirosh et al. 2009) with our data and evaluate the role of selective constraint on changes in cis and trans factors. We show that trans regulatory mutations indeed tend to be under stronger selective constraint than cis regulatory mutations and that this observation may explain the relative contributions of cis and trans changes to intra- and interspecific gene expression differences.

Results

Transcriptome Sequencing and Expression Level Estimation

We selected 4,442 genes for our study (Methods; Emerson et al. 2010) and estimated their expression levels using Illumina paired-end (PE) sequencing with a read length of 151 base pairs (bp). In the hybrid sample, 4,558,258 reads were mapped specifically to the BY genome and 4,564,464 reads to the RM genome. In the coculture sample, 3,745,275 reads were mapped as BY specific and 3,776,660 reads as RM specific. This new data set enabled us to analyze ASE differences with greater power than the expression data obtained in a previous study (Emerson et al. 2010). In this study, 4,237 out of the 4,442 genes under study have more than 10 sequence reads for both alleles in both experiments (coculture and hybrid) and were used for further analyses. Among the 4,237 genes under study, 2,268 genes (53.5%) show a significant expression polymorphism in coculture, and 1,207 (28.5%) show a significant ASE difference in the hybrid (binomial exact test, false discovery rate [FDR] < 5%; see Materials and Methods). These counts are roughly twice the corresponding numbers in Emerson et al. (2010), in which the numbers of genes with significant ASE differences are 1,294 (35.1%) and 488 (13.2%) out of the 3,685 genes under study for coculture and hybrid, respectively, using the same criteria for statistical significance. Thus, the new data set allows us to do more rigorous statistical analyses.
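The significance test named above (a binomial exact test on allele-specific read counts, followed by FDR control) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the null proportion of 0.5 assumes no mapping bias between alleles, and the Benjamini–Hochberg step-up procedure is assumed as the FDR method because it is cited in the reference list; the authors' exact procedure is described in their Materials and Methods.

```python
import math

def binomial_exact_two_sided(k, n, p=0.5):
    """Two-sided exact binomial P value for k successes out of n trials.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Log-space arithmetic avoids underflow for large read counts.
    """
    def log_pmf(i):
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * math.log(p) + (n - i) * math.log(1 - p))
    probs = [math.exp(log_pmf(i)) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in probs if q <= cutoff))

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical (RM reads, BY reads) per gene in the hybrid sample.
read_counts = [(30, 10), (52, 48), (15, 45), (100, 101)]
raw_p = [binomial_exact_two_sided(rm, rm + by) for rm, by in read_counts]
q_values = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in q_values]
```

A gene would then be called significantly allele-biased when its adjusted value falls below the chosen FDR threshold (5% in the text).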

Classifying Gene Expression Differences in Terms of Cis and Trans Effects

As discussed earlier, a cis-regulatory factor influences the expression level of only the allele on the same chromosome, whereas a trans-regulatory factor can affect the expression of both alleles in a cell. Therefore, it is possible to estimate the relative contributions of cis- and trans-regulatory changes to differences in gene expression between the RM and the BY strain by comparing the ASE in the hybrid to expression differences between the two parental strains (Wittkopp et al. 2004). We assume that there is no allele-specific preferential binding of the maternal or paternal transcription factor (Takahasi et al. 2011) and that the expression of an allele is independent of the other, that is, there is no transvection. Therefore, the expression differences between the two parental alleles in the hybrid are interpreted as a direct representation of cis-regulatory differences (Cowles et al. 2002) (see examples in supplementary figs. S1 and S2, Supplementary Material online), because in the same cell, the trans-regulatory milieu is identical for the two alleles. The expression difference between the two parental strains in coculture is thus interpreted as a combination of cis and trans effects (see examples in supplementary figs. S1 and S2, Supplementary Material online). In agreement with previous studies (Wittkopp et al. 2008b; Emerson et al. 2010), we found that trans effects dominate in our within-species comparison: 1,577 (37.3%) of the 4,237 genes under study show a significant trans effect, whereas only 1,267 (30%) show a significant cis effect (significance was determined using the likelihood ratio test, FDR < 5%; see Materials and Methods). The median absolute trans effect (0.301) is significantly higher than the median cis component (0.166) (Wilcoxon rank sum test, P value < 2.2 × 10^-16).

Although genes with significant expression differences in coculture or in the hybrid showed a higher single nucleotide polymorphism (SNP) density than nondifferentially expressed genes (Wilcoxon rank sum test, P value < 2.2 × 10^-16), we found no difference between cis or trans regulatory changes related to gene SNP density (supplementary table S1, Supplementary Material online). Differentially expressed genes also showed a significantly higher sequence divergence in the promoter region (defined as 500 bp upstream of the transcription start site) (supplementary table S2, Supplementary Material online). This trend is significant for both the cis and the trans effect but stronger for the cis component of expression differences, as can be expected (Wilcoxon rank sum test, cis: P value = 7.79 × 10^-12; trans: P value = 1.497 × 10^-5). Similarly, genes whose promoter region contains a TATA box (Basehoar et al. 2004) are more likely to be differentially expressed than those without a TATA box (supplementary table S3, Supplementary Material online). Genes without a well-defined nucleosome-free region close to the transcription start site (i.e., OPN genes) are more likely to be differentially expressed than those with such a region (i.e., DPN genes) (the gene sets were defined in Tirosh et al. 2008). This relationship is observed in trans and in cis (supplementary table S4, Supplementary Material online) but is stronger for the trans effect (Fisher’s exact test, trans: P value = 3.993 × 10^-5; cis: P value = 0.0027).

We tested whether genes with significant and consistent cis or trans effects (in the old and the new data set) were enriched in specific biological processes or cellular components in the Gene Ontology (GO) annotation, using the FunSpec analysis tool (Robinson et al. 2002). Genes with a significant trans component were enriched in mitochondrial electron transport (GO term “mitochondrial electron transport, ubiquinol to cytochrome c” [GO identifier 0006122], P value 1.09 × 10^-7, and overlapping terms) and in the biosynthesis of ergosterol (GO term “ergosterol biosynthetic process” [GO identifier 0006696], P value 1.8 × 10^-7, and overlapping terms), a major component of the fungal cell membrane. Genes with a significant cis effect were enriched in oxidation/reduction among biological processes (GO term “oxidation–reduction process,” GO identifier 0055114, P value 9.45 × 10^-7, and overlapping terms) and in the cell wall among cellular components (GO term “cell wall,” GO identifier 0005618, P value 2.19 × 10^-9, and related terms). This is consistent with the previous finding of an enrichment for cell wall related genes among those with local regulatory differences between the RM and BY strains (Chen et al. 2010).

To estimate the importance of transcription factors in trans regulatory evolution versus changes in sensory and signaling molecules or chromatin modifiers, we compared gene pairs which either share a common regulator (Teixeira et al. 2006) but belong to different expression modules (Ihmels et al. 2002) or which belong to the same module(s) but are not known to be regulated by an identical transcription factor. We did not find any significant difference between these two sets of genes regarding the probability of both genes in a pair having expression differences in trans in the same direction, that is, favoring the allele from the same strain (either BY or RM) for both genes (supplementary table S5, Supplementary Material online).
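The decomposition used throughout this section (hybrid allelic ratio as the cis component, parental ratio minus hybrid ratio as the trans component; see the caption of figure 1) can be written as a short function. This is a minimal sketch: the function name and the example counts are hypothetical, and it assumes the four expression values are already normalized and strictly positive.

```python
import math

def cis_trans_components(rm_hy, by_hy, rm_co, by_co):
    """Return (cis, trans, parental) on the log2 scale.

    cis      = log2(RM_Hy / BY_Hy), the allelic ratio within the hybrid
    parental = log2(RM_Co / BY_Co), the ratio between the parental strains
               measured in coculture
    trans    = parental - cis
    """
    cis = math.log2(rm_hy / by_hy)
    parental = math.log2(rm_co / by_co)
    trans = parental - cis
    return cis, trans, parental

# Hypothetical gene: 2-fold RM bias in the hybrid, 4-fold between the parents.
cis, trans, parental = cis_trans_components(rm_hy=200, by_hy=100, rm_co=400, by_co=100)
```

For this invented gene, half of the parental log2 difference is attributed to cis and half to trans; assessing whether each component is significant requires the statistical tests described in the text, which this sketch does not perform.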

The genes under study were classified into five categories as in McManus et al. (2010), but with some different category names, as follows:

1) Nondifferential: No significant expression difference between the RM and the BY allele in coculture or hybrid. It is the same as the “conserved” category in McManus et al. (2010).

2) Cis only: A significant cis component but no significant trans difference.

3) Trans only: A significant trans component but no significant cis difference.

4) Cis + Trans: The cis and trans components are both significant and work in the same direction (supplementary fig. S1, Supplementary Material online).

5) Cis – Trans: The cis and trans components are both significant but have opposite effects. It can be divided into three subcategories according to the relative magnitudes of the cis and trans components (supplementary fig. S2, Supplementary Material online):

a) “Cis – Trans (t > c)” (i.e., “cis – trans” with a greater absolute trans effect): The log2 expression ratios in coculture and in the hybrid have different signs (the allele which is more highly expressed in the hybrid has lower expression levels in the parental comparison); it is equivalent to the “cis × trans” category in McManus et al. (2010).

b) “Cis – Trans (c = t)”: The cis and the trans component work in opposite directions and have approximately the same absolute value; no significant expression difference between the two alleles in the parental strains; it is equivalent to the “compensatory” category in McManus et al. (2010).

c) “Cis – Trans (c > t)” (i.e., “cis – trans” with a greater absolute cis effect): The cis and trans components have opposite signs, but the log2 expression ratios in hybrid and in coculture have the same sign (i.e., the same allele is favored in coculture and hybrid, but the absolute expression ratio in hybrid is greater than that in coculture); it was assigned to the “cis + trans” category by McManus et al. (2010).
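The category rules listed above translate naturally into a decision function. The sketch below is an illustration under stated assumptions, not the authors' code: the function name and string labels are hypothetical, the three significance flags are taken as given (they would come from the tests described in the text), and a gene with neither a significant cis nor a significant trans component is treated as nondifferential, which simplifies the definition given in category 1.

```python
def classify_regulation(cis, trans, cis_sig, trans_sig, parental_sig):
    """Assign a gene to a cis/trans category.

    cis, trans    log2 components (parental log2 ratio = cis + trans)
    cis_sig       True if the hybrid allelic difference is significant
    trans_sig     True if the trans component is significant
    parental_sig  True if the coculture (parental) difference is significant
    """
    if cis_sig and trans_sig:
        if cis * trans > 0:
            return "cis + trans"            # same direction
        if not parental_sig:
            return "cis - trans (c = t)"    # effects cancel in the parents
        parental = cis + trans
        if parental * cis < 0:
            return "cis - trans (t > c)"    # sign flips between hybrid and parents
        return "cis - trans (c > t)"        # same sign, cis effect larger
    if cis_sig:
        return "cis only"
    if trans_sig:
        return "trans only"
    return "nondifferential"

# Hypothetical genes, given as (cis, trans, cis_sig, trans_sig, parental_sig).
examples = [
    (1.0, 0.5, True, True, True),
    (1.0, -2.0, True, True, True),
    (1.0, -1.0, True, True, False),
    (2.0, -0.5, True, True, True),
    (1.0, 0.0, True, False, True),
]
labels = [classify_regulation(*gene) for gene in examples]
```

Keeping the significance calls outside the function separates the statistical step from the bookkeeping step, so the same rules can be reused with a different test or threshold.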

Among the 4,237 genes under study, 2,077 (49%) showed no significant expression difference between the RM and BY alleles in hybrid or in coculture and were classified as nondifferential. Among the 2,160 (51%) “differentially expressed” genes, 583 genes (13.8%) were classified as “cis only,” whereas 893 (21.1%) were classified as “trans only” (fig. 1 and table 1). The group “cis + trans” comprises only 172 genes (4.1%). The total number of genes falling into the “cis – trans” category is 512 (12.1%). Among these, 234 genes (5.5%) have cis and trans effects of approximately equal magnitude and were classified as “cis – trans (c = t).” Only 71 genes (1.7%) in the “cis – trans” category have a larger cis component and were classified as “cis – trans (c > t).” In contrast, 207 genes (4.9%) fall into the “cis – trans (t > c)” category, having a stronger trans effect than a cis effect. These observations show the overall prevalence of trans regulatory changes in our within-species comparison.

Inheritance Mode of Gene Expression Level Versus ASE Differences in Cis and Trans

To study the mode of inheritance, the expression levels of the hybrid and the parental strains were compared for each gene in three comparisons: 1) the expression of the gene in the parental BY strain (“BY”) versus in the parental RM strain (“RM”) in coculture, 2) the expression of the gene in BY versus the total expression level in the hybrid, and 3) the expression of the gene in RM versus the total expression level in the hybrid. A gene was classified as conserved if the expression difference in each of the three comparisons was not statistically significant or was less than 25%. This category comprised 43.7% of the genes (1,852/4,237). The other 2,385 genes (56.3%) were nonconserved and assigned to one of the categories “additive,” “BY dominant,” “RM dominant,” “overdominant,” and “underdominant” (fig. 2). The “additive” category comprised 434 genes (10.2%). Interestingly, 1,048 genes (24.7%) were classified as “RM dominant,” but only approximately half as many (445 genes, 10.5%) were classified as “BY dominant.” In total, 458 genes (10.8%) were misexpressed (overdominant or underdominant) in the hybrid: the underdominant expression pattern was found in 294 genes (6.9%) and the overdominant pattern in 164 genes (3.9%).
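The three pairwise comparisons above can be turned into an inheritance-mode classifier. The sketch below is a simplified illustration and rests on several assumptions that are not in the text: it replaces the significance tests with a plain fold-change cutoff (1.25-fold, an assumed reading of the threshold given above), it assumes positive expression values, and it returns an extra "ambiguous" label for the case where the parents differ but the hybrid differs from neither, a case the source does not list as a category. The function names are hypothetical.

```python
def differs(a, b, fold=1.25):
    """True if two positive expression levels differ by at least `fold`."""
    return max(a, b) / min(a, b) >= fold

def inheritance_mode(by, rm, hybrid, fold=1.25):
    """Classify the inheritance mode of total expression in the hybrid."""
    h_vs_by = differs(hybrid, by, fold)
    h_vs_rm = differs(hybrid, rm, fold)
    by_vs_rm = differs(by, rm, fold)
    if not (h_vs_by or h_vs_rm or by_vs_rm):
        return "conserved"
    if h_vs_by and h_vs_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"          # hybrid lies between the two parents
    if h_vs_by:
        return "RM dominant"       # hybrid resembles RM, differs from BY
    if h_vs_rm:
        return "BY dominant"       # hybrid resembles BY, differs from RM
    return "ambiguous"             # not a category in the source text

# Hypothetical (BY, RM, hybrid) expression levels.
modes = [inheritance_mode(*levels) for levels in [
    (100, 100, 105),
    (100, 200, 150),
    (100, 200, 195),
    (100, 200, 105),
    (100, 200, 300),
    (100, 200, 50),
]]
```

A faithful implementation would combine this magnitude cutoff with per-comparison significance calls, as the text specifies.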

To investigate how the molecular mechanism of gene expression differences influences the inheritance mode of the expression level, we examined whether an inheritance mode is enriched for genes belonging to a specific expression divergence pattern (table 1). Consistent with previous studies (Lemos et al. 2008; McManus et al. 2010), we found a weak but significant relationship between cis regulation and additive inheritance. The median percent cis for genes with additive inheritance (39.84%) was significantly higher than for those with the other inheritance modes (37.46%) (Wilcoxon rank sum test, P value = 0.0014).

Additionally, in agreement with previous findings (Lemos et al. 2008), genes with dominant inheritance (either RM or BY dominant) showed a strong enrichment for trans regulatory variation. The median percent trans was significantly higher for genes with dominant inheritance (68.83%) than for the other genes (59.34%) (Wilcoxon rank sum test, P value < 1.4 × 10^-14).

Furthermore, we investigated whether genes in the “cis – trans” category disproportionately contributed to misexpression in our within-species cross, as previously described for between-species hybrids (Landry et al. 2005; McManus et al. 2010). Indeed, we found an enrichment for misexpressed genes in the “cis – trans” category (Fisher’s exact test, P value < 5 × 10^-9; table 2). This relationship remains significant even when both “conserved” and “nondifferential” genes are removed from the analysis (table 2).
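An enrichment test of this kind can be sketched with a one-sided Fisher's exact test built on the hypergeometric distribution. The 2 × 2 counts used below are derived here from table 1 (60 overdominant plus 37 underdominant genes among the 512 genes with opposing cis and trans effects, against 458 misexpressed genes among all 4,237). The function name is hypothetical, and because the authors' exact contingency table and sidedness are given in their table 2 rather than here, the resulting value need not match the reported P value.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns P(X >= a), where X is hypergeometric: the number of "hits" in
    row 1 given fixed row and column totals.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, k_max + 1))
    return numerator / comb(n, col1)  # exact integers, divided once at the end

# Counts derived from table 1.
misexpressed_opposing = 60 + 37               # over- plus underdominant
other_opposing = 512 - misexpressed_opposing
misexpressed_rest = 458 - misexpressed_opposing
other_rest = (4237 - 512) - misexpressed_rest
p_enrichment = fisher_exact_greater(misexpressed_opposing, other_opposing,
                                    misexpressed_rest, other_rest)
```

Summing exact binomial coefficients as Python integers and dividing once avoids the floating-point underflow that a term-by-term probability sum would hit at these table sizes.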

Different Constraints on Cis and Trans RegulatoryComponents

We divided genes into different classes expected to be underrelatively weak or strong selective constraint using three cri-teria 1) the ratio of the rate of nonsynonymous substitutionto the rate of synonymous substitution () genes with an higher than the median value (~009) were classified as lessconserved and those with a lower as more conserved2) connectivity in proteinndashprotein interaction (PPI) networks(Stark et al 2006 Collins et al 2007) genes with more than

minus4 minus2 0 2

minus6

minus4

minus2

02

4

trans

cis

(a)

NonminusdifferentialCis OnlyTrans OnlyCis + TransCisminusTrans

Nonminusdifferential Cis Only Trans Only Cis + Trans Cis minusTrans

(b)

050

010

0015

0020

00

2077 (4902)

583 (1376)

893 (2108)

172 (406)

512 (1208)

FIG 1 Classification of genes according to cis or trans effects (a) Scatterplot Y axis the cis component [the log2-ratio of reads in the hybrid samplemapped to the RM and BY genomes log2(ecis) = log2(eHy) = log2(RMHyBYHy)] X axis the trans component [difference between parental and hybridlog2-transformed ASE ratios log2(etrans) = log2(eCoecis) = log2(RMCoBYCo) log2(RMHyBYHy)] Notations RMHy expression level of the RM allele inthe hybrid BYHy expression level of the BY allele in the hybrid eHy ASE ratio in the hybrid RMCo expression level of the RM allele in the cocultureBYCo expression level of the BY allele in the coculture and eCo ASE ratio in the coculture (b) The bar graph shows the number of genes in eachcistrans category

Table 1. Number of Genes Falling into Different Combinations of Inheritance and cis/trans Categories.

Inheritance Mode   Nondifferential   Trans Only   Cis Only   Cis + Trans   Cis × Trans   Sum
Conserved          1,265             199          165        6             217           1,852
RM dominant        371               376          149        66            86            1,048
BY dominant        147               116          85         24            73            445
Additive           53                134          141        67            39            434
Overdominant       69                11           23         1             60            164
Underdominant      172               57           20         8             37            294
Sum                2,077             893          583        172           512           4,237


the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners were classified as less constrained; and 3) essentiality: essential genes versus genes with a fitness > 0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio p/(p + d) [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4; the divergence data were obtained from Tirosh et al. 2009). The ratios p_cis/(p_cis + d_cis) and p_trans/(p_trans + d_trans) were then each compared between categories with expected strong selective constraints and categories with expected weak selective constraints for each of the three criteria (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of p/(p + d) in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.37).

Misexpressed inheritance modes are slightly underrepresented among essential genes in our within-species data, but this difference is not statistically significant in comparison with all other inheritance categories (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.35). However, misexpressed genes are significantly less likely to be essential if the comparison is restricted to genes with a conserved total expression level (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.022). This tendency for misexpressed genes to be less essential appears to be largely due to an enrichment of nonessential genes among those with underdominant inheritance. Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (table 3; Fisher's exact test, P value < 9.763 × 10^-9) and also in comparison with overdominant genes only (table 3; Fisher's exact test, P value = 1.14 × 10^-13).

As genes in the “cis × trans” category are more likely to be misexpressed (table 2), we tested whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the “cis × trans” category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher's exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (table 4; P value = 0.0003). In contrast, compared with “cis + trans” genes only, this enrichment for nonessential genes is not

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the X axis, log2(Hybrid) − log2(BY); RM on the Y axis, log2(Hybrid) − log2(RM)). (b) The bar graph shows the number of genes in each inheritance category: conserved, 1,852 (43.71%); RM dominant, 1,048 (24.73%); BY dominant, 445 (10.5%); additive, 434 (10.24%); overdominant, 164 (3.87%); underdominant, 294 (6.94%).

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the “Cis × Trans” Category.

Regulatory Effect       Misexpressed   Other
Cis × trans (a)         97             415
Other categories (a)    361            3,364
Cis × trans (b)         97             198
Other categories (b)    120            1,158

(a) Misexpressed genes are enriched for genes in the “cis × trans” category (Fisher's exact test, P value = 4.99 × 10^-9).
(b) Both the “conserved” and “nondifferential” genes are removed from the analysis (Fisher's exact test, P value < 2.2 × 10^-16).
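The enrichment tests reported for table 2 are Fisher's exact tests on 2 × 2 tables. The following is a minimal sketch of such a test in Python, written for illustration only: the function name `fisher_exact_two_sided` is hypothetical, the two-sided convention used (summing all tables no more probable than the observed one) is an assumption, and the original analysis was carried out in R, so the value obtained here need not match the reported one digit for digit.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)

    def pmf(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(total - row1, col1 - k) / denom

    k_min = max(0, col1 - (total - row1))
    k_max = min(row1, col1)
    probs = [pmf(k) for k in range(k_min, k_max + 1)]
    p_obs = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

# Counts from table 2, first comparison:
# cis x trans: 97 misexpressed, 415 other; other categories: 361 and 3,364.
p_value = fisher_exact_two_sided(97, 415, 361, 3364)
print(p_value)
```

Exact integer binomial coefficients are used so that the very large intermediate values do not overflow; only the final ratio is converted to floating point.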


significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1,011/2,078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1,011/1,293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1,106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes that were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f, essentiality; g–i, connectivity in PPI networks). Each panel is a density plot (X axis: trans p/(p + d), from 0 to 1; Y axis: density). Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined. Notations: p, trans polymorphism; d, trans divergence.


FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f, essentiality; g–i, connectivity in PPI networks). Each panel is a density plot (X axis: cis p/(p + d), from 0 to 1; Y axis: density). Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined. Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

                                                  All Other Inheritance Categories
Essentiality                    Underdominant     Total        Overdominant   Nonmisexpressed
Essential                       24 (a, b)         831 (a)      59 (b)         772
Nonessential (fitness > 0.85)   243 (a, b)        2,783 (a)    86 (b)         2,697

(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (Fisher's exact test, P value < 9.763 × 10^-9).
(b) Essential genes are significantly less likely to be underdominant than overdominant (Fisher's exact test, P value = 1.14 × 10^-13).

Table 4. Enrichment of the “Cis × Trans” Category in Nonessential Genes.

                                Within Species (RM vs. BY)                       Between Species (S. cerevisiae vs. S. paradoxus)
Essentiality                    Cis × Trans   Other Categories (Cis + Trans)     Cis × Trans   Other Categories (Cis + Trans)
Essential                       91            764 (34)                           166           689 (182)
Nonessential (fitness > 0.85)   392           2,634 (123)                        766           2,260 (776)

NOTE. Cis × trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher's exact test, P value = 0.078; between species: Fisher's exact test, P value = 0.0003). However, there is no significant difference between the “cis × trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher's exact test, P value = 0.49; between species: Fisher's exact test, P value = 0.51).


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, whereas those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.'s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis × trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis × trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis × trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis × trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used. One, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998). The other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid experiments) was run in one lane of paired-end (PE) sequencing on an Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on an agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains (309 from the BY strain and 584 from the RM strain) detected previously (Emerson et al. 2010) were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \, e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtained the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes were sorted into five categories using R (v. 2.14.1; CRAN) with the methodology described by McManus et al. (2010).
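The decomposition above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (which followed Emerson et al. 2010 and was run in R): the function name `cis_trans_components`, the `gdna_ratio` argument, and the way the gDNA-based normalization is applied (dividing the coculture ratio by the RM/BY cell-number ratio) are assumptions made for illustration, and the read counts in the example are invented. The sketch also omits the likelihood ratio test.

```python
import math

def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_ratio=1.0):
    """Return (log2 e_cis, log2 e_trans) for one gene.

    rm_hy, by_hy: allele-specific read counts in the hybrid.
    rm_co, by_co: strain-specific read counts in the coculture.
    gdna_ratio:   assumed RM/BY cell-number ratio in the coculture
                  (hypothetical normalization; 1.0 means equal numbers).
    All counts must be positive; zero counts are not handled here.
    """
    log2_e_cis = math.log2(rm_hy / by_hy)                  # e_cis = e_Hy
    log2_e_co = math.log2((rm_co / by_co) / gdna_ratio)    # normalized e_Co
    log2_e_trans = log2_e_co - log2_e_cis                  # e_trans = e_Co / e_cis
    return log2_e_cis, log2_e_trans

# Invented counts: RM allele 2-fold higher in the hybrid, 8-fold in coculture.
cis, trans = cis_trans_components(rm_hy=400, by_hy=200, rm_co=800, by_co=100)
print(cis, trans)  # 1.0 2.0
```

In this invented example the total 8-fold parental difference (log2 = 3) splits into a cis component of 1 and a trans component of 2 on the log2 scale, which is the additive form of the multiplicative model.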


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, defined as groups of genes coexpressed under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects that are present in one of the coexpression modules and have at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values of log2(e_trans) for all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculated the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene was normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM were divided by the RNA ratio RM/Hy. The parental and hybrid data sets were analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold were considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determined the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels were considered “similar” if their difference was not statistically significant or was less than 1.25-fold. Genes were categorized into six different inheritance modes using R (v. 2.14.1; CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in the hybrid and the parental strains were classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) were classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
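The category definitions above can be written down as a small decision procedure. The following Python sketch is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the significance flags are assumed to come from the exact binomial tests (which are not reimplemented here), expression levels are assumed to be positive and already normalized, and the “unclassified” fallback for genes whose parents differ while the hybrid is similar to both is our own placeholder, as the text does not describe that case.

```python
def differs(x, y, significant, min_fold=1.25):
    """Two levels differ only if the test is significant and the
    fold change is at least min_fold; otherwise they are 'similar'."""
    fold = max(x, y) / min(x, y)
    return significant and fold >= min_fold

def inheritance_mode(hybrid, rm, by, sig_hy_rm, sig_hy_by, sig_rm_by):
    """Classify one gene from its three normalized expression levels
    and the significance flags of the three pairwise comparisons."""
    hy_rm = differs(hybrid, rm, sig_hy_rm)
    hy_by = differs(hybrid, by, sig_hy_by)
    rm_by = differs(rm, by, sig_rm_by)
    if not (hy_rm or hy_by or rm_by):
        return "conserved"
    if hy_rm and hy_by:
        if hybrid > max(rm, by):
            return "overdominant"
        if hybrid < min(rm, by):
            return "underdominant"
        return "additive"  # hybrid lies between the two parents
    if hy_rm:
        return "BY dominant"  # similar to BY, different from RM
    if hy_by:
        return "RM dominant"  # similar to RM, different from BY
    return "unclassified"     # placeholder: case not covered by the text

print(inheritance_mode(150, 200, 100, True, True, True))   # additive
print(inheritance_mode(300, 200, 100, True, True, True))   # overdominant
print(inheritance_mode(100, 200, 100, True, False, True))  # BY dominant
```

Passing the significance flags in as arguments keeps the classification logic separate from the choice of statistical test and FDR cutoff.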

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as p_cis and p_trans to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as d_cis and d_trans to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints on trans and cis changes, we compared the ratio p/(p + d) for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were carried out, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set's median) against no known interaction partner; 2) sequence divergence: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower than the data set's median of 0.0925 vs. ≥ 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio p_cis/(p_cis + d_cis) (or p_trans/[p_trans + d_trans]) was then compared between categories with strong expected selective constraint (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
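As a minimal sketch of the comparison described above, the following Python code computes p/(p + d) per gene and a rank-sum (Mann–Whitney U) statistic between two gene groups. It is illustrative only: the function names are hypothetical, the P value uses a plain normal approximation without tie or continuity correction (the study used the Wilcoxon rank-sum test and a bootstrapped KS test in R), and the component values in the example are invented.

```python
import math

def polymorphism_ratio(p, d):
    """p/(p + d), using the absolute values of a cis or trans component."""
    p, d = abs(p), abs(d)
    if p + d == 0:
        return None  # ratio is undefined when both components are zero
    return p / (p + d)

def rank_sum_test(xs, ys):
    """Mann-Whitney U of xs versus ys and a two-sided P value from the
    normal approximation (no tie or continuity correction)."""
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    n1, n2 = len(xs), len(ys)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))

# Invented (p, d) pairs for a weakly and a strongly constrained group.
weak = [polymorphism_ratio(p, d) for p, d in [(1.6, 0.4), (1.4, 0.6), (0.9, 0.1)]]
strong = [polymorphism_ratio(p, d) for p, d in [(0.2, 0.8), (0.4, 0.6), (0.3, 0.7)]]
u_stat, p_value = rank_sum_test(weak, strong)
print(u_stat, p_value)
```

A ratio above 0.5 indicates that the within-species component is larger than the between-species component for that gene; with real data, a library routine with tie handling would be preferable to this hand-written approximation.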

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.

Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.

natural populations (Denver et al. 2005). Furthermore, differences in cis regulatory elements appear to play a larger role in expression differences between species than within species (Wittkopp et al. 2008a; Emerson et al. 2010). Additionally, genes which show a significant gene expression difference in trans between two different S. cerevisiae strains also tend to exhibit more gene expression divergence in trans between S. cerevisiae and S. paradoxus, while in cis this trend is weak or absent (Emerson et al. 2010).

These findings could be explained by stronger positive selection on cis divergence and stronger selective constraint on trans-acting factors. As selective constraint is expected to affect essential genes more strongly than nonessential genes, its impact on gene regulatory evolution in cis and trans can be evaluated by comparing the cis and trans components of within-species and between-species gene expression differences for genes of higher and lower importance.

In this study, we investigate the relationship between gene expression inheritance patterns and regulatory differences in cis and trans between two strains of Saccharomyces cerevisiae, RM11-1 (RM) and BY4741 (BY). Our results indicate that genes with antagonistic cis–trans interactions are more likely to show an under- or overdominant inheritance pattern in our within-species hybrids, whereas essential genes are less likely to exhibit an underdominant inheritance pattern.

In addition, we integrate the data from an interspecies comparison (Tirosh et al. 2009) with our data and evaluate the role of selective constraint on changes in cis and trans factors. We show that trans regulatory mutations indeed tend to be under stronger selective constraint than cis regulatory mutations and that this observation may explain the relative contributions of cis and trans changes to intra- and interspecific gene expression differences.

Results

Transcriptome Sequencing and Expression Level Estimation

We selected 4,442 genes for our study (Methods; Emerson et al. 2010) and estimated their expression levels using Illumina paired-end (PE) sequencing with a read length of 151 base pairs (bp). In the hybrid sample, 4,558,258 reads were mapped specifically to the BY genome and 4,564,464 reads to the RM genome. In the coculture sample, 3,745,275 reads were mapped as BY specific and 3,776,660 reads as RM specific. This new data set enabled us to analyze ASE differences with greater power than the expression data obtained in a previous study (Emerson et al. 2010). In this study, 4,237 out of the 4,442 genes under study have more than 10 sequence reads for both alleles in both experiments (coculture and hybrid) and were used for further analyses. Among the 4,237 genes under study, 2,268 genes (53.5%) show a significant expression polymorphism in coculture and 1,207 (28.5%) show a significant ASE difference in the hybrid (binomial exact test, false discovery rate [FDR] < 5%; see Materials and Methods). These numbers are roughly two times higher than the corresponding numbers in Emerson et al. (2010), in which 1,294 (35.1%) and 488 (13.2%) out of the 3,685 genes under study showed significant differences in coculture and hybrid, respectively, using the same criteria for statistical significance. Thus, the new data set allows us to do more rigorous statistical analyses.
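The per-gene significance test named above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: it assumes a null hypothesis in which each read is equally likely to come from either allele, and it assumes Benjamini–Hochberg adjustment (Benjamini and Hochberg 1995) for FDR control. The function names and the example read counts are hypothetical.

```python
from math import comb


def binomial_exact_p(rm_reads: int, by_reads: int) -> float:
    """Two-sided exact binomial P value under the assumed null that a read
    is equally likely to come from the RM or the BY allele."""
    n = rm_reads + by_reads
    observed = comb(n, rm_reads)
    # Sum every outcome that is no more probable than the observed one.
    # Integer arithmetic keeps the sum exact even for thousands of reads.
    tail = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return tail / 2**n


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (RM, BY) allele-specific read counts for three genes.
counts = [(60, 40), (150, 50), (52, 48)]
pvals = [binomial_exact_p(rm, by) for rm, by in counts]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]  # FDR < 5%
```

Exact integer arithmetic is used because floating-point binomial terms underflow at the read depths typical of sequencing data.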

Classifying Gene Expression Differences in Terms of Cis and Trans Effects

As discussed earlier, a cis-regulatory factor influences the expression level of only the allele on the same chromosome, whereas a trans-regulatory factor can affect the expression of both alleles in a cell. Therefore, it is possible to estimate the relative contributions of cis- and trans-regulatory changes to differences in gene expression between the RM and the BY strain by comparing the ASE in the hybrid to expression differences between the two parental strains (Wittkopp et al. 2004). We assume that there is no allele-specific preferential binding of the maternal or paternal transcription factor (Takahasi et al. 2011) and that the expression of an allele is independent of the other, that is, there is no transvection. Therefore, the expression differences between the two parental alleles in the hybrid are interpreted as a direct representation of cis-regulatory differences (Cowles et al. 2002) (see examples in supplementary figs. S1 and S2, Supplementary Material online), because in the same cell the trans-regulatory milieu is identical for the two alleles. The expression difference between the two parental strains in coculture is thus interpreted as a combination of cis and trans effects (see examples in supplementary figs. S1 and S2, Supplementary Material online). In agreement with previous studies (Wittkopp et al. 2008b; Emerson et al. 2010), we found that trans effects dominate in our within-species comparison: 1,577 (37.3%) of the 4,237 genes under study show a significant trans effect, whereas only 1,267 (30%) show a significant cis effect (significance was determined using the likelihood ratio test, FDR < 5%; see Materials and Methods). The median absolute trans effect (0.301) is significantly higher than the median cis component (0.166) (Wilcoxon rank sum test, P value < 2.2 × 10^-16).

Although genes with significant expression differences in coculture or in the hybrid showed a higher single nucleotide polymorphism (SNP) density than nondifferentially expressed genes (Wilcoxon rank sum test, P value < 2.2 × 10^-16), we found no difference between cis or trans regulatory changes related to gene SNP density (supplementary table S1, Supplementary Material online). Differentially expressed genes also showed a significantly higher sequence divergence in the promoter region (defined as 500 bp upstream of the transcription start site) (supplementary table S2, Supplementary Material online). This trend is significant for both the cis and the trans effect but stronger for the cis component of expression differences, as can be expected (Wilcoxon rank sum test, cis P value = 7.79 × 10^-12; trans P value = 1.497 × 10^-5). Similarly, genes whose promoter region contains a TATA box (Basehoar et al. 2004) are more likely to be differentially expressed than those without a TATA box (supplementary table S3, Supplementary Material online). Genes without a well-defined nucleosome-free region close to the transcription start site (i.e., OPN genes) are more likely to be differentially expressed than those with such a region (i.e., DPN genes) (the gene sets were defined in Tirosh et al. 2008). This relationship is observed in trans and in cis (supplementary table S4, Supplementary Material online) but is stronger for the trans effect (Fisher's exact test, trans P value = 3.993 × 10^-5; cis P value = 0.0027).

We tested whether genes with significant and consistent cis or trans effects (in the old and the new data set) were enriched in specific biological processes or cellular components in the Gene Ontology (GO) annotation using the FunSpec analysis tool (Robinson et al. 2002). Genes with a significant trans component were enriched in mitochondrial electron transport (GO term “mitochondrial electron transport, ubiquinol to cytochrome c” [GO identifier: 0006122], P value: 1.09 × 10^-7, and overlapping terms) and in the biosynthesis of ergosterol (GO term “ergosterol biosynthetic process” [GO identifier: 0006696], P value: 1.8 × 10^-7, and overlapping terms), a major component of the fungal cell membrane. Genes with a significant cis effect were enriched in oxidation/reduction among biological processes (GO term “oxidation–reduction process,” GO identifier: 0055114, P value: 9.45 × 10^-7, and overlapping terms) and in the cell wall among cellular components (GO term “cell wall,” GO identifier: 0005618, P value: 2.19 × 10^-9, and related terms). This is consistent with the previous finding of an enrichment for cell wall related genes among those with local regulatory differences between the RM and BY strains (Chen et al. 2010).

To estimate the importance of transcription factors in trans regulatory evolution versus changes in sensory and signaling molecules or chromatin modifiers, we compared gene pairs which either share a common regulator (Teixeira et al. 2006) but belong to different expression modules (Ihmels et al. 2002) or which belong to the same module(s) but are not known to be regulated by an identical transcription factor. We did not find any significant difference between these two sets of genes regarding the probability of both genes in a pair having expression differences in trans in the same direction, that is, favoring the allele from the same strain (either BY or RM) for both genes (supplementary table S5, Supplementary Material online).

The genes under study were classified into five categories as in McManus et al. (2010), but with some different category names, as follows:

1) Nondifferential: No significant expression difference between the RM and the BY allele in coculture or hybrid. It is the same as the “conserved” category in McManus et al. (2010).

2) Cis only: A significant cis component but no significant trans difference.

3) Trans only: A significant trans component but no significant cis difference.

4) Cis + Trans: The cis and trans components are both significant and work in the same direction (supplementary fig. S1, Supplementary Material online).

5) Cis − Trans: The cis and trans components are both significant but have opposite effects. It can be divided into three subcategories according to the relative magnitudes of the cis and trans components (supplementary fig. S2, Supplementary Material online):

a) “Cis − Trans (t > c)” (i.e., “cis − trans” with a greater absolute trans effect): The log2 expression ratios in coculture and in the hybrid have different signs (the allele which is more highly expressed in the hybrid has lower expression levels in the parental comparison); it is equivalent to the “cis × trans” category in McManus et al. (2010).

b) “Cis − Trans (c = t)”: The cis and the trans component work in opposite directions and have approximately the same absolute value, with no significant expression difference between the two alleles in the parental strains; it is equivalent to the “compensatory” category in McManus et al. (2010).

c) “Cis − Trans (c > t)” (i.e., “cis − trans” with a greater absolute cis effect): The cis and trans components have opposite signs, but the log2 expression ratios in hybrid and in coculture have the same sign (i.e., the same allele is favored in coculture and hybrid, but the absolute expression ratio in hybrid is greater than that in coculture); it was assigned to the “cis + trans” category by McManus et al. (2010).
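The scheme above can be sketched in code. This is a simplified illustration under stated assumptions, not the classifier used in the study: the cis component is taken as the log2 allelic ratio in the hybrid and the trans component as the coculture log2 ratio minus the hybrid log2 ratio (following Wittkopp et al. 2004). The significance flags are assumed to come from separate statistical tests, "nondifferential" is approximated as neither component being significant, and all function and parameter names are hypothetical.

```python
from math import log2


def cis_trans_components(rm_hy: float, by_hy: float,
                         rm_co: float, by_co: float) -> tuple[float, float]:
    """Return (cis, trans) on the log2 scale from allele-specific expression
    in the hybrid (hy) and in the parental coculture (co)."""
    cis = log2(rm_hy / by_hy)            # allelic ratio in the hybrid
    trans = log2(rm_co / by_co) - cis    # parental ratio minus hybrid ratio
    return cis, trans


def classify_regulation(cis: float, trans: float, sig_cis: bool,
                        sig_trans: bool, sig_parental: bool) -> str:
    """Assign a regulatory category; the sig_* flags are assumed inputs
    from upstream tests (hybrid, trans, and coculture comparisons)."""
    if not sig_cis and not sig_trans:
        return "nondifferential"         # simplification, see lead-in
    if sig_cis and not sig_trans:
        return "cis only"
    if sig_trans and not sig_cis:
        return "trans only"
    if cis * trans > 0:
        return "cis + trans"             # both significant, same direction
    # Both significant, opposite directions.
    if not sig_parental:
        return "cis - trans (c = t)"     # effects cancel in the parents
    # The coculture ratio equals cis + trans, so it changes sign relative
    # to the hybrid ratio exactly when |trans| exceeds |cis|.
    if abs(trans) > abs(cis):
        return "cis - trans (t > c)"
    return "cis - trans (c > t)"


# Hypothetical gene: twofold RM bias in the hybrid, none in coculture.
cis, trans = cis_trans_components(200, 100, 100, 100)
label = classify_regulation(cis, trans, True, True, False)
```

In this example the cis and trans components are equal and opposite, so the gene falls into the compensatory subcategory.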

Among the 4,237 genes under study, 2,077 (49%) showed no significant expression difference between the RM and BY alleles in hybrid or in coculture and were classified as nondifferential. Among the 2,160 (51%) “differentially expressed” genes, 583 genes (13.8%) were classified as “cis only,” whereas 893 (21.1%) were classified as “trans only” (fig. 1 and table 1). The group “cis + trans” comprises only 172 genes (4.1%). The total number of genes falling into the “cis − trans” category is 512 (12.1%). Among these, 234 genes (5.5%) have cis and trans effects of approximately equal magnitude and were classified as “cis − trans (c = t).” Only 71 genes (1.7%) in the “cis − trans” category have a larger cis component and were classified as “cis − trans (c > t).” In contrast, 207 genes (4.9%) fall into the “cis − trans (t > c)” category, having a stronger trans effect than a cis effect. These observations show the overall prevalence of trans regulatory changes in our within-species comparison.

Inheritance Mode of Gene Expression Level Versus ASE Differences in Cis and Trans

To study the mode of inheritance, the expression levels of the hybrid and the parental strains were compared for each gene in three comparisons: 1) the expression of the gene in the parental BY strain (“BY”) versus in the parental RM strain (“RM”) in coculture, 2) the expression of the gene in BY versus the total expression level in the hybrid, and 3) the expression of the gene in RM versus the total expression level in the hybrid. A gene was classified as conserved if the expression difference in each of the three comparisons was not statistically significant or was less than 25%. This category comprised 43.7% of the genes (1,852/4,237). The other 2,385 genes (56.3%) were nonconserved and assigned to one of the categories “additive,” “BY dominant,” “RM dominant,” “overdominant,” and “underdominant” (fig. 2). The “additive” category comprised 434 genes (10.2%). Interestingly, 1,048 genes (24.7%) were classified as “RM dominant,” but only approximately half as many (445 genes, 10.5%) were classified as “BY dominant.” In total, 458 genes (10.8%) were misexpressed (overdominant or underdominant) in the hybrid; the underdominant expression pattern was found in 294 genes (6.9%) and the overdominant pattern in 164 genes (3.9%).
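The inheritance categories can be illustrated with a short sketch. The rules for the nonconserved categories are not spelled out in this section, so the decision logic below is an assumption in the spirit of McManus et al. (2010): a hybrid lying between the parents is additive, a hybrid matching one parent is dominant for that parent, and a hybrid above or below both parents is over- or underdominant. The function name, the flags, and the "ambiguous" fallback are hypothetical.

```python
def classify_inheritance(d_by: float, d_rm: float, diff_by: bool,
                         diff_rm: bool, diff_parents: bool) -> str:
    """Classify the inheritance mode of one gene.

    d_by = log2(hybrid) - log2(BY); d_rm = log2(hybrid) - log2(RM).
    Each diff_* flag is assumed to be True only when that comparison is
    both statistically significant and at least 25% in magnitude.
    """
    if not (diff_by or diff_rm or diff_parents):
        return "conserved"
    if diff_by and diff_rm:
        if d_by > 0 and d_rm > 0:
            return "overdominant"    # hybrid above both parents
        if d_by < 0 and d_rm < 0:
            return "underdominant"   # hybrid below both parents
        return "additive"            # hybrid lies between the parents
    if diff_by:
        return "RM dominant"         # hybrid differs from BY, matches RM
    if diff_rm:
        return "BY dominant"         # hybrid differs from RM, matches BY
    # Parents differ but the hybrid matches both; not covered by the text.
    return "ambiguous"


# Hypothetical gene: hybrid expression exceeds both parental levels.
mode = classify_inheritance(1.0, 1.2, True, True, False)
```

The two log2 differences used here correspond to the axes of the scatterplot in figure 2.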

To investigate how the molecular mechanism of gene expression differences influences the inheritance mode of the expression level, we examined whether an inheritance mode is enriched for genes belonging to a specific expression divergence pattern (table 1). Consistent with previous studies (Lemos et al. 2008; McManus et al. 2010), we found a weak but significant relationship between cis regulation and additive inheritance. The median percent cis for genes with additive inheritance (39.84%) was significantly higher than for those with the other inheritance modes (37.46%) (Wilcoxon rank sum test, P value = 0.0014).

Additionally, in agreement with previous findings (Lemos et al. 2008), genes with dominant inheritance (either RM or BY dominant) showed a strong enrichment for trans regulatory variation. The median percent trans was significantly higher for genes with dominant inheritance (68.83%) than for the other genes (59.34%) (Wilcoxon rank sum test, P value < 1.4 × 10^-14).

Furthermore, we investigated whether genes in the “cis − trans” category disproportionately contributed to misexpression in our within-species cross, as previously described for between-species hybrids (Landry et al. 2005; McManus et al. 2010). Indeed, we found an enrichment for misexpressed genes in the “cis − trans” category (Fisher's exact test, P value < 5 × 10^-9; table 2). This relationship remains significant even when both “conserved” and “nondifferential” genes are removed from the analysis (table 2).
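The enrichment test can be sketched from the four counts of table 2 (97 and 415 genes in the cis − trans category, 361 and 3,364 genes in all other categories, each split into misexpressed and other). The implementation below is an illustrative stdlib version of a two-sided Fisher's exact test, assuming the usual rule of summing all tables that are no more probable than the observed one; it is not the code used in the study, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, P value). The P value sums the
    hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x: int) -> int:
        # Unnormalised probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    tail = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, tail / comb(n, col1)


# Counts from table 2: rows are cis - trans vs. other categories,
# columns are misexpressed vs. other inheritance modes.
odds_ratio, p_value = fisher_exact_two_sided(97, 415, 361, 3364)
```

An odds ratio above 1 indicates that misexpressed genes are over-represented in the cis − trans category.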

[Figure 1: (a) scatterplot of the cis component (Y axis) against the trans component (X axis), with genes marked by category (Nondifferential, Cis Only, Trans Only, Cis + Trans, Cis − Trans); (b) bar graph of gene counts per category: Nondifferential, 2,077 (49.02%); Cis Only, 583 (13.76%); Trans Only, 893 (21.08%); Cis + Trans, 172 (4.06%); Cis − Trans, 512 (12.08%).]

FIG. 1. Classification of genes according to cis or trans effects. (a) Scatterplot. Y axis: the cis component [the log2-ratio of reads in the hybrid sample mapped to the RM and BY genomes: log2(ecis) = log2(eHy) = log2(RMHy/BYHy)]. X axis: the trans component [difference between parental and hybrid log2-transformed ASE ratios: log2(etrans) = log2(eCo/ecis) = log2(RMCo/BYCo) − log2(RMHy/BYHy)]. Notations: RMHy, expression level of the RM allele in the hybrid; BYHy, expression level of the BY allele in the hybrid; eHy, ASE ratio in the hybrid; RMCo, expression level of the RM allele in the coculture; BYCo, expression level of the BY allele in the coculture; and eCo, ASE ratio in the coculture. (b) The bar graph shows the number of genes in each cis/trans category.

Table 1. Number of Genes Falling into Different Combinations of Inheritance and cis/trans Categories.

Inheritance Mode    Nondifferential    Trans Only    Cis Only    Cis + Trans    Cis − Trans    Sum
Conserved           1,265              199           165         6              217            1,852
RM dominant         371                376           149         66             86             1,048
BY dominant         147                116           85          24             73             445
Additive            53                 134           141         67             39             434
Overdominant        69                 11            23          1              60             164
Underdominant       172                57            20          8              37             294
Sum                 2,077              893           583         172            512            4,237

Different Constraints on Cis and Trans Regulatory Components

We divided genes into different classes expected to be under relatively weak or strong selective constraint using three criteria: 1) the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω): genes with an ω higher than the median value (~0.09) were classified as less conserved and those with a lower ω as more conserved; 2) connectivity in protein–protein interaction (PPI) networks (Stark et al. 2006; Collins et al. 2007): genes with more than the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners were classified as less constrained; and 3) essentiality: essential genes versus genes with a fitness > 0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio p/(p + d) [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4) (the divergence data were obtained from Tirosh et al. 2009). The ratios pcis/(pcis + dcis) and ptrans/(ptrans + dtrans) were then each compared between categories with expected strong selective constraints or with expected weak selective constraints for each of the three criteria of selective constraints (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of p/(p + d) in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher's exact test, P value = 0.37).
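The p/(p + d) comparison can be sketched as follows. The code is an illustration rather than the analysis used in the study: the rank-sum test uses a normal approximation with midranks and a tie-corrected variance, without continuity correction, so its P values can differ slightly from those of an exact or continuity-corrected test. The function names and the example values are hypothetical.

```python
from math import erfc, sqrt


def polymorphism_ratio(p: float, d: float) -> float:
    """p / (p + d) from a within-species (p) and a between-species (d)
    component; absolute values are taken as in the text. Genes with
    p = d = 0 must be excluded beforehand."""
    p, d = abs(p), abs(d)
    return p / (p + d)


def rank_sum_test(x: list[float], y: list[float]) -> tuple[float, float]:
    """Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for x, two-sided P value). Undefined when all
    pooled values are identical, because the variance is then zero."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                        # pooled[i:j] is one tie group
        midrank = (i + 1 + j) / 2         # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean) / sqrt(var)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical trans ratios for more and less constrained genes.
constrained = [0.1, 0.2, 0.3]
relaxed = [0.4, 0.5, 0.6, 0.7]
u_stat, p_value = rank_sum_test(constrained, relaxed)
```

A ratio near 1 means the component is mostly polymorphism, and a ratio near 0 means it is mostly divergence.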

Misexpressed inheritance modes are slightly under-represented among essential genes for our within-species data; this difference is not statistically significant in comparison with all other inheritance categories (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.35). However, misexpressed genes are significantly less likely to be essential if the comparison is restricted only to genes with a conserved total expression level (supplementary table S7, Supplementary Material online; Fisher's exact test, P value = 0.022). This tendency for misexpressed genes to be less essential appears to be largely due to an enrichment of nonessential genes among those with underdominant inheritance. Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (table 3; Fisher's exact test, P value < 9.763 × 10^-9) and also in comparison with overdominant genes only (table 3; Fisher's exact test, P value = 1.14 × 10^-13).

As genes in the “cis − trans” category are more likely to be misexpressed (table 2), we tested whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the “cis − trans” category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher's exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (table 4; P value = 0.0003). In contrast, compared with “cis + trans” genes only, this enrichment for nonessential genes is not significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

[Figure 2: (a) scatterplot with log2(Hybrid) − log2(BY) on the X axis and log2(Hybrid) − log2(RM) on the Y axis, with genes marked by inheritance category; (b) bar graph of gene counts per category: Conserved, 1,852 (43.71%); RM dominant, 1,048 (24.73%); BY dominant, 445 (10.5%); Additive, 434 (10.24%); Overdominant, 164 (3.87%); Underdominant, 294 (6.94%).]

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the X axis and RM on the Y axis). (b) The bar graph shows the number of genes in each inheritance category.

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the “Cis − Trans” Category.

Regulatory Effect        Misexpressed    Other
Cis − trans (a)          97              415
Other categories (a)     361             3,364
Cis − trans (b)          97              198
Other categories (b)     120             1,158

(a) Misexpressed genes are enriched for genes in the “cis − trans” category; Fisher's exact test, P value = 4.99 × 10^-9.
(b) Indicates that both the “conserved” and “nondifferential” genes are removed from the analysis; Fisher's exact test, P value < 2.2 × 10^-16.

DiscussionCis and trans regulatory factors differ in the way in which theyinfluence gene expression levels and their inheritance pat-terns Thus they may be subjected to different selection pres-sures and this should be reflected in the way gene regulatorynetworks evolve It remains difficult to tease apart these dif-ferent gene regulatory mechanisms and their evolutionarypathways However an elegant approach is the use ofhybrid experiments and the comparison of cis and trans ef-fects in crosses between and within species (Wittkopp et al2004) although some of the underlying assumptions mightlead to an overestimation of the relative contribution of cischanges to gene expression differences (Takahasi et al 2011)

Our new data set provides more power to detect geneexpression differences than our previous data (Emersonet al 2010) Indeed only 487 (10112078) of the geneswith a significant expression difference in coculture in thenew data set were also significantly different in the old dataset but 782 (10111293) of the genes with a significant

expression difference in the old data set are also significantlydifferent in the new data set (supplementary table S8Supplementary Material online) For ASE differences in thehybrid the respective numbers are 276 (3051106) and628 (305486) (supplementary table S8 SupplementaryMaterial online) The relatively small number of geneswhich were differentially expressed in the old data set butnot in the new one (282 in coculture and 181 in the hybrid)could be due to variation between biological or technicalreplicates and stochastic fluctuations (Busby et al 2011)

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, trans polymorphism; d, trans divergence. [Figure not reproduced: nine density plots of trans p/(p + d), x axis from 0.0 to 1.0. Panel titles, in order (a)–(i): Low ω; High ω; All genes; Essential genes; Nonessential genes; Essential and nonessential genes combined; High number of interaction partners; No known interaction partners in PPI network; Both groups combined.]


FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence. [Figure not reproduced: nine density plots of cis p/(p + d), x axis from 0.0 to 1.0. Panel titles, in order (a)–(i): Low ω; High ω; All genes; Essential genes; Nonessential genes; Essential and nonessential genes combined; High number of interaction partners; No known interaction partners in PPI network; Both groups combined.]

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

                                Inheritance Mode
Essentiality                    Underdominant Only    All Other Inheritance Categories
                                                      Total        Overdominant    Nonmisexpressed
Essential                       24 (a,b)              831 (a)      59 (b)          772
Nonessential (fitness > 0.85)   243 (a,b)             2,783 (a)    86 (b)          2,697

a Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (Fisher’s exact test, P value < 9.763 × 10⁻⁹).
b Essential genes are significantly less likely to be underdominant than overdominant (Fisher’s exact test, P value = 1.14 × 10⁻¹³).

Table 4. Enrichment of the “Cis × Trans” Category in Nonessential Genes.

                                Regulatory Effect within Species      Regulatory Effect between Species
                                (RM vs. BY)                           (S. cerevisiae vs. S. paradoxus)
Essentiality                    Cis × Trans    Other Categories       Cis × Trans    Other Categories
                                               (Cis + Trans)                         (Cis + Trans)
Essential                       91             764 (34)               166            689 (182)
Nonessential (fitness > 0.85)   392            2,634 (123)            766            2,260 (776)

NOTE: Cis–trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher’s exact test, P value = 0.078; between species: Fisher’s exact test, P value = 0.0003). But there is no significant difference between the “cis × trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher’s exact test, P value = 0.49; between species: Fisher’s exact test, P value = 0.51).
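The comparisons in tables 3 and 4 are 2 × 2 contingency tests. The source does not say which software computed Fisher’s exact test, so the following is only a minimal standard-library sketch of the usual two-sided version (summing the probabilities of all tables with the same margins that are no more probable than the observed one). The function name is hypothetical, and the computed values may differ slightly from the published ones depending on implementation details.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for floating-point ties
    )
    return min(1.0, p_value)

# Counts from table 4: (cis x trans, other categories) for essential
# genes in the first row and nonessential genes in the second row.
p_within = fisher_exact_two_sided(91, 764, 392, 2634)
p_between = fisher_exact_two_sided(166, 689, 766, 2260)
print(f"within species:  P = {p_within:.3g}")
print(f"between species: P = {p_between:.3g}")
```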


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared to highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, while those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.’s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis × trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis × trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis × trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis × trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid samples) was sequenced in one lane of paired-end (PE) sequencing on an Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project, “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1, CRAN) with the methodology described by McManus et al. (2010).
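The multiplicative model above can be illustrated with a short sketch. The read counts are invented for illustration, the function name is hypothetical, and the way the gDNA-based normalization enters (dividing the coculture ratio by an assumed RM/BY cell-number ratio) is an assumption, since the text does not give the exact form of the normalization parameter. The likelihood ratio test is not reproduced here.

```python
from math import log2

def cis_trans_log2(rm_hybrid, by_hybrid, rm_coculture, by_coculture, gdna_ratio=1.0):
    """Split a parental expression difference into cis and trans parts (log2 scale).

    gdna_ratio is an assumed RM/BY cell-number ratio in the coculture,
    used here to rescale the coculture read ratio.
    """
    e_cis = rm_hybrid / by_hybrid                       # ASE ratio in the hybrid
    e_co = (rm_coculture / by_coculture) / gdna_ratio   # normalized parental ratio
    e_trans = e_co / e_cis                              # multiplicative model
    return {"cis": log2(e_cis), "trans": log2(e_trans), "total": log2(e_co)}

# Hypothetical read counts for one gene.
effects = cis_trans_log2(rm_hybrid=400, by_hybrid=200, rm_coculture=800, by_coculture=200)
print(effects)
```

On the log2 scale the cis and trans terms add up to the total parental difference, which is the additive form of the model stated in the text.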


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
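The hypergeometric enrichment P value described above can be sketched with the standard library. This is not FunSpec’s code: the function name, the gene counts, and the number of categories tested are hypothetical, and the upper-tail form P(X ≥ overlap) is assumed as the usual definition of enrichment.

```python
from math import comb

def hypergeom_enrichment_p(n_genome, n_category, n_list, n_overlap):
    """P(X >= n_overlap) when n_list genes are drawn without replacement
    from n_genome genes, n_category of which carry the annotation."""
    total = comb(n_genome, n_list)
    upper = min(n_category, n_list)
    return sum(
        comb(n_category, k) * comb(n_genome - n_category, n_list - k) / total
        for k in range(n_overlap, upper + 1)
    )

# Hypothetical numbers: 6,000 genes, 30 annotated, 200 in the list, 8 overlapping.
p_raw = hypergeom_enrichment_p(6000, 30, 200, 8)

# Bonferroni correction: multiply by the number of categories tested (hypothetical).
n_categories_tested = 1000
p_bonferroni = min(1.0, p_raw * n_categories_tested)
print(f"raw P = {p_raw:.3g}, Bonferroni-corrected P = {p_bonferroni:.3g}")
```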

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects present in one of the coexpression modules and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs which either share a common regulator but belong to different modules or which belong to the same module(s) but are not known to be regulated by any identical transcription factor.

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels are considered as “similar” if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1, CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
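The classification rules listed above can be sketched as a small decision function. This is an illustrative reading of the text, not the authors’ R code: the function names are hypothetical, the significance flags are assumed to come from the exact binomial tests, expression levels are assumed to be positive normalized values, and the “ambiguous” label for cases the text does not describe is an addition of this sketch. The cutoff calculation is one possible reading of how a per-test threshold near 1.696% yields a 5% chance of a false positive across three comparisons.

```python
def is_different(x, y, significant, min_fold=1.25):
    """Two levels differ only if the test is significant and the fold change exceeds min_fold."""
    fold = max(x, y) / min(x, y)  # assumes positive expression levels
    return significant and fold > min_fold

def classify_inheritance(hybrid, rm, by, sig_hy_rm, sig_hy_by, sig_rm_by):
    """Assign an inheritance mode from three expression levels and three test results."""
    hy_vs_rm = is_different(hybrid, rm, sig_hy_rm)
    hy_vs_by = is_different(hybrid, by, sig_hy_by)
    rm_vs_by = is_different(rm, by, sig_rm_by)
    if not (hy_vs_rm or hy_vs_by or rm_vs_by):
        return "conserved"
    if hy_vs_rm and hy_vs_by:
        if hybrid > max(rm, by):
            return "overdominant"
        if hybrid < min(rm, by):
            return "underdominant"
        return "additive"          # hybrid lies between the parents
    if hy_vs_rm:
        return "BY dominant"       # hybrid resembles BY, differs from RM
    if hy_vs_by:
        return "RM dominant"       # hybrid resembles RM, differs from BY
    return "ambiguous"             # parents differ, hybrid similar to both (not described in the text)

# Per-test cutoff giving a 5% chance of at least one false positive
# across three comparisons, assuming independent tests.
per_test_cutoff = 1 - (1 - 0.05) ** (1 / 3)
print(f"per-test cutoff = {per_test_cutoff:.4%}")
print(classify_inheritance(300, 100, 200, True, True, True))
```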

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio $p/(p + d)$ for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set’s median) against no known interaction partner; 2) sequence divergence, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower vs. higher than the data set’s median of 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/[p_{trans} + d_{trans}]$) was then compared between categories with strong selective constraint proxies (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
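The comparison described above can be sketched in two steps: computing p/(p + d) per gene and comparing two gene groups with a rank-sum test. The authors used R; this standard-library Python version is only an illustration. It assumes that p and d enter as effect magnitudes (absolute values), which the text does not state explicitly, and it uses the normal approximation without tie correction, which is crude for small samples. The example values are invented, and the bootstrap Kolmogorov–Smirnov test is not reproduced.

```python
from math import erfc, sqrt

def polymorphism_fraction(p, d):
    """p/(p + d) using effect magnitudes (assumed: absolute log2 effects, p + d > 0)."""
    p, d = abs(p), abs(d)
    return p / (p + d)

def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney) test: returns (U for x, two-sided P value)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank shared by tied values
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return u1, erfc(abs(z) / sqrt(2))  # two-sided normal-approximation P value

# Hypothetical (p, d) magnitudes for a constrained and a less constrained gene group.
constrained = [polymorphism_fraction(p, d) for p, d in [(0.1, 0.9), (0.2, 0.6), (0.3, 0.7)]]
relaxed = [polymorphism_fraction(p, d) for p, d in [(0.6, 0.4), (0.8, 0.3), (0.5, 0.2)]]
u_stat, p_value = rank_sum_test(constrained, relaxed)
print(f"U = {u_stat}, P = {p_value:.3g}")
```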

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.
Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.
Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.
Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.
Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.
Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.
Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.
Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.
Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.
Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.
Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.
Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.
Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.
Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.
Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.
Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.
Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.
Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.
King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.
Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.
Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.
Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.
Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.
Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.
Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.
Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.
McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.
Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.
Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.
Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.
Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.
Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.
Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.
Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.
Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the Matching package for R. J Stat Softw. 42(7):1–52.
Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.
Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.
Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.
Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.
Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.
Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.
Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.
Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.
Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.
Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.
Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.
Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.
Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.

transcription start site (i.e., OPN genes) are more likely to be differentially expressed than those with such a region (i.e., DPN genes) (the gene sets were defined in Tirosh et al. 2008). This relationship is observed in trans and in cis (supplementary table S4, Supplementary Material online) but is stronger for the trans effect (Fisher’s exact test: trans P value = 3.993 × 10⁻⁵; cis P value = 0.0027). We tested whether genes with significant and consistent cis or trans effects (in the old and the new data set) were enriched in specific biological processes or cellular components in the Gene Ontology (GO) annotation, using the FunSpec analysis tool (Robinson et al. 2002). Genes with a significant trans component were enriched in mitochondrial electron transport (GO term “mitochondrial electron transport, ubiquinol to cytochrome c” [GO identifier: 0006122], P value 1.09 × 10⁻⁷, and overlapping terms) and in the biosynthesis of ergosterol (GO term “ergosterol biosynthetic process” [GO identifier: 0006696], P value 1.8 × 10⁻⁷, and overlapping terms), a major component of the fungal cell membrane. Genes with a significant cis effect were enriched in oxidation/reduction among biological processes (GO term “oxidation–reduction process,” GO identifier: 0055114, P value 9.45 × 10⁻⁷, and overlapping terms) and in the cell wall among cellular components (GO term “cell wall,” GO identifier: 0005618, P value 2.19 × 10⁻⁹, and related terms). This is consistent with the previous finding of an enrichment for cell wall-related genes among those with local regulatory differences between the RM and BY strains (Chen et al. 2010). To estimate the importance of transcription factors in trans regulatory evolution versus changes in sensory and signaling molecules or chromatin modifiers, we compared gene pairs that either share a common regulator (Teixeira et al. 2006) but belong to different expression modules (Ihmels et al. 2002) or belong to the same module(s) but are not known to be regulated by an identical transcription factor. We did not find any significant difference between these two sets of genes regarding the probability of both genes in a pair having expression differences in trans in the same direction, that is, favoring the allele from the same strain (either BY or RM) for both genes (supplementary table S5, Supplementary Material online).

The genes under study were classified into five categories as in McManus et al. (2010), but with some different category names, as follows:

1) Nondifferential: No significant expression difference between the RM and the BY allele in coculture or hybrid. It is the same as the “conserved” category in McManus et al. (2010).

2) Cis only: A significant cis component but no significant trans difference.

3) Trans only: A significant trans component but no significant cis difference.

4) Cis + Trans: The cis and trans components are both significant and work in the same direction (supplementary fig. S1, Supplementary Material online).

5) Cis − Trans: The cis and trans components are both significant but have opposite effects. It can be divided into three subcategories according to the relative magnitudes of the cis and trans components (supplementary fig. S2, Supplementary Material online):

a) “Cis − Trans (t > c)” (i.e., “cis − trans” with a greater absolute trans effect): The log2 expression ratios in coculture and in the hybrid have different signs (the allele that is more highly expressed in the hybrid has lower expression levels in the parental comparison); it is equivalent to the “cis × trans” category in McManus et al. (2010).

b) “Cis − Trans (c = t)”: The cis and the trans components work in opposite directions and have approximately the same absolute value; there is no significant expression difference between the two alleles in the parental strains; it is equivalent to the “compensatory” category in McManus et al. (2010).

c) “Cis − Trans (c > t)” (i.e., “cis − trans” with a greater absolute cis effect): The cis and trans components have opposite signs, but the log2 expression ratios in hybrid and in coculture have the same sign (i.e., the same allele is favored in coculture and hybrid, but the absolute expression ratio in hybrid is greater than that in coculture); it was assigned to the “cis + trans” category by McManus et al. (2010).

Among the 4,237 genes under study, 2,077 (49%) showed no significant expression difference between the RM and BY alleles in hybrid or in coculture and were classified as nondifferential. Among the 2,160 (51%) “differentially expressed” genes, 583 genes (13.8%) were classified as “cis only,” whereas 893 (21.1%) were classified as “trans only” (fig. 1 and table 1). The group “cis + trans” comprises only 172 genes (4.1%). The total number of genes falling into the “cis − trans” category is 512 (12.1%). Among these, 234 genes (5.5%) have cis and trans effects of approximately equal magnitude and were classified as “cis − trans (c = t).” Only 71 genes (1.7%) in the “cis − trans” category have a larger cis component and were classified as “cis − trans (c > t).” In contrast, 207 genes (4.9%) fall into the “cis − trans (t > c)” category, having a stronger trans effect than cis effect. These observations show the overall prevalence of trans regulatory changes in our within-species comparison.
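The category definitions above amount to a small decision procedure, which the following Python sketch illustrates. It is not the authors' implementation: the function name, the boolean significance flags (assumed to come from whatever statistical test is applied upstream), and the simplification that a gene is “nondifferential” when neither component is significant are all assumptions made for illustration.

```python
def classify_regulation(log2_hybrid, log2_coculture, sig_cis, sig_trans, sig_parental):
    """Assign a gene to a regulatory category (illustrative sketch only).

    log2_hybrid    -- log2(RM/BY) allelic ratio in the hybrid (the cis component)
    log2_coculture -- log2(RM/BY) ratio between the parental strains in coculture
    sig_cis, sig_trans, sig_parental -- hypothetical significance flags for the
        cis component, the trans component, and the parental difference
    """
    cis = log2_hybrid
    trans = log2_coculture - log2_hybrid  # parental ratio minus hybrid ratio

    # Assumption: "nondifferential" is approximated as "neither component significant".
    if not sig_cis and not sig_trans:
        return "nondifferential"
    if sig_cis and not sig_trans:
        return "cis only"
    if sig_trans and not sig_cis:
        return "trans only"

    # Both components are significant from here on.
    if cis * trans > 0:
        return "cis + trans"  # same direction
    if not sig_parental:
        return "cis - trans (c = t)"  # effects cancel in the parents
    if log2_coculture * log2_hybrid < 0:
        return "cis - trans (t > c)"  # sign flips between coculture and hybrid
    return "cis - trans (c > t)"  # same sign, larger ratio in the hybrid


print(classify_regulation(1.0, -0.5, True, True, True))
```

The sign comparison between the coculture and hybrid ratios is what separates the “t > c” and “c > t” subcategories, mirroring items a) and c) in the list.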

Inheritance Mode of Gene Expression Level Versus ASE Differences in Cis and Trans

To study the mode of inheritance, the expression levels of the hybrid and the parental strains were compared for each gene in three comparisons: 1) the expression of the gene in the parental BY strain (“BY”) versus in the parental RM strain (“RM”) in coculture, 2) the expression of the gene in BY versus the total expression level in the hybrid, and 3) the expression of the gene in RM versus the total expression level in the hybrid. A gene was classified as conserved if the expression difference in each of the three comparisons was not statistically significant or was less than 25%. This category comprised 43.7% of the genes (1,852/4,237). The other 2,385 genes (56.3%) were nonconserved and assigned to one of the categories “additive,” “BY dominant,” “RM dominant,” “overdominant,” and “underdominant” (fig. 2). The “additive” category comprised 434 genes (10.2%). Interestingly, 1,048 genes (24.7%) were classified as “RM dominant,” but only approximately half as many (445 genes, 10.5%) were classified as “BY dominant.” In total, 458 genes (10.8%) were misexpressed (overdominant or underdominant) in the hybrid; the underdominant expression pattern was found in 294 genes (6.9%) and the overdominant pattern in 164 genes (3.9%).
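The text does not spell out the exact decision rules for the non-conserved inheritance categories, so the sketch below uses one plausible rule set to show how the three pairwise comparisons can map onto the six categories. The thresholding on a 25% difference alone (ignoring the statistical test), the function name, and the “ambiguous” fallback are assumptions, not the authors' procedure.

```python
import math

# Assumption: a difference "counts" if it exceeds 25% (|log2 ratio| > log2(1.25)).
# The statistical significance test applied in the study is omitted here.
THRESHOLD = math.log2(1.25)


def classify_inheritance(log2_by, log2_rm, log2_hybrid, threshold=THRESHOLD):
    """Classify the inheritance mode from log2 expression levels (sketch)."""
    d_by = log2_hybrid - log2_by   # hybrid versus BY parent
    d_rm = log2_hybrid - log2_rm   # hybrid versus RM parent
    d_par = log2_rm - log2_by      # RM parent versus BY parent

    def differs(x):
        return abs(x) > threshold

    if not (differs(d_by) or differs(d_rm) or differs(d_par)):
        return "conserved"
    if differs(d_by) and differs(d_rm):
        if d_by > 0 and d_rm > 0:
            return "overdominant"    # hybrid above both parents
        if d_by < 0 and d_rm < 0:
            return "underdominant"   # hybrid below both parents
        return "additive"            # hybrid between the parents
    if differs(d_by):
        return "RM dominant"         # hybrid resembles RM, differs from BY
    if differs(d_rm):
        return "BY dominant"         # hybrid resembles BY, differs from RM
    return "ambiguous"               # parents differ, hybrid close to both


print(classify_inheritance(10.0, 12.0, 12.0))
```

In this sketch a gene is “RM dominant” when the hybrid level is indistinguishable from RM but differs from BY, and misexpression corresponds to the hybrid lying outside the range spanned by both parents.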

To investigate how the molecular mechanism of gene expression differences influences the inheritance mode of the expression level, we examined whether an inheritance mode is enriched for genes belonging to a specific expression divergence pattern (table 1). Consistent with previous studies (Lemos et al. 2008; McManus et al. 2010), we found a weak but significant relationship between cis regulation and additive inheritance. The median percent cis for genes with additive inheritance (39.84%) was significantly higher than for those with the other inheritance modes (37.46%) (Wilcoxon rank sum test, P value = 0.0014).

Additionally, in agreement with previous findings (Lemos et al. 2008), genes with dominant inheritance (either RM or BY dominant) showed a strong enrichment for trans regulatory variation. The median percent trans was significantly higher for genes with dominant inheritance (68.83%) than for the other genes (59.34%) (Wilcoxon rank sum test, P value < 1.4 × 10⁻¹⁴).

Furthermore, we investigated whether genes in the “cis − trans” category disproportionately contributed to misexpression in our within-species cross, as previously described for between-species hybrids (Landry et al. 2005; McManus et al. 2010). Indeed, we found an enrichment for misexpressed genes in the “cis − trans” category (Fisher’s exact test, P value < 5 × 10⁻⁹; table 2). This relationship remains significant even when both “conserved” and “nondifferential” genes are removed from the analysis (table 2).
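As a rough check on the direction and strength of this enrichment, the counts reported in table 2 (all genes) can be fed into a one-sided hypergeometric tail computation. This is only a sketch using the Python standard library: it computes a one-sided tail probability, which need not equal the two-sided Fisher's exact test P value reported in the paper, and the helper name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n items are drawn without replacement from N,
    of which K are 'successes' (exact, using integer arithmetic)."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Counts from table 2 (all genes).
a, b = 97, 415     # cis - trans genes: misexpressed, other
c, d = 361, 3364   # all other categories: misexpressed, other

odds_ratio = (a * d) / (b * c)
p_one_sided = hypergeom_upper_tail(a, K=a + b, n=a + c, N=a + b + c + d)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"one-sided tail probability = {p_one_sided:.2e}")
```

An odds ratio above 1 indicates that misexpressed genes are over-represented in the “cis − trans” category relative to the remaining genes.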

Different Constraints on Cis and Trans Regulatory Components

We divided genes into different classes expected to be under relatively weak or strong selective constraint using three criteria: 1) the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω): genes with an ω higher than the median value (~0.09) were classified as less conserved and those with a lower ω as more conserved; 2) connectivity in protein–protein interaction (PPI) networks (Stark et al. 2006; Collins et al. 2007): genes with more than

FIG. 1. Classification of genes according to cis or trans effects. (a) Scatterplot. Y axis: the cis component [the log2 ratio of reads in the hybrid sample mapped to the RM and BY genomes: log2(e_cis) = log2(e_Hy) = log2(RM_Hy/BY_Hy)]. X axis: the trans component [difference between parental and hybrid log2-transformed ASE ratios: log2(e_trans) = log2(e_Co/e_cis) = log2(RM_Co/BY_Co) − log2(RM_Hy/BY_Hy)]. Notations: RM_Hy, expression level of the RM allele in the hybrid; BY_Hy, expression level of the BY allele in the hybrid; e_Hy, ASE ratio in the hybrid; RM_Co, expression level of the RM allele in the coculture; BY_Co, expression level of the BY allele in the coculture; and e_Co, ASE ratio in the coculture. (b) The bar graph shows the number of genes in each cis/trans category: Nondifferential, 2,077 (49.02%); Cis Only, 583 (13.76%); Trans Only, 893 (21.08%); Cis + Trans, 172 (4.06%); Cis − Trans, 512 (12.08%).
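The two quantities defined in the legend of figure 1 can be computed directly from allele-specific read counts. The short Python sketch below follows those definitions; the function and argument names are illustrative, and any normalization or filtering applied in the actual analysis is left out.

```python
import math


def cis_trans_components(rm_hy, by_hy, rm_co, by_co):
    """Return (log2 cis, log2 trans) from allele-specific expression levels.

    rm_hy, by_hy -- expression of the RM and BY alleles in the hybrid
    rm_co, by_co -- expression of RM and BY in the coculture (parents)
    """
    e_hy = rm_hy / by_hy                      # ASE ratio in the hybrid
    e_co = rm_co / by_co                      # expression ratio in coculture
    log2_cis = math.log2(e_hy)                # cis component
    log2_trans = math.log2(e_co) - log2_cis   # trans = parental minus hybrid
    return log2_cis, log2_trans


# Made-up example: RM allele 2-fold higher in the hybrid, 4-fold in coculture.
print(cis_trans_components(200, 100, 400, 100))
```

With these made-up counts, half of the fourfold parental difference (on the log2 scale) is attributed to cis and the other half to trans.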

Table 1. Number of Genes Falling into Different Combinations of Inheritance and cis/trans Categories.

Inheritance mode | Nondifferential | Trans Only | Cis Only | Cis + Trans | Cis − Trans | Sum
Conserved | 1,265 | 199 | 165 | 6 | 217 | 1,852
RM dominant | 371 | 376 | 149 | 66 | 86 | 1,048
BY dominant | 147 | 116 | 85 | 24 | 73 | 445
Additive | 53 | 134 | 141 | 67 | 39 | 434
Overdominant | 69 | 11 | 23 | 1 | 60 | 164
Underdominant | 172 | 57 | 20 | 8 | 37 | 294
Sum | 2,077 | 893 | 583 | 172 | 512 | 4,237

NOTE. Columns give the regulatory effect category; rows give the inheritance mode.

the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners were classified as less constrained; and 3) essentiality: essential genes versus genes with a fitness > 0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio p/(p + d) [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4) (the divergence data were obtained from Tirosh et al. 2009). The ratios p_cis/(p_cis + d_cis) and p_trans/(p_trans + d_trans) were then each compared between categories with expected strong selective constraints and those with expected weak selective constraints, for each of the three criteria of selective constraint (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of p/(p + d) in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.37).
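The ratio p/(p + d) used above is simple to compute per gene and then summarize per constraint class. The following Python sketch shows the calculation on made-up values; the function names and the toy data are hypothetical, and the rank-sum and KS tests used in the study are not reproduced here.

```python
from statistics import median


def polymorphism_ratio(p, d):
    """Return |p| / (|p| + |d|), or None when both components are zero.

    p -- within-species (polymorphism) cis or trans component
    d -- between-species (divergence) cis or trans component
    """
    p, d = abs(p), abs(d)
    total = p + d
    if total == 0:
        return None  # ratio undefined
    return p / total


def median_ratio(pairs):
    """Median p/(p + d) over (p, d) pairs, skipping undefined ratios."""
    ratios = [polymorphism_ratio(p, d) for p, d in pairs]
    return median(r for r in ratios if r is not None)


# Made-up (p, d) values for two hypothetical gene classes.
less_constrained = [(1.0, 1.0), (3.0, -1.0), (0.5, 1.5)]
more_constrained = [(0.25, 0.75), (-1.0, 3.0), (1.0, 1.0)]

print(median_ratio(less_constrained), median_ratio(more_constrained))
```

A higher median in the less constrained class corresponds to the shift toward polymorphism described for the trans component; values near 0 indicate a shift toward divergence.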

Misexpressed inheritance modes are slightly under-represented among essential genes for our within-species data; this difference is not statistically significant in comparison with all other inheritance categories (supplementary table S7, Supplementary Material online; Fisher’s exact test, P value = 0.35). However, misexpressed genes are significantly less likely to be essential if the comparison is restricted to genes with a conserved total expression level (supplementary table S7, Supplementary Material online; Fisher’s exact test, P value = 0.022). This tendency for misexpressed genes to be less essential appears to be largely due to an enrichment of nonessential genes among those with underdominant inheritance. Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (table 3; Fisher’s exact test, P value < 9.763 × 10⁻⁹) and also in comparison with overdominant genes only (table 3; Fisher’s exact test, P value = 1.14 × 10⁻¹³).

As genes in the “cis − trans” category are more likely to be misexpressed (table 2), we tested whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the “cis − trans” category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher’s exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (2009) (table 4; P value = 0.0003). In contrast, compared with “cis + trans” genes only, this enrichment for nonessential genes is not

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the X axis, log2(Hybrid) − log2(BY); RM on the Y axis, log2(Hybrid) − log2(RM)). (b) The bar graph shows the number of genes in each inheritance category: Conserved, 1,852 (43.71%); RM dominant, 1,048 (24.73%); BY dominant, 445 (10.5%); Additive, 434 (10.24%); Overdominant, 164 (3.87%); Underdominant, 294 (6.94%).

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the “Cis − Trans” Category.

Regulatory effect | Misexpressed | Other
Cis − trans (a) | 97 | 415
Other categories (a) | 361 | 3,364
Cis − trans (b) | 97 | 198
Other categories (b) | 120 | 1,158

NOTE. Columns give the inheritance mode.
(a) Misexpressed genes are enriched for genes in the “cis − trans” category; Fisher’s exact test, P value = 4.99 × 10⁻⁹.
(b) Indicates that both the “conserved” and “nondifferential” genes are removed from the analysis; Fisher’s exact test, P value < 2.2 × 10⁻¹⁶.

significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1,011/2,078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1,011/1,293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1,106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes that were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f, essentiality; g–i, connectivity in PPI networks). Each panel is a density plot with trans p/(p + d) on the X axis and density on the Y axis. Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined. Notations: p, trans polymorphism; d, trans divergence.

FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f, essentiality; g–i, connectivity in PPI networks). Each panel is a density plot with cis p/(p + d) on the X axis and density on the Y axis. Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined. Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

Essentiality | Underdominant Only | All Other: Total | All Other: Overdominant | All Other: Nonmisexpressed
Essential | 24 (a, b) | 831 (a) | 59 (b) | 772
Nonessential (fitness > 0.85) | 243 (a, b) | 2,783 (a) | 86 (b) | 2,697

NOTE. Columns give the inheritance mode.
(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories; Fisher’s exact test, P value < 9.763 × 10⁻⁹.
(b) Essential genes are significantly less likely to be underdominant than overdominant; Fisher’s exact test, P value = 1.14 × 10⁻¹³.

Table 4. Enrichment of the “Cis − Trans” Category in Nonessential Genes.

Essentiality | Within species (RM vs. BY): Cis − Trans | Within species: Other Categories (Cis + Trans) | Between species (S. cerevisiae vs. S. paradoxus): Cis − Trans | Between species: Other Categories (Cis + Trans)
Essential | 91 | 764 (34) | 166 | 689 (182)
Nonessential (fitness > 0.85) | 392 | 2,634 (123) | 766 | 2,260 (776)

NOTE. Cis − trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher’s exact test, P value = 0.078; between species: Fisher’s exact test, P value = 0.0003). But there is no significant difference between the “cis − trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher’s exact test, P value = 0.49; between species: Fisher’s exact test, P value = 0.51).

to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, whereas those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.’s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations in two or more regulatory loci occur that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis − trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be under-represented in the “cis − trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis − trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis − trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis − trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).

Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid experiments) was put in one lane of paired-end (PE) sequencing on the Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on an agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40 and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae, last accessed April 2008). In addition, 893 error sites against our strains (309 from the BY strain and 584 from the RM strain) that were detected previously (Emerson et al. 2010) were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome to those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of each hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1; CRAN) with the methodology described by McManus et al. (2010).
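The decomposition above can be illustrated with a minimal Python sketch. The function name, the argument names, and the way the gDNA-based normalization parameter is applied (dividing the coculture ratio by the RM/BY gDNA ratio) are assumptions made for illustration only; the likelihood ratio tests and FDR control are not shown.

```python
import math

def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_ratio=1.0):
    """Return (log2 e_cis, log2 e_trans) from allele-specific read counts.

    gdna_ratio is a hypothetical normalization parameter: the RM/BY ratio of
    gDNA reads in the coculture, used here to correct for unequal cell numbers.
    """
    e_cis = rm_hy / by_hy                  # ASE ratio in the hybrid (e_Hy)
    e_co = (rm_co / by_co) / gdna_ratio    # ASE ratio in the coculture, corrected
    e_trans = e_co / e_cis                 # multiplicative model: e_Co = e_trans * e_cis
    return math.log2(e_cis), math.log2(e_trans)

# Toy counts (hypothetical): RM is 2-fold higher in the hybrid, 4-fold in coculture.
log2_cis, log2_trans = cis_trans_components(rm_hy=200, by_hy=100, rm_co=400, by_co=100)
```

On the log2 scale the two components add up to the coculture ratio, which is the additivity stated in the text.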

Schaefke et al. doi:10.1093/molbev/mst114. MBE. Downloaded from http://mbe.oxfordjournals.org/ at Academia Sinica, Life Science Library on August 22, 2013.

Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
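As a rough illustration of the two calculations in this paragraph, the following Python sketch computes a SNP density and an upper-tail hypergeometric P value with a Bonferroni adjustment. It is not FunSpec's code; the function names and the choice of the upper tail (overlap at least as large as observed) are assumptions.

```python
from math import comb

def snp_density(n_snps, transcribed_length_bp):
    """SNPs between RM and BY per bp of transcribed region."""
    return n_snps / transcribed_length_bp

def hypergeom_enrichment_p(n_genome, n_category, n_list, n_overlap):
    """P(overlap >= n_overlap) when n_list genes are drawn without replacement
    from n_genome genes, of which n_category belong to the functional category."""
    upper = min(n_category, n_list)
    tail = sum(comb(n_category, k) * comb(n_genome - n_category, n_list - k)
               for k in range(n_overlap, upper + 1))
    return tail / comb(n_genome, n_list)   # exact integer arithmetic until the final division

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

Exact binomial coefficients are used instead of floating-point probabilities so that very small tail probabilities do not underflow before the final division.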

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, defined as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects that are present in one of the coexpression modules and have at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data sets for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values of $\log_2(e_{trans})$ for all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
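The pairing scheme described above might be sketched as follows. The data structures (dictionaries mapping each gene to a set of regulators and a set of module identifiers) and all names are hypothetical; the subsequent comparison of trans effects within each group of pairs is not shown.

```python
from itertools import combinations

def classify_gene_pairs(regulators, modules):
    """Split gene pairs into the two groups compared in the text.

    regulators: dict gene -> set of transcription factors (hypothetical input)
    modules:    dict gene -> set of coexpression module ids (hypothetical input)
    """
    shared_tf_only, same_module_only = [], []
    for g1, g2 in combinations(sorted(regulators), 2):
        share_tf = bool(regulators[g1] & regulators[g2])
        share_module = bool(modules[g1] & modules[g2])
        if share_tf and not share_module:
            shared_tf_only.append((g1, g2))      # common regulator, different modules
        elif share_module and not share_tf:
            same_module_only.append((g1, g2))    # same module, no identical regulator
    return shared_tf_only, same_module_only

# Toy example with made-up gene and factor names.
toy_regulators = {"g1": {"TF1"}, "g2": {"TF1"}, "g3": {"TF2"}}
toy_modules = {"g1": {1}, "g2": {2}, "g3": {1}}
tf_pairs, module_pairs = classify_gene_pairs(toy_regulators, toy_modules)
```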

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture; referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture; referred to as “BY”). In each of these comparisons, expression levels are considered “similar” if their difference is not statistically significant or is less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1; CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in the hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
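The classification rules listed above can be written as a short Python sketch. It is an illustration under stated assumptions, not the R code used in the study: the significance flags are taken as given inputs (the exact binomial tests and FDR control are not reproduced), expression levels are assumed to be positive and already normalized, and the label "ambiguous" for genes whose parents differ while the hybrid is similar to both is an assumption, because the text does not describe that case.

```python
def differs(a, b, significant, min_fold=1.25):
    """Two levels differ only if the test is significant AND the fold change exceeds min_fold."""
    return significant and max(a, b) / min(a, b) > min_fold

def inheritance_mode(hybrid, rm, by, sig_hy_rm, sig_hy_by, sig_rm_by, min_fold=1.25):
    """Assign one inheritance mode from three normalized expression levels."""
    d_rm = differs(hybrid, rm, sig_hy_rm, min_fold)
    d_by = differs(hybrid, by, sig_hy_by, min_fold)
    d_parents = differs(rm, by, sig_rm_by, min_fold)
    if not (d_rm or d_by or d_parents):
        return "conserved"
    if d_rm and d_by:
        if hybrid > max(rm, by):
            return "overdominant"
        if hybrid < min(rm, by):
            return "underdominant"
        return "additive"            # hybrid lies between the two parents
    if d_rm:
        return "BY dominant"         # hybrid similar to BY, different from RM
    if d_by:
        return "RM dominant"         # hybrid similar to RM, different from BY
    return "ambiguous"               # assumption: case not covered in the text
```

For example, a gene with levels hybrid = 150, RM = 200, and BY = 100 and three significant tests would be labeled "additive" by this sketch.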

Selective Constraint Analysis

The cis and trans components of within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints on trans and cis changes, we compared the ratio $p/(p + d)$ for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set’s median) against no known interaction partner; 2) sequence divergence, measured as the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower than the data set’s median of 0.0925 vs. higher than 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/(p_{trans} + d_{trans})$) was then compared between categories with strong selective constraint proxies (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
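A minimal Python sketch of the ratio used in this analysis is given below. The function name and the toy values are hypothetical, and the statistical comparison itself (Wilcoxon rank-sum and bootstrapped Kolmogorov–Smirnov tests) is deliberately left out; only the medians of two made-up groups are compared to show how the ratio behaves.

```python
from statistics import median

def polymorphism_fraction(p, d):
    """Return |p| / (|p| + |d|), where p and d are the within-species and
    between-species components (cis or trans) of one gene."""
    p, d = abs(p), abs(d)
    if p + d == 0:
        return None                    # ratio undefined when both components are zero
    return p / (p + d)

# Made-up (p, d) pairs for a weakly and a strongly constrained group.
weak = [polymorphism_fraction(p, d) for p, d in [(1.0, 1.0), (2.0, 1.0), (-3.0, 1.0)]]
strong = [polymorphism_fraction(p, d) for p, d in [(0.5, 1.5), (0.2, -1.8), (1.0, 1.0)]]
shift_toward_polymorphism = median(weak) > median(strong)
```

Values near 1 indicate that the within-species component dominates, and values near 0 indicate that divergence dominates.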

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.

Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.

categories “additive,” “BY dominant,” “RM dominant,” “overdominant,” and “underdominant” (fig. 2). The “additive” category comprised 434 genes (10.2%). Interestingly, 1048 genes (24.7%) were classified as “RM dominant,” but only approximately half as many (445 genes; 10.5%) were classified as “BY dominant.” In total, 458 genes (10.8%) were misexpressed (overdominant or underdominant) in the hybrid; the underdominant expression pattern was found in 294 genes (6.9%) and the overdominant pattern in 164 genes (3.9%).

To investigate how the molecular mechanism of gene expression differences influences the inheritance mode of the expression level, we examined whether an inheritance mode is enriched for genes belonging to a specific expression divergence pattern (table 1). Consistent with previous studies (Lemos et al. 2008; McManus et al. 2010), we found a weak but significant relationship between cis regulation and additive inheritance. The median percent cis for genes with additive inheritance (39.84%) was significantly higher than for those with the other inheritance modes (37.46%) (Wilcoxon rank-sum test, P value = 0.0014).

Additionally, in agreement with previous findings (Lemos et al. 2008), genes with dominant inheritance (either RM or BY dominant) showed a strong enrichment for trans regulatory variation. The median percent trans was significantly higher for genes with dominant inheritance (68.83%) than for the other genes (59.34%) (Wilcoxon rank-sum test, P value < $1.4 \times 10^{-14}$).

Furthermore, we investigated whether genes in the “cis × trans” category disproportionately contributed to misexpression in our within-species cross, as previously described for between-species hybrids (Landry et al. 2005; McManus et al. 2010). Indeed, we found an enrichment for misexpressed genes in the “cis × trans” category (Fisher’s exact test, P value < $5 \times 10^{-9}$; table 2). This relationship remains significant even when both “conserved” and “nondifferential” genes are removed from the analysis (table 2).
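To make the enrichment test concrete, the sketch below evaluates a one-sided Fisher's exact test on the 2 × 2 counts of table 2 (97 misexpressed and 415 other genes in the “cis × trans” category; 361 misexpressed and 3364 other genes in all remaining categories). The function name is hypothetical, and the one-sided upper tail is an assumption, so the result need not equal the P value reported in the text.

```python
from math import comb

def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher's exact test, P(X >= a), for the table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    # Hypergeometric tail, summed with exact integers before one final division.
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, k_max + 1))
    return tail / comb(total, col1)

# Counts from table 2: rows = (cis x trans, other categories),
# columns = (misexpressed, other inheritance modes).
a, b, c, d = 97, 415, 361, 3364
odds_ratio = (a * d) / (b * c)
p_enrichment = fisher_upper_tail(a, b, c, d)
```

The odds ratio above 2 shows the direction of the effect: misexpressed genes are over-represented in the “cis × trans” category.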

Different Constraints on Cis and Trans Regulatory Components

We divided genes into different classes expected to be under relatively weak or strong selective constraint using three criteria: 1) the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω): genes with an ω higher than the median value (~0.09) were classified as less conserved and those with a lower ω as more conserved; 2) connectivity in protein–protein interaction (PPI) networks (Stark et al. 2006; Collins et al. 2007): genes with more than the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners were classified as less constrained; and 3) essentiality: essential genes versus genes with a fitness > 0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio $p/(p + d)$ [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4) (the divergence data were obtained from Tirosh et al. 2009). The ratios $p_{cis}/(p_{cis} + d_{cis})$ and $p_{trans}/(p_{trans} + d_{trans})$ were then each compared between categories with expected strong selective constraints or with expected weak selective constraints for each of the three criteria of selective constraints (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of $p/(p + d)$ in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.37).

FIG. 1. Classification of genes according to cis or trans effects. (a) Scatterplot. Y axis: the cis component [the log2-ratio of reads in the hybrid sample mapped to the RM and BY genomes, $\log_2(e_{cis}) = \log_2(e_{Hy}) = \log_2(RM_{Hy}/BY_{Hy})$]. X axis: the trans component [difference between parental and hybrid log2-transformed ASE ratios, $\log_2(e_{trans}) = \log_2(e_{Co}/e_{cis}) = \log_2(RM_{Co}/BY_{Co}) - \log_2(RM_{Hy}/BY_{Hy})$]. Notations: $RM_{Hy}$, expression level of the RM allele in the hybrid; $BY_{Hy}$, expression level of the BY allele in the hybrid; $e_{Hy}$, ASE ratio in the hybrid; $RM_{Co}$, expression level of the RM allele in the coculture; $BY_{Co}$, expression level of the BY allele in the coculture; and $e_{Co}$, ASE ratio in the coculture. (b) The bar graph shows the number of genes in each cis/trans category: nondifferential, 2077 (49.02%); cis only, 583 (13.76%); trans only, 893 (21.08%); cis + trans, 172 (4.06%); cis × trans, 512 (12.08%).

Table 1. Number of Genes Falling into Different Combinations of Inheritance and cis/trans Categories.

Inheritance mode   Regulatory Effect
                   Nondifferential   Trans Only   Cis Only   Cis + Trans   Cis × Trans   Sum
Conserved          1265              199          165        6             217           1852
RM dominant        371               376          149        66            86            1048
BY dominant        147               116          85         24            73            445
Additive           53                134          141        67            39            434
Overdominant       69                11           23         1             60            164
Underdominant      172               57           20         8             37            294
Sum                2077              893          583        172           512           4237

Misexpressed inheritance modes are slightly underrepresented among essential genes in our within-species data, but this difference is not statistically significant in comparison with all other inheritance categories (supplementary table S7, Supplementary Material online; Fisher’s exact test, P value = 0.35). However, misexpressed genes are significantly less likely to be essential if the comparison is restricted only to genes with a conserved total expression level (supplementary table S7, Supplementary Material online; Fisher’s exact test, P value = 0.022). This tendency for misexpressed genes to be less essential appears to be largely due to an enrichment of nonessential genes among those with underdominant inheritance. Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (table 3; Fisher’s exact test, P value < $9.763 \times 10^{-9}$) and also in comparison with overdominant genes only (table 3; Fisher’s exact test, P value = $1.14 \times 10^{-13}$).

As genes in the “cis × trans” category are more likely to be misexpressed (table 2), we test whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the “cis × trans” category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher’s exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (table 4; P value = 0.0003). In contrast, compared with “cis + trans” genes only, this enrichment for nonessential genes is not significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the X axis and RM on the Y axis), plotted as log2(Hybrid) − log2(BY) against log2(Hybrid) − log2(RM). (b) The bar graph shows the number of genes in each inheritance category: conserved, 1852 (43.71%); RM dominant, 1048 (24.73%); BY dominant, 445 (10.5%); additive, 434 (10.24%); overdominant, 164 (3.87%); underdominant, 294 (6.94%).

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the “Cis × Trans” Category.

Regulatory Effect       Inheritance mode
                        Misexpressed   Other
Cis × trans (a)         97             415
Other categories (a)    361            3364
Cis × trans (b)         97             198
Other categories (b)    120            1158

(a) Misexpressed genes are enriched for genes in the “cis × trans” category; Fisher’s exact test, P value = $4.99 \times 10^{-9}$.
(b) Indicates that both the “conserved” and “nondifferential” genes are removed from the analysis; Fisher’s exact test, P value < $2.2 \times 10^{-16}$.

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1011/2078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1011/1293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes that were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

FIG. 3. The distribution of the ratio $p/(p + d)$ is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c, the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f, essentiality; g–i, connectivity in PPI networks). Each panel is a density plot of trans $p/(p + d)$. Panel titles, in order: low ω, high ω, all genes (a–c); essential genes, nonessential genes, essential and nonessential genes combined (d–f); high number of interaction partners, no known interaction partners in PPI network, both groups combined (g–i). Notations: p, trans polymorphism; d, trans divergence.

Selective Constraints on Gene Regulatory Changes. doi:10.1093/molbev/mst114. MBE.

[Figure 4: nine density plots, panels (a)–(i). x-axis: cis p/(p + d), 0.0 to 1.0; y-axis: density, 0.0 to 3.0. Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined.]

FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

Inheritance Mode
                                                       All Other Inheritance Categories
Essentiality                    Underdominant Only     Total       Overdominant    Nonmisexpressed
Essential                       24 (a, b)              831 (a)     59 (b)          772
Nonessential (fitness > 0.85)   243 (a, b)             2783 (a)    86 (b)          2697

(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (Fisher’s exact test, P value < $9.763 \times 10^{-9}$).
(b) Essential genes are significantly less likely to be underdominant than overdominant (Fisher’s exact test, P value = $1.14 \times 10^{-13}$).

Table 4. Enrichment of the “Cis × Trans” Category in Nonessential Genes.

Columns 2 and 3: regulatory effect within species (RM vs. BY). Columns 4 and 5: regulatory effect between species (S. cerevisiae vs. S. paradoxus).

Essentiality                    Cis × Trans    Other Categories (Cis + Trans)    Cis × Trans    Other Categories (Cis + Trans)
Essential                       91             764 (34)                          166            689 (182)
Nonessential (fitness > 0.85)   392            2634 (123)                        766            2260 (776)

NOTE: Cis–trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher’s exact test, P value = 0.078; between species: Fisher’s exact test, P value = 0.0003). But there is no significant difference between the “cis × trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher’s exact test, P value = 0.49; between species: Fisher’s exact test, P value = 0.51).


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared to highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence and 52% (1174) of the 2263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, while those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next generation sequencing (our within-species data) than with microarrays (Tirosh et al.’s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability to be segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes which are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations in two or more regulatory loci occur which in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis × trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis × trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis × trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis × trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 μg of total RNA from each sample was used to purify for polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned by AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions: one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned by AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and the hybrid experiments) was put in one lane of PE sequencing on Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 μg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40 and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project, “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1, CRAN) with the methodology described by McManus et al. (2010).
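As a worked illustration of the decomposition above, the following Python sketch computes the log2 cis and trans components from allele-specific read counts. It is a minimal sketch and not the pipeline used in this study: the function name, the argument names, and the way the gDNA-based normalization is applied (dividing the coculture ratio by a hypothetical RM/BY gDNA read ratio, `gdna_ratio`) are assumptions made for illustration, and the likelihood ratio test is not reproduced.

```python
import math


def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_ratio=1.0):
    """Return (log2 e_cis, log2 e_trans) for one gene.

    rm_hy, by_hy: reads assigned to the RM and BY alleles in the hybrid.
    rm_co, by_co: reads assigned to the RM and BY alleles in the coculture.
    gdna_ratio:   hypothetical RM/BY gDNA read ratio in the coculture,
                  used here to correct for unequal cell numbers
                  (an assumption about how the normalization is applied).
    """
    e_cis = rm_hy / by_hy                  # ASE ratio in the hybrid
    e_co = (rm_co / by_co) / gdna_ratio    # normalized ratio in the coculture
    e_trans = e_co / e_cis                 # multiplicative model: e_Co = e_trans * e_cis
    return math.log2(e_cis), math.log2(e_trans)


log2_cis, log2_trans = cis_trans_components(200, 100, 400, 100)
```

With 200 RM and 100 BY reads in the hybrid and 400 RM and 100 BY reads in an equal-cell coculture, each component equals one log2 unit, and the two sum to log2(e_Co) = 2, as the additive form of the model requires.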


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
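The hypergeometric enrichment calculation mentioned above can be sketched as follows. This is an illustrative Python version of the general technique, not FunSpec's own code; the function and argument names are hypothetical, and the Bonferroni step simply multiplies the P value by an assumed number of tested categories.

```python
from math import comb


def hypergeom_enrichment_p(n_genome, n_category, n_list, n_overlap):
    """P(X >= n_overlap) when n_list genes are drawn without replacement
    from n_genome genes, of which n_category belong to the category."""
    upper = min(n_category, n_list)
    total = comb(n_genome, n_list)
    tail = sum(
        comb(n_category, i) * comb(n_genome - n_category, n_list - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def bonferroni(p_value, n_categories):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_categories)


# Toy example: 10 genes, 5 in the category, a list of 5 genes all in it.
p_raw = hypergeom_enrichment_p(10, 5, 5, 5)
p_adj = bonferroni(p_raw, 20)
```

Exact integer arithmetic via `math.comb` avoids rounding problems for small lists; for genome-scale counts, a log-space formulation would be a reasonable alternative.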

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects present in one of the coexpression modules and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs which either share a common regulator but belong to different modules or which belong to the same module(s) but are not known to be regulated by any identical transcription factor.

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture; referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture; referred to as “BY”). In each of these comparisons, expression levels are considered as “similar” if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1, CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
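The decision rules listed above can be sketched in Python as follows. This is a simplified illustration, not the R code used in the study: the function names are hypothetical, the significance flags are assumed to come from exact binomial tests performed elsewhere, expression levels are assumed to be positive, and the "ambiguous" label for genes that fit none of the described modes is an addition of this sketch. The helper `per_test_cutoff` shows that requiring a 5% overall error rate across three independent comparisons gives a per-test cutoff of about 1.7%, close to the value quoted in the text; whether the cutoff was derived this way is an assumption.

```python
def per_test_cutoff(overall=0.05, n_tests=3):
    """Per-test cutoff giving an overall error rate across independent tests."""
    return 1 - (1 - overall) ** (1 / n_tests)


def similar(x, y, significant, min_fold=1.25):
    """Two levels are 'similar' if not significantly different
    or if they differ by less than min_fold."""
    fold = max(x, y) / min(x, y)
    return (not significant) or fold < min_fold


def classify_inheritance(hybrid, by, rm, sig_hy_by, sig_hy_rm, sig_by_rm):
    """Assign one inheritance mode from three normalized expression levels
    and the significance flags of the three pairwise comparisons."""
    sim_by = similar(hybrid, by, sig_hy_by)
    sim_rm = similar(hybrid, rm, sig_hy_rm)
    sim_parents = similar(by, rm, sig_by_rm)
    if sim_by and sim_rm and sim_parents:
        return "conserved"
    if sim_by and not sim_rm:
        return "BY dominant"
    if sim_rm and not sim_by:
        return "RM dominant"
    if not sim_by and not sim_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"
    # Hybrid similar to both parents although the parents differ:
    # not covered by the described modes (label is an assumption).
    return "ambiguous"


mode = classify_inheritance(150, 100, 200, True, True, True)
```

In the example call, the hybrid level differs significantly and by more than 1.25-fold from both parents and lies between them, so the gene is classified as additive.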

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio p/(p + d) for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set’s median) against no known interaction partner; 2) sequence divergence: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower than the data set’s median of 0.0925 vs. ≥0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/[p_{trans} + d_{trans}]$) was then compared between categories with strong selective constraint proxies (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
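To make the comparison above concrete, the following Python sketch computes the ratio p/(p + d) per gene and compares two groups with a Wilcoxon rank-sum test. It is an illustration only: the study used R, whereas this version uses a plain normal approximation with midranks for ties and no tie or continuity correction, and all names and the toy values are hypothetical. The bootstrap Kolmogorov–Smirnov test is not reproduced.

```python
import math


def poly_ratio(p, d):
    """p/(p + d) from the absolute cis (or trans) components."""
    p, d = abs(p), abs(d)
    return p / (p + d)


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic of sample x, approximate P value)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1          # average rank of a tied run
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Toy values: a constrained group shifted toward divergence (low ratios)
# and a less constrained group shifted toward polymorphism (high ratios).
constrained = [poly_ratio(p, d) for p, d in [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)]]
relaxed = [poly_ratio(p, d) for p, d in [(0.7, 0.3), (0.8, 0.2), (0.9, 0.1)]]
u_stat, p_value = rank_sum_test(constrained, relaxed)
```

A ratio near 1 indicates that the regulatory difference is mostly polymorphism, and a ratio near 0 that it is mostly divergence, so a shift of the less constrained group toward higher ratios is the pattern described for trans.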

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.


the median (four known interaction partners) were classified as more constrained, whereas those with no known interaction partners as less constrained; and 3) essentiality: essential genes versus genes with a fitness >0.85 in knock-out experiments (Deutschbauer et al. 2005). We used the ratio p/(p + d) [polymorphism/(polymorphism + divergence)] as a measure to detect shifts toward polymorphism or divergence, where p represents the absolute value of either the cis or the trans component in gene expression differences between different strains of the same species, whereas d represents the respective value for the interspecies comparison (figs. 3 and 4) (the divergence data were obtained from Tirosh et al. 2009). The ratios $p_{cis}/(p_{cis} + d_{cis})$ and $p_{trans}/(p_{trans} + d_{trans})$ were then each compared between categories with expected strong selective constraints or with expected weak selective constraints for each of the three criteria of selective constraints (figs. 3 and 4). All three comparisons showed a significant relative abundance of polymorphism in trans for the less constrained category when compared with the more constrained category (Wilcoxon rank-sum test, P values < 0.01 in all three comparisons). In contrast, in cis, none of the three comparisons showed a significant difference between the more and the less constrained category (P values > 0.2). We tested for the equality of the distributions of p/(p + d) in the more constrained and less constrained categories and found significant differences for all three comparisons in trans (bootstrapped Kolmogorov–Smirnov [KS] tests, P values ≤ 0.002 in all three comparisons) but not in cis (bootstrapped KS tests, P values > 0.2 in all three comparisons). In agreement with these observations, we find that essential genes are significantly less likely than nonessential genes to have a significant trans effect (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.018), whereas there is no significant difference in cis (supplementary table S6, Supplementary Material online; Fisher’s exact test, P value = 0.37).

Misexpressed inheritance modes are slightly under-repre-sented among essential genes for our within-species data thisdifference is not statistically significant in comparison withall other inheritance categories (supplementary table S7Supplementary Material online Fisherrsquos exact testP value = 035) However misexpressed genes are significantlyless likely to be essential if the comparison is restricted only togenes with a conserved total expression level (supplementarytable S7 Supplementary Material online Fisherrsquos exact testP value = 0022) This tendency for misexpressed genes to beless essential appears to be largely due to an enrichment ofnonessential genes among those with underdominant inher-itance Underdominant genes are significantly less likely to beessential in comparison with all other inheritance categories(table 3 Fisherrsquos exact test P valuelt 9763 109) and alsoin comparison with overdominant genes only (table 3 Fisherrsquosexact test P value = 114 1013)
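The Fisher's exact tests reported above operate on 2 × 2 tables of gene counts. As an illustration only, a two-sided test can be sketched in plain Python as follows; the function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing all tables that are no more probable than the observed one) is an assumption about how the reported P values were obtained, not a statement of the authors' implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)  # ways to choose the first column

    def table_prob(x):
        # Probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = table_prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    p_sum = sum(p for p in map(table_prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, p_sum)


# Counts as they appear in table 3 (essential vs. nonessential genes,
# underdominant vs. all other inheritance categories).
p_underdominant = fisher_exact_two_sided(24, 831, 243, 2783)
```

Exact integer binomial coefficients are used throughout, so no probability is computed from overflowing floating-point factorials; only the final ratio is converted to a float.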

As genes in the “cis–trans” category are more likely to be misexpressed (table 2), we test whether this category exhibits a similar enrichment for nonessential genes. Indeed, nonessential genes are more likely to fall into the “cis–trans” category when compared with all other categories taken together. This is true not only for our within-species comparison (table 4; Fisher's exact test, P value = 0.078) but also for the between-species data of Tirosh et al. (table 4; P value = 0.0003). In contrast, compared with “cis + trans” genes only, this enrichment for nonessential genes is not

[Figure 2 image not reproduced. Panel (a): scatterplot with log2(Hybrid) − log2(BY) on the x axis and log2(Hybrid) − log2(RM) on the y axis; legend categories: conserved, RM dominant, BY dominant, additive, overdominant, underdominant. Panel (b): bar graph of gene counts per category: conserved 1,852 (43.71%), RM dominant 1,048 (24.73%), BY dominant 445 (10.5%), additive 434 (10.24%), overdominant 164 (3.87%), underdominant 294 (6.94%).]

FIG. 2. Inheritance modes. (a) The scatterplot compares the differences in expression level between the F1 hybrid and each of the parental strains (BY on the x axis and RM on the y axis). (b) The bar graph shows the number of genes in each inheritance category.

Table 2. Enrichment of Genes with Under- or Overdominant Inheritance in the “Cis–Trans” Category.

                            Inheritance mode
Regulatory Effect           Misexpressed    Other
Cis–trans [a]               97              415
Other categories [a]        361             3,364
Cis–trans [b]               97              198
Other categories [b]        120             1,158

[a] Misexpressed genes are enriched for genes in the “cis–trans” category (Fisher's exact test, P value = $4.99 \times 10^{-9}$).
[b] Both the “conserved” and “nondifferential” genes are removed from the analysis (Fisher's exact test, P value < $2.2 \times 10^{-16}$).


significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1,011/2,078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1,011/1,293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1,106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes that were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

[Figure 3 image not reproduced. Nine density plots of trans p/(p + d) (x axis, 0 to 1; y axis, density): (a) low ω, (b) high ω, (c) all genes; (d) essential genes, (e) nonessential genes, (f) essential and nonessential genes combined; (g) high number of interaction partners, (h) no known interaction partners in PPI network, (i) both groups combined.]

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, trans polymorphism; d, trans divergence.


[Figure 4 image not reproduced. Nine density plots of cis p/(p + d) (x axis, 0 to 1; y axis, density): (a) low ω, (b) high ω, (c) all genes; (d) essential genes, (e) nonessential genes, (f) essential and nonessential genes combined; (g) high number of interaction partners, (h) no known interaction partners in PPI network, (i) both groups combined.]

FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

                                 Inheritance Mode
                                 Underdominant    All Other Inheritance Categories
Essentiality                     Only             Total        Overdominant    Nonmisexpressed
Essential                        24 [a,b]         831 [a]      59 [b]          772
Nonessential (fitness > 0.85)    243 [a,b]        2,783 [a]    86 [b]          2,697

[a] Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (Fisher's exact test, P value < $9.763 \times 10^{-9}$).
[b] Essential genes are significantly less likely to be underdominant than overdominant (Fisher's exact test, P value = $1.14 \times 10^{-13}$).

Table 4. Enrichment of the “Cis–Trans” Category in Nonessential Genes.

                                 Regulatory Effect within Species             Regulatory Effect between Species
                                 (RM vs. BY)                                  (S. cerevisiae vs. S. paradoxus)
Essentiality                     Cis–Trans    Other Categories (Cis + Trans)  Cis–Trans    Other Categories (Cis + Trans)
Essential                        91           764 (34)                        166          689 (182)
Nonessential (fitness > 0.85)    392          2,634 (123)                     766          2,260 (776)

NOTE. Cis–trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher's exact test, P value = 0.078; between species: Fisher's exact test, P value = 0.0003). However, there is no significant difference between the “cis–trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher's exact test, P value = 0.49; between species: Fisher's exact test, P value = 0.51).


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, while those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.'s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis–trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis–trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis–trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis–trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998); the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown in standard YPAD medium at 30 °C with shaking at 250 rpm, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the hot acidic phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and the BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and the hybrid experiments) was sequenced in one lane of paired-end (PE) sequencing on an Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on an agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40 and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites relative to our strains (309 from the BY strain and 584 from the RM strain), detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome to those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of each hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1; CRAN) with the methodology described by McManus et al. (2010).
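The multiplicative decomposition described above can be illustrated with a minimal Python sketch. The function name `cis_trans_components` and the `gdna_ratio` parameter are hypothetical, and dividing the coculture ratio by an RM/BY gDNA ratio is only one assumed way of applying the cell-number normalization; the original analysis was performed in R and may differ in detail.

```python
from math import log2


def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_ratio=1.0):
    """Return (log2 e_cis, log2 e_trans) for one gene.

    rm_hy, by_hy: allele-specific read counts in the hybrid.
    rm_co, by_co: strain-specific read counts in the coculture.
    gdna_ratio:   assumed RM/BY cell-number ratio in the coculture,
                  used here to normalize the coculture ratio.
    """
    e_cis = rm_hy / by_hy                # e_cis = e_Hy
    e_co = (rm_co / by_co) / gdna_ratio  # normalized parental ratio
    e_trans = e_co / e_cis               # from e_Co = e_trans * e_cis
    return log2(e_cis), log2(e_trans)


# Toy counts (not from the paper): a twofold cis and a twofold trans effect.
log_cis, log_trans = cis_trans_components(rm_hy=200, by_hy=100,
                                          rm_co=400, by_co=100)
```

On the log scale the two components add up to the total parental difference, which is why a gene with opposing cis and trans effects can show little net expression difference between strains.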


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as the 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
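For readers unfamiliar with hypergeometric enrichment tests, the calculation can be sketched in Python as below. This is a generic textbook form, not FunSpec's code; the function names and argument names are hypothetical, and the Bonferroni step simply multiplies by an assumed number of tested categories.

```python
from math import comb


def hypergeom_enrichment_p(n_genome, n_category, n_list, n_overlap):
    """P(X >= n_overlap) when n_list genes are drawn without replacement
    from n_genome genes, of which n_category belong to the category."""
    upper = min(n_category, n_list)
    denom = comb(n_genome, n_list)
    tail = sum(comb(n_category, k) * comb(n_genome - n_category, n_list - k)
               for k in range(n_overlap, upper + 1))
    return tail / denom


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

A small P value indicates that the overlap between the gene list and the functional category is larger than expected from random sampling of the genome.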

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, defined as groups of genes coexpressed under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10; last accessed May 2013). We only considered genes with significant trans effects that are present in one of the coexpression modules and have at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data sets for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values of $\log_2(e_{trans})$ for all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
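The pairwise comparison described above can be sketched as follows. The text does not state which statistic was computed on each pair, so the absolute difference in $\log_2(e_{trans})$ used here is an assumption, as are the function name `compare_gene_pairs`, the dictionary inputs, and the rule that two genes count as being in the same module if they share at least one module.

```python
from itertools import combinations


def compare_gene_pairs(log_trans, regulators, modules):
    """Collect |log2(e_trans)| differences for the two kinds of gene pairs.

    log_trans:  dict gene -> log2(e_trans)
    regulators: dict gene -> set of transcription factors
    modules:    dict gene -> set of coexpression modules
    """
    shared_tf_diff_module = []
    same_module_no_shared_tf = []
    for g1, g2 in combinations(sorted(log_trans), 2):
        share_tf = bool(regulators[g1] & regulators[g2])
        share_module = bool(modules[g1] & modules[g2])
        diff = abs(log_trans[g1] - log_trans[g2])
        if share_tf and not share_module:
            shared_tf_diff_module.append(diff)
        elif share_module and not share_tf:
            same_module_no_shared_tf.append(diff)
    return shared_tf_diff_module, same_module_no_shared_tf
```

Pairs that share both a regulator and a module, or neither, fall outside the two groups and are skipped.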

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels are considered “similar” if their difference is not statistically significant or is less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1; CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in the hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
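The classification rules above can be sketched as a small decision function. This is an illustrative Python reading of the text, not the authors' R code: the function names are hypothetical, the three significance flags are assumed to come from the exact binomial tests after FDR control, expression levels are assumed to be positive after normalization, and the "ambiguous" fallback is an addition for combinations the text does not cover.

```python
def differs(x, y, significant, min_fold=1.25):
    """Two levels differ only if the test is significant AND the fold
    change is at least min_fold; otherwise they count as similar."""
    fold = max(x, y) / min(x, y)
    return significant and fold >= min_fold


def inheritance_mode(hybrid, by, rm, sig_hy_by, sig_hy_rm, sig_by_rm):
    """Assign one gene to an inheritance category."""
    d_by = differs(hybrid, by, sig_hy_by)
    d_rm = differs(hybrid, rm, sig_hy_rm)
    d_par = differs(by, rm, sig_by_rm)
    if not (d_by or d_rm or d_par):
        return "conserved"
    if d_by and d_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"  # hybrid lies between the parental levels
    if d_rm:
        return "BY dominant"  # similar to BY, different from RM
    if d_by:
        return "RM dominant"  # similar to RM, different from BY
    return "ambiguous"  # parents differ, hybrid similar to both
```

For example, a gene with levels 150 (hybrid), 100 (BY), and 200 (RM) and all three tests significant would be called additive under these rules.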

Selective Constraint Analysis

The cis and trans components of within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints on trans and cis changes, we compared the ratio p/(p + d) for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were carried out, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, four being the data set's median) against no known interaction partner; 2) sequence divergence, measured as the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower than the data set's median of 0.0925 vs. ≥ 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than


0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/(p_{trans} + d_{trans})$) was then compared between categories with proxies for strong selective constraint (low ω, high connectivity, and high essentiality) and categories with expected weak selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
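The ratio and the rank-sum comparison can be illustrated with a short, dependency-free Python sketch. The function names are hypothetical, and the test below uses a plain normal approximation with midranks for ties and no tie or continuity correction, so its P values will only approximate those of the R routines used in the original analysis.

```python
from math import erfc, sqrt


def polymorphism_ratio(p, d):
    """p / (p + d) from the absolute polymorphism and divergence
    components; undefined (ZeroDivisionError) when both are zero."""
    p, d = abs(p), abs(d)
    return p / (p + d)


def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for x, two-sided P value).
    """
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        midrank = (start + end) / 2 + 1  # average rank of the tied run
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = midrank
        start = end + 1
    n1, n2 = len(x), len(y)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, erfc(abs(z) / sqrt(2))
```

In use, `polymorphism_ratio` would be applied gene by gene to the cis or trans components, and the resulting values for the more and less constrained gene sets passed to `rank_sum_test`.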

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM Tzeng JN Sung HM 2012 Effects of cis and trans regulatoryvariations on the expression divergence of heat shock responsegenes between yeast strains Gene 506(1)93ndash97

Magwene PM Kayikci O Granek JA Reininga JM Scholl Z Murray D2011 Outcrossing mitotic recombination and life-history trade-offsshape genome evolution in Saccharomyces cerevisiae Proc NatlAcad Sci U S A 108(5)1987ndash1992

McManus CJ Coolon JD Duff MO Eipper-Mains J Graveley BRWittkopp PJ 2010 Regulatory divergence in Drosophila revealedby mRNA-seq Genome Res 20(6)816ndash825

Mortimer RK Romano P Suzzi G Polsinelli M 1994 Genome renewal anew phenomenon revealed from a genetic study of 43 strains ofSaccharomyces cerevisiae derived from natural fermentation ofgrape musts Yeast 10(12)1543ndash1552

Muller CA Nieduszynski CA 2012 Conservation of replication timingreveals global and local regulation of replication origin activityGenome Res 22(10)1953ndash1962

Ohno S 1972 Argument for genetic simplicity of man and other mam-mals J Hum Evol 1(6)651ndash662

Robinson MD Grigull J Mohammad N Hughes TR 2002 FunSpec aweb-based cluster interpreter for yeast BMC Bioinformatics 335

Rockman MV Kruglyak L 2006 Genetics of global gene expression NatRev Genet 7(11)862ndash872

Ronald J Akey JM 2007 The evolution of gene expression QTL inSaccharomyces cerevisiae PLoS One 2(7)e678

Rosin D Hornung G Tirosh I Gispan A Barkai N 2012 Promoter nu-cleosome organization shapes the evolution of gene expressionPLoS Genet 8(3)e1002579


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the Matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.


significant (P value = 0.49 within species and P value = 0.51 between species; table 4).

Discussion

Cis and trans regulatory factors differ in the way in which they influence gene expression levels and their inheritance patterns. Thus, they may be subjected to different selection pressures, and this should be reflected in the way gene regulatory networks evolve. It remains difficult to tease apart these different gene regulatory mechanisms and their evolutionary pathways. However, an elegant approach is the use of hybrid experiments and the comparison of cis and trans effects in crosses between and within species (Wittkopp et al. 2004), although some of the underlying assumptions might lead to an overestimation of the relative contribution of cis changes to gene expression differences (Takahasi et al. 2011).

Our new data set provides more power to detect gene expression differences than our previous data (Emerson et al. 2010). Indeed, only 48.7% (1,011/2,078) of the genes with a significant expression difference in coculture in the new data set were also significantly different in the old data set, but 78.2% (1,011/1,293) of the genes with a significant expression difference in the old data set are also significantly different in the new data set (supplementary table S8, Supplementary Material online). For ASE differences in the hybrid, the respective numbers are 27.6% (305/1,106) and 62.8% (305/486) (supplementary table S8, Supplementary Material online). The relatively small number of genes that were differentially expressed in the old data set but not in the new one (282 in coculture and 181 in the hybrid) could be due to variation between biological or technical replicates and stochastic fluctuations (Busby et al. 2011).

Our new analysis examines the different effects of functional constraints on the cis and trans components of gene expression differences. In general, highly deleterious mutations would be quickly removed from the population and are unlikely to be observed either as differences between strains or between species, whereas slightly deleterious mutations may be found in polymorphisms. The chance for slightly deleterious mutations to be observed as trans polymorphisms could be high because of frequent trans changes (Wittkopp 2005; Landry et al. 2007) and the insufficient evolutionary time for selection to remove the mutations from the population. However, they are unlikely to contribute significantly

[Figure 3: nine density plots. x axis: trans p/(p + d), 0.0–1.0; y axis: density, 0.0–1.5. Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined.]

FIG. 3. The distribution of the ratio p/(p + d) is significantly different between constrained and less constrained categories for all three classification systems in trans (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, trans polymorphism; d, trans divergence.


[Figure 4: nine density plots. x axis: cis p/(p + d), 0.0–1.0; y axis: density, 0.0–3.0. Panels: (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined.]

FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

| Essentiality | Underdominant only | All other inheritance categories: total | All other: overdominant | All other: nonmisexpressed |
| --- | --- | --- | --- | --- |
| Essential | 24 (a, b) | 831 (a) | 59 (b) | 772 |
| Nonessential (fitness > 0.85) | 243 (a, b) | 2,783 (a) | 86 (b) | 2,697 |

(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories; Fisher's exact test, P value < $9.763 \times 10^{-9}$.

(b) Essential genes are significantly less likely to be underdominant than overdominant; Fisher's exact test, P value = $1.14 \times 10^{-13}$.

Table 4. Enrichment of the “Cis × Trans” Category in Nonessential Genes.

| Essentiality | Within species (RM vs. BY): cis × trans | Within species: other categories (cis + trans) | Between species (S. cerevisiae vs. S. paradoxus): cis × trans | Between species: other categories (cis + trans) |
| --- | --- | --- | --- | --- |
| Essential | 91 | 764 (34) | 166 | 689 (182) |
| Nonessential (fitness > 0.85) | 392 | 2,634 (123) | 766 | 2,260 (776) |

NOTE. Cis–trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher's exact test, P value = 0.078; between species: Fisher's exact test, P value = 0.0003). But there is no significant difference between the “cis × trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher's exact test, P value = 0.49; between species: Fisher's exact test, P value = 0.51).
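The Fisher's exact tests quoted in the table notes can be illustrated with a small sketch. This is not the authors' code (the analyses were run in R); it is a minimal standard-library implementation, and the function name `fisher_exact_two_sided` is a hypothetical helper. The two-sided rule used here, summing all tables no more probable than the observed one, is an assumption about which variant was applied, so the values are expected to land near, but not necessarily exactly on, the published ones.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Rows: essential, nonessential; columns: cis x trans, all other categories
within = fisher_exact_two_sided(91, 764, 392, 2634)
between = fisher_exact_two_sided(166, 689, 766, 2260)
print(f"within species:  P = {within:.3g}")
print(f"between species: P = {between:.3g}")
```

The counts are taken directly from table 4; swapping in the underdominant and overdominant counts from table 3 gives the corresponding comparison there.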


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, whereas those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.'s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis × trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis × trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis × trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis × trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 μg of total RNA from each sample was used to purify polyA RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid samples) was run in one lane of PE sequencing on an Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 μg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on an agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.
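The idea of genome-specific read assignment can be sketched as follows. This is an illustration only, not ASAP's actual logic or interface: the function `assign_read`, its inputs, and the tie-breaking rule (assign a read to the genome with fewer mismatches, and discard reads that match both equally well) are assumptions made for the example.

```python
from collections import Counter


def assign_read(mm_by, mm_rm, max_mismatches=2):
    """Classify one read from its mismatch counts against each genome.

    mm_by / mm_rm: mismatches of the best alignment to the BY / RM
    reference, or None when the read did not align to that genome.
    """
    ok_by = mm_by is not None and mm_by <= max_mismatches
    ok_rm = mm_rm is not None and mm_rm <= max_mismatches
    if not ok_by and not ok_rm:
        return "unmapped"
    if ok_by and not ok_rm:
        return "BY"
    if ok_rm and not ok_by:
        return "RM"
    if mm_by == mm_rm:
        return "ambiguous"  # equally good in both genomes: uninformative
    return "BY" if mm_by < mm_rm else "RM"


# Hypothetical (BY mismatches, RM mismatches) pairs for five reads
reads = [(0, 1), (0, 0), (None, 2), (3, None), (2, 0)]
counts = Counter(assign_read(by, rm) for by, rm in reads)
print(dict(counts))
```

Only reads labeled "BY" or "RM" would contribute to allele-specific expression counts under this scheme; reads covering no sequence difference between the strains carry no allelic information.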

The BY reference genome was downloaded from the SGD project, “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome to those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1, CRAN) with the methodology described by McManus et al. (2010).
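The decomposition above can be written out for a single gene as a short sketch. It is not the authors' R code. The function name `cis_trans_log2` and the read counts are hypothetical, and dividing the coculture ratio by the gDNA-based RM/BY ratio is an assumed form of the normalization parameter, whose exact definition is given in Emerson et al. (2010) rather than here. The likelihood ratio test is omitted.

```python
from math import log2


def cis_trans_log2(rm_hy, by_hy, rm_co, by_co, gdna_rm_over_by=1.0):
    """Return (log2 e_cis, log2 e_trans) for one gene.

    rm_hy, by_hy: allele-specific read counts in the hybrid
    rm_co, by_co: strain-specific read counts in the coculture
    gdna_rm_over_by: RM/BY cell-number ratio estimated from gDNA (assumed form)
    """
    e_cis = rm_hy / by_hy                     # ASE ratio in the hybrid
    e_co = (rm_co / by_co) / gdna_rm_over_by  # coculture ratio, cell-number corrected
    e_trans = e_co / e_cis                    # what remains after removing cis
    return log2(e_cis), log2(e_trans)


# Hypothetical counts: a 2-fold cis effect and a 2-fold trans effect
log_cis, log_trans = cis_trans_log2(
    rm_hy=200, by_hy=100, rm_co=440, by_co=100, gdna_rm_over_by=1.1
)
print(f"log2(e_cis) = {log_cis:.3f}, log2(e_trans) = {log_trans:.3f}")
```

On the log scale the two components add up to the total parental difference, which is why a gene with opposite-sign cis and trans components can show little or no net difference between the parents.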


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
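The hypergeometric enrichment calculation described above can be sketched with the standard library. This is an illustration of the statistic, not FunSpec's own code; the function name `hypergeom_enrichment_p`, the gene counts, and the number of tested categories are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, category_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from genome_size genes, category_size of which are in the category."""
    upper = min(category_size, list_size)
    favorable = sum(
        comb(category_size, k) * comb(genome_size - category_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(genome_size, list_size)


# Hypothetical example: 12 of 200 listed genes fall in a 100-gene category
p_raw = hypergeom_enrichment_p(6000, 100, 200, 12)
n_categories = 500  # assumed number of categories tested
p_bonferroni = min(1.0, p_raw * n_categories)
print(f"raw P = {p_raw:.3g}, Bonferroni-adjusted P = {p_bonferroni:.3g}")
```

The Bonferroni step simply multiplies each raw P value by the number of categories tested and caps the result at 1.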

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects present in one of the coexpression modules and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
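The two groups of gene pairs described above can be assembled as in the sketch below. The input dictionaries, gene names, and the function `classify_pairs` are hypothetical; in particular, treating "same module(s)" as "shares at least one module" is an assumption, and the source does not state how pairs were handled beyond the description in the text.

```python
from itertools import combinations


def classify_pairs(regulators, modules):
    """Split gene pairs into the two groups compared in the text.

    regulators: gene -> set of transcription factors regulating it
    modules:    gene -> set of coexpression modules containing it
    """
    shared_tf_diff_module = []
    same_module_no_shared_tf = []
    for g1, g2 in combinations(sorted(regulators), 2):
        share_tf = bool(regulators[g1] & regulators[g2])
        share_module = bool(modules[g1] & modules[g2])
        if share_tf and not share_module:
            shared_tf_diff_module.append((g1, g2))
        elif share_module and not share_tf:
            same_module_no_shared_tf.append((g1, g2))
    return shared_tf_diff_module, same_module_no_shared_tf


# Hypothetical toy annotation for four genes
regulators = {"G1": {"TF_A"}, "G2": {"TF_A"}, "G3": {"TF_B"}, "G4": {"TF_B", "TF_C"}}
modules = {"G1": {1}, "G2": {2}, "G3": {1}, "G4": {2}}
tf_pairs, module_pairs = classify_pairs(regulators, modules)
print("shared regulator, different modules:", tf_pairs)
print("same module, no shared regulator:  ", module_pairs)
```

The log2(e_trans) values of the two genes in each pair can then be compared within each group, for example by correlating them.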

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels are considered as “similar” if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1, CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
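The classification rules listed above can be expressed as a small decision function. This is a sketch under stated assumptions, not the authors' R implementation: the function names are hypothetical, the significance flags are taken as precomputed inputs (the exact binomial test is not reproduced), and the "ambiguous" label for genes whose parents differ while the hybrid is similar to both is an assumption, since the text does not say how that case is handled. The first line shows one way a per-comparison cutoff of about 1.7% follows from a 5% chance of at least one false positive across three comparisons; it gives a value close to, though not identical with, the stated 1.696%.

```python
# Per-comparison cutoff so that three independent comparisons give ~5% overall
per_test_alpha = 1 - (1 - 0.05) ** (1 / 3)


def differs(x, y, significant, fold=1.25):
    """True when a comparison is significant AND at least `fold` apart."""
    return significant and max(x, y) >= fold * min(x, y)


def classify_inheritance(hybrid, by, rm, sig_by_rm, sig_by_hy, sig_rm_hy):
    """Assign one gene to an inheritance mode from normalized expression levels."""
    parents_differ = differs(by, rm, sig_by_rm)
    hy_vs_by = differs(hybrid, by, sig_by_hy)
    hy_vs_rm = differs(hybrid, rm, sig_rm_hy)
    if not (parents_differ or hy_vs_by or hy_vs_rm):
        return "conserved"
    if hy_vs_by and hy_vs_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"          # hybrid lies between the parents
    if hy_vs_rm:
        return "BY dominant"       # hybrid similar to BY, different from RM
    if hy_vs_by:
        return "RM dominant"       # hybrid similar to RM, different from BY
    return "ambiguous"             # assumed label; not defined in the text


print(round(per_test_alpha, 5))
print(classify_inheritance(hybrid=300, by=100, rm=200,
                           sig_by_rm=True, sig_by_hy=True, sig_rm_hy=True))
```

Writing the fold check as a multiplication rather than a division keeps the function safe for genes with zero reads in one sample.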

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio $p/(p + d)$ for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set's median) against no known interaction partner; 2) sequence divergence: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution (ω) between S. cerevisiae and S. paradoxus (lower than vs. higher than the data set's median of 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/[p_{trans} + d_{trans}]$) was then compared between categories with strong selective constraint proxies (low ω, high connectivity, and high essentiality) and weak expected selective constraint (high ω, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
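The comparison described above can be sketched in two steps: compute p/(p + d) per gene, then compare the two constraint groups with a rank-sum test. This is a simplified stand-in for the R analysis, not a reproduction of it. The function names and the toy values are hypothetical, p and d are assumed to be nonnegative effect magnitudes (for example, absolute log2 components) so that the ratio lies between 0 and 1, and the test uses a plain normal approximation with midranks but without tie or continuity corrections.

```python
from math import erfc, sqrt


def poly_fraction(p, d):
    """p / (p + d): share of the regulatory effect seen within species."""
    return p / (p + d)


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic for x, P value). x and y are lists of numbers.
    """
    values = sorted(x + y)
    rank = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank[values[i]] = (i + 1 + j) / 2  # midrank of 1-based positions i+1..j
        i = j
    n1, n2 = len(x), len(y)
    u = sum(rank[v] for v in x) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical (p, d) magnitudes for strongly and weakly constrained genes
constrained = [poly_fraction(p, d) for p, d in [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)]]
relaxed = [poly_fraction(p, d) for p, d in [(0.4, 0.6), (0.5, 0.5), (0.6, 0.4)]]
u_stat, p_value = rank_sum_test(constrained, relaxed)
print(f"U = {u_stat}, P = {p_value:.3f}")
```

A shift of the weakly constrained group toward larger p/(p + d) values corresponds to the excess of polymorphism over divergence discussed for the trans component.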

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell. 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast. 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature. 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science. 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics. 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science. 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science. 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science. 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics. 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene. 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast. 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics. 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One. 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics. 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science. 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature. 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics. 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.


FIG. 4. In cis, the distribution of the ratio p/(p + d) is not significantly different between constrained and less constrained categories (a–c: the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution [ω]; d–f: essentiality; g–i: connectivity in PPI networks). Notations: p, cis polymorphism; d, cis divergence. Panels (x axis: cis p/(p + d), from 0.0 to 1.0; y axis: density): (a) low ω; (b) high ω; (c) all genes; (d) essential genes; (e) nonessential genes; (f) essential and nonessential genes combined; (g) high number of interaction partners; (h) no known interaction partners in PPI network; (i) both groups combined.

Table 3. Proportions of Essential Genes in the Underdominant Inheritance Category and in the Other Inheritance Categories.

Essentiality                     Underdominant Only    All Other Inheritance Categories
                                                       Total       Overdominant    Nonmisexpressed
Essential                        24(a,b)               831(a)      59(b)           772
Nonessential (fitness > 0.85)    243(a,b)              2,783(a)    86(b)           2,697

(a) Underdominant genes are significantly less likely to be essential in comparison with all other inheritance categories (Fisher’s exact test, P value < 9.763 × 10^-9).
(b) Essential genes are significantly less likely to be underdominant than overdominant (Fisher’s exact test, P value = 1.14 × 10^-13).

Table 4. Enrichment of the “Cis × Trans” Category in Nonessential Genes.

                                 Regulatory Effect within Species              Regulatory Effect between Species
                                 (RM vs. BY)                                   (S. cerevisiae vs. S. paradoxus)
Essentiality                     Cis × Trans    Other Categories (Cis + Trans)    Cis × Trans    Other Categories (Cis + Trans)
Essential                        91             764 (34)                          166            689 (182)
Nonessential (fitness > 0.85)    392            2,634 (123)                       766            2,260 (776)

NOTE. Cis × trans genes are enriched for nonessential genes compared with genes in all other categories (within species: Fisher’s exact test, P value = 0.078; between species: Fisher’s exact test, P value = 0.0003). But there is no significant difference between the “cis × trans” and “cis + trans” (numbers in brackets) categories in the proportion of essential genes (within species: Fisher’s exact test, P value = 0.49; between species: Fisher’s exact test, P value = 0.51).
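The 2 × 2 comparisons in tables 3 and 4 rely on Fisher's exact test. As an illustration only, the following Python sketch computes a two-sided P value by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name is hypothetical, and the source does not state which sidedness or software the authors used for these tests, so this is an assumption rather than a reproduction of their analysis.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the probabilities of every table with the same margins whose
    probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Within-species counts from table 4: essential vs. nonessential genes,
# "cis x trans" vs. all other categories.
p_within = fisher_exact_two_sided(91, 764, 392, 2634)
```

Exact integer binomial coefficients are used so that the large gene counts do not overflow; the small relative tolerance guards against floating-point ties between equally likely tables.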


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, whereas those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of “BY dominant” genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next generation sequencing (our within-species data) than with microarrays (Tirosh et al.’s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in “natural” S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the “cis × trans” category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the “cis × trans” category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either “cis × trans” or “cis + trans.” The fact that the “cis + trans” category is not significantly different from the “cis × trans” category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as “BY,” is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as “RM,” is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two amplified from the coculture experiment and two from the hybrid experiment) was put in one lane of paired-end (PE) sequencing on the Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome to those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1, CRAN) with the methodology described by McManus et al. (2010).
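The decomposition above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: the function names and the `gdna_ratio` parameter (the RM/BY cell-number ratio estimated from gDNA reads) are hypothetical, and the cis test shown is a plain binomial likelihood ratio (G) test of equal allelic counts in the hybrid, which may differ from the likelihood model of Emerson et al. (2010). The corresponding test for $e_{trans}$ is not reproduced here.

```python
import math

def decompose(rm_hy, by_hy, rm_co, by_co, gdna_ratio=1.0):
    """Return (log2 e_cis, log2 e_trans) from allele-specific read counts.

    gdna_ratio is an assumed normalization: the RM/BY cell-number ratio
    in the coculture, estimated from genomic DNA reads.
    """
    e_cis = rm_hy / by_hy                    # ASE ratio in the hybrid
    e_co = (rm_co / by_co) / gdna_ratio      # normalized ratio in the coculture
    e_trans = e_co / e_cis                   # multiplicative model
    return math.log2(e_cis), math.log2(e_trans)

def cis_lrt_p_value(rm_hy, by_hy):
    """Likelihood ratio test of e_cis = 1 with one degree of freedom.

    Uses the G statistic for a binomial with success probability 0.5 and
    the identity P(chi2_1 > g) = erfc(sqrt(g / 2)).
    """
    n = rm_hy + by_hy
    if n == 0:
        return 1.0
    g = 0.0
    for observed in (rm_hy, by_hy):
        if observed > 0:
            g += 2.0 * observed * math.log(observed / (n / 2.0))
    return math.erfc(math.sqrt(max(g, 0.0) / 2.0))

# Toy counts: the RM allele is twofold higher in the hybrid, fourfold in coculture.
log2_cis, log2_trans = decompose(rm_hy=200, by_hy=100, rm_co=400, by_co=100)
p_cis = cis_lrt_p_value(200, 100)
```

On a log2 scale the two components add up to the parental difference, which mirrors the relation $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$ in the text.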


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
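The two calculations in this paragraph can be written down as a short Python sketch. It illustrates the SNP density definition and a generic hypergeometric enrichment test with Bonferroni correction; it is not the FunSpec implementation, and all function and parameter names are hypothetical.

```python
from math import comb

def snp_density(n_snps, transcribed_length_bp):
    """SNPs between RM and BY per bp of transcribed region."""
    return n_snps / transcribed_length_bp

def hypergeom_enrichment(k, n, K, N):
    """P(X >= k) for k category genes in a list of n genes.

    K is the number of genes in the functional category and N the total
    number of annotated genes (the background).
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Toy numbers only: 12 of 50 listed genes fall in a category of 200 genes
# out of 6,000 background genes, with 500 categories tested.
p_raw = hypergeom_enrichment(k=12, n=50, K=200, N=6000)
p_adj = bonferroni(p_raw, n_tests=500)
```

Exact integer arithmetic keeps the tail probability accurate even for a genome-sized background.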

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10, last accessed May 2013). We only considered genes with significant trans effects, present in one of the coexpression modules, and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
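The pairing step can be illustrated with a small Python sketch. Several details are assumptions made for illustration and are not stated in the text: the input layout, the function name, the reading of "same module(s)" as "at least one module in common," and the use of the absolute difference in $\log_2(e_{trans})$ as the quantity compared between gene pairs.

```python
from itertools import combinations

def build_gene_pairs(genes):
    """Split gene pairs into the two groups described in the text.

    genes maps a gene name to a dict with keys "tfs" (set of regulators),
    "modules" (set of module identifiers), and "log2_trans" (float).
    Returns (shared_tf_pairs, shared_module_pairs); each entry is
    (gene1, gene2, absolute difference in log2 e_trans).
    """
    shared_tf_pairs, shared_module_pairs = [], []
    for g1, g2 in combinations(sorted(genes), 2):
        a, b = genes[g1], genes[g2]
        common_tfs = a["tfs"] & b["tfs"]
        common_modules = a["modules"] & b["modules"]
        diff = abs(a["log2_trans"] - b["log2_trans"])
        if common_tfs and not common_modules:
            shared_tf_pairs.append((g1, g2, diff))       # same regulator, different modules
        elif common_modules and not common_tfs:
            shared_module_pairs.append((g1, g2, diff))   # same module, no shared regulator
    return shared_tf_pairs, shared_module_pairs

# Toy input with hypothetical gene, regulator, and module names.
toy = {
    "geneA": {"tfs": {"TF1"}, "modules": {"m1"}, "log2_trans": 0.8},
    "geneB": {"tfs": {"TF1"}, "modules": {"m2"}, "log2_trans": 0.2},
    "geneC": {"tfs": {"TF2"}, "modules": {"m1"}, "log2_trans": 0.5},
}
tf_pairs, module_pairs = build_gene_pairs(toy)
```

Pairs that share both a regulator and a module, or neither, fall outside the two groups and are skipped.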

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”); 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”); and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels are considered as “similar” if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1, CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
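The categories above can be expressed as a small decision function. The Python sketch below is an illustration under stated assumptions, not the authors' R code: the function name and arguments are hypothetical, the significance flags are assumed to come from the exact binomial tests after FDR control, expression levels are assumed to be positive normalized values, and the extra "ambiguous" label for genes that fit none of the listed definitions is an addition made here.

```python
def classify_inheritance(hybrid, rm, by, sig_hy_rm, sig_hy_by, sig_rm_by, min_fold=1.25):
    """Assign an inheritance mode from three expression levels.

    sig_* are booleans stating whether each pairwise comparison is
    statistically significant. Two levels count as different only if the
    test is significant and the fold difference is at least min_fold.
    """
    def differs(x, y, significant):
        fold = max(x, y) / min(x, y)
        return significant and fold >= min_fold

    hy_rm = differs(hybrid, rm, sig_hy_rm)
    hy_by = differs(hybrid, by, sig_hy_by)
    rm_by = differs(rm, by, sig_rm_by)

    if not (hy_rm or hy_by or rm_by):
        return "conserved"
    if hy_rm and hy_by:
        if hybrid > max(rm, by):
            return "overdominant"
        if hybrid < min(rm, by):
            return "underdominant"
        return "additive"          # hybrid lies between the two parents
    if hy_rm:
        return "BY dominant"       # hybrid similar to BY, different from RM
    if hy_by:
        return "RM dominant"       # hybrid similar to RM, different from BY
    return "ambiguous"             # parents differ, hybrid similar to both

# Toy example: the hybrid exceeds both parents.
mode = classify_inheritance(300, 200, 100, True, True, True)
```

Checking the hybrid against both parents first keeps the over- and underdominant cases from being mistaken for dominance of either parent.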

Selective Constraint Analysis

The cis and trans components in within-species differences(labeled here as pcis and ptrans to represent polymorphism)were derived from our own data set as previously described(Emerson et al 2010) The cis and trans components for thedivergence between S cerevisiae and S paradoxus (labeledhere as dcis and dtrans to represent divergence) were obtainedfrom Tirosh et al (2009) To analyze different constraints intrans and cis changes we compared the ratio p(p + d) forthe genes in two different categories of expected weak orstrong selective constraint Three such comparisons were ex-ecuted with gene groupings according to three different char-acteristics 1) the connectivity in PPI networks (Stark et al2006 Collins et al 2007) contrasting ldquohigh connectivityrdquo(more than four known interaction partners which is thedata setrsquos median) against no known interaction partner 2)sequence divergence the ratio of the rate of nonsynonymoussubstitution to the rate of synonymous substitution () be-tween S cerevisiae and S paradoxus (lower than the data setrsquosmedian of 00925 vs 00925) and 3) essentiality (essentialgenes which are lethal when knocked out vs nonessentialgenes defined as those genes exhibiting a fitness of more than

2131

Selective Constraints on Gene Regulatory Changes doi101093molbevmst114 MBE at A

cademia SinicaL

ife Science Library on A

ugust 22 2013httpm

beoxfordjournalsorgD

ownloaded from

085 in knock-out experiments) (Deutschbauer et al 2005)The ratio pcis(pcis + dcis) (or ptrans[ptrans + dtrans]) was thencompared between categories with strong selective constraintproxies (low high connectivity and high essentiality) andweak expected selective constraint (high no known inter-action partner and low essentiality) using the Wilcoxon rank-sum test Distributions of the values for strongly constrainedand weakly constrained categories were tested for equalityusing the bootstrap version of the KolmogorovndashSmirnov testin R (Sekhon 2011)

Supplementary MaterialSupplementary tables S1ndashS9 and figures S1ndashS3 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)

Acknowledgments

This work was supported by Academia Sinica Taiwan andNational Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2) The authors thank Fu-Jung Yu for experimental assistance Daryi Wang for providingunpublished data and Krishna B Swamy and Nathan Barhamfor thoughtful discussion of the manuscript They also thanktwo anonymous reviewers for constructive criticisms andvaluable suggestions

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.


to fixed expression differences between species. Thus, a trend toward polymorphism in less constrained categories (representing slightly deleterious mutations) can be expected in trans if different selective constraints are important in the evolution of trans regulation. Conversely, if gene expression evolution in trans is primarily neutral and mutation-driven, no such trend is expected. Our data show a clear shift toward polymorphism for less constrained categories compared with highly constrained categories in trans (fig. 3) and a significant difference between essential and nonessential genes in the probability of having a significant trans component (supplementary table S6, Supplementary Material online). The positive association of divergence with polymorphism in trans may imply that some of the trans mutations that contribute to within-species differences are neutral or nearly neutral.

In cis, the earlier-mentioned trends are largely absent. The association between polymorphism and divergence is much weaker in cis than in trans (supplementary table S9, Supplementary Material online). As in previous studies (Wittkopp et al. 2008a; Emerson et al. 2010), we found a stronger impact of cis regulatory divergence on gene expression differences between species than on those within species, in comparison with the trans effect. Indeed, for the cis effect, 56% (530) of the 940 genes showing significant polymorphism show significant divergence, and 52% (1,174) of the 2,263 genes showing nonsignificant polymorphism show significant divergence, whereas for the trans effect, the corresponding proportions are only 29% and 21%. These observations suggest that positive selection has contributed to cis expression divergence.

Thus, our data are compatible with the view that trans regulatory factors are subjected to stronger selective constraint than cis regulatory factors. As the mutational target size in trans is larger than that in cis, trans differences contribute relatively more to gene expression differences within species. However, as trans changes are subjected to stronger selective constraint, they contribute less to between-species divergence than cis changes (supplementary table S9, Supplementary Material online).

The fact that changes in cis and trans regulators impact gene expression in different ways is reflected in different inheritance patterns. In accordance with previous studies, genes with cis regulatory variants tend to show an additive inheritance pattern, while those with trans regulatory differences are enriched for dominant inheritance of expression level (Lemos et al. 2008; McManus et al. 2010). The lower number of "BY dominant" genes might be due to fixation of rare recessive alleles in the BY laboratory strain. Genes with antagonistic cis–trans interactions are more likely to be misexpressed in hybrids, in agreement with previous findings (Lemos et al. 2008; McManus et al. 2010). The percentage of misexpressed genes (10.8%) in our study is higher than that found in an interspecies hybrid between S. cerevisiae and S. paradoxus (2–8%) (Tirosh et al. 2009). These values are difficult to compare because of the differing sensitivity of the experimental tools used: detecting subtle differences in gene expression is easier with next-generation sequencing (our within-species data) than with microarrays (Tirosh et al.'s between-species data). Otherwise, this finding might be surprising, as misexpression is expected to contribute to hybrid incompatibilities and speciation. Essential genes have a significantly lower probability of segregating for mutations exhibiting underdominant inheritance (table 3). If we assume that the allelic differences between the two strains in the majority of their polymorphisms are present in "natural" S. cerevisiae populations, especially in human-associated environments (e.g., vineyards, as for RM), and could thus be found in heterozygotes (Magwene et al. 2011), this result may be expected, as genes that are required for reproduction and survival of the organism tend to be under stabilizing selection. If mutations occur in two or more regulatory loci that in combination lead to significantly lower expression levels of these essential genes, they will be removed from the population quickly.

Our data showed that cis–trans genes are enriched for misexpressed genes. Although more of these misexpressed genes are overdominant (60) than underdominant (37; table 1), we might also expect essential genes to be underrepresented in the "cis × trans" category. This is true not only for our within-species comparison but also for the between-species data. A possible explanation is that most of the regulatory differences in the "cis × trans" category are due to the two mutations having occurred in the same lineage, with the first mutation being slightly deleterious and the second (partially) compensating. For the first mutation to become fixed in the population, it cannot have a strongly deleterious effect. Alternatively, this observation might just reflect that essential genes are less likely to accumulate several regulatory mutations over time, which is a necessary condition for a gene to be classified as either "cis × trans" or "cis + trans." The fact that the "cis + trans" category is not significantly different from the "cis × trans" category (table 4) and is equally enriched for nonessential genes in the interspecific comparison supports this simpler hypothesis.

Materials and Methods

Yeast Strains and Culturing

Two yeast strains were used: one, designated as "BY," is a haploid laboratory strain officially named BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and is a descendant of S288C (Brachmann et al. 1998), and the other, designated as "RM," is formally named either RM11-1a (MATa lys2Δ0 ura3Δ0 ho::KAN) or RM11-1α (MATα lys2Δ0 ura3Δ0 ho::KAN), both of which are haploid and derived from Bb32(3), a natural isolate described previously (Mortimer et al. 1994; Brem et al. 2002). The hybrid of BY (MATa) and RM (MATα), constructed in our laboratory, is named WL201.

Two culture types were prepared: coculture and hybrid. The coculture is a mixture of approximately equal numbers of BY (MATa) cells and RM (MATa) cells, whereas the hybrid strain was derived from a cross between BY (MATa) and RM (MATα). All strains were grown on the standard YPAD medium at 30 °C with 250 rpm shaking, as described previously (Emerson et al. 2010).


Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 μg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned by AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned by AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries each, amplified from the coculture and the hybrid experiments) was run in one lane of paired-end (PE) sequencing on an Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 μg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP ("Allele-Specific Alignment Pipeline") was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.
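The idea of genome-specific read assignment can be illustrated with a small sketch. It is not ASAP's actual implementation: the function name, the "common" and "unmapped" labels, and the use of a per-read mismatch count (rather than Bowtie's seed-based rule) are assumptions made only for this example.

```python
def assign_read(mismatches_by, mismatches_rm, max_mismatches=2):
    """Assign one read to "BY", "RM", "common", or "unmapped".

    mismatches_by / mismatches_rm: mismatch count of the best alignment
    against each reference, or None if the read did not align at all.
    Hypothetical rule: a read is genome-specific when it aligns to only
    one reference, or aligns to one with strictly fewer mismatches.
    """
    by_ok = mismatches_by is not None and mismatches_by <= max_mismatches
    rm_ok = mismatches_rm is not None and mismatches_rm <= max_mismatches
    if by_ok and rm_ok:
        if mismatches_by < mismatches_rm:
            return "BY"
        if mismatches_rm < mismatches_by:
            return "RM"
        return "common"  # equally good on both genomes, not informative
    if by_ok:
        return "BY"
    if rm_ok:
        return "RM"
    return "unmapped"

# Toy reads: (mismatches vs. BY, mismatches vs. RM); values are invented.
reads = [(0, 1), (2, 0), (0, 0), (None, 1), (3, None)]
labels = [assign_read(by, rm) for by, rm in reads]
```

Only reads labeled "BY" or "RM" would count toward allele-specific expression in such a scheme; reads that match both genomes equally well carry no allelic information.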

The BY reference genome was downloaded from the SGD project "Saccharomyces Genome Database" (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains, 309 from the BY strain and 584 from the RM strain, detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \times e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1; CRAN) with the methodology described by McManus et al. (2010).
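As a worked illustration of the decomposition above, the following sketch computes the cis and trans components on the log2 scale from allele-specific read counts. The function name and argument names are hypothetical, and dividing the coculture ratio by the gDNA-based RM/BY ratio is only one plausible way to apply the normalization parameter; the likelihood ratio test is not reproduced here.

```python
import math

def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_rm_over_by=1.0):
    """Return log2 cis, trans, and coculture ratios for one gene.

    rm_hy, by_hy: reads assigned to the RM and BY alleles in the hybrid.
    rm_co, by_co: reads assigned to RM and BY in the coculture.
    gdna_rm_over_by: assumed normalization parameter, the RM/BY ratio of
        genomic DNA reads in the coculture (1.0 means equal cell numbers).
    All counts are assumed to be positive.
    """
    e_cis = rm_hy / by_hy                        # e_cis = e_Hy
    e_co = (rm_co / by_co) / gdna_rm_over_by     # cell-number corrected
    e_trans = e_co / e_cis                       # multiplicative model
    return {
        "log2_cis": math.log2(e_cis),
        "log2_trans": math.log2(e_trans),
        "log2_co": math.log2(e_co),
    }

# Toy counts, invented for illustration: a twofold cis and a twofold
# trans effect add up to a fourfold parental difference.
gene = cis_trans_components(rm_hy=200, by_hy=100, rm_co=400, by_co=100)
```

On the log2 scale the two components add up to the parental difference, which is what makes the additive reading of the multiplicative model convenient.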


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
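The two calculations named in this paragraph, SNP density and a hypergeometric enrichment P value, can be sketched as follows. This is a generic illustration and not FunSpec's code; the function names and the toy numbers are assumptions.

```python
from math import comb

def snp_density(n_snps, transcribed_length_bp):
    """SNPs between RM and BY per bp of transcribed region."""
    return n_snps / transcribed_length_bp

def hypergeom_enrichment_p(k, list_size, category_size, genome_size):
    """P(X >= k): chance that a random gene list of size list_size shares
    at least k genes with a category of size category_size."""
    upper = min(list_size, category_size)
    favorable = sum(
        comb(category_size, i) * comb(genome_size - category_size, list_size - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(genome_size, list_size)

def bonferroni(p_value, n_categories_tested):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_categories_tested)

# Toy example: all 5 listed genes fall into a 5-gene category, 10 genes total.
p_raw = hypergeom_enrichment_p(5, 5, 5, 10)
p_adj = bonferroni(p_raw, 20)
density = snp_density(12, 1500)
```

With these toy numbers only one of the 252 possible gene lists overlaps the category completely, so the uncorrected P value is 1/252.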

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10; last accessed May 2013). We only considered genes with significant trans effects present in one of the coexpression modules and with at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values for $\log_2(e_{trans})$ of all possible gene pairs that either share a common regulator but belong to different modules or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
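The pairwise comparison described above can be sketched as follows. The data layout, the function name, and the use of the absolute difference in $\log_2(e_{trans})$ as the compared quantity are assumptions for illustration; "same module(s)" is read here as sharing at least one module.

```python
from itertools import combinations

def trans_pair_differences(genes):
    """Split gene pairs into the two groups compared in the text.

    genes: dict mapping gene name to a dict with keys
        "log2_trans" (float), "tfs" (set of regulators), "modules" (set).
    Returns two lists of absolute log2(e_trans) differences:
        shared regulator but different modules, and
        same module(s) but no shared regulator.
    """
    shared_tf_other_module = []
    same_module_no_shared_tf = []
    for a, b in combinations(sorted(genes), 2):
        ga, gb = genes[a], genes[b]
        share_tf = bool(ga["tfs"] & gb["tfs"])
        share_module = bool(ga["modules"] & gb["modules"])
        diff = abs(ga["log2_trans"] - gb["log2_trans"])
        if share_tf and not share_module:
            shared_tf_other_module.append(diff)
        elif share_module and not share_tf:
            same_module_no_shared_tf.append(diff)
    return shared_tf_other_module, same_module_no_shared_tf

# Toy genes with invented names and values.
toy = {
    "g1": {"log2_trans": 1.0, "tfs": {"TF_A"}, "modules": {1}},
    "g2": {"log2_trans": 0.5, "tfs": {"TF_A"}, "modules": {2}},
    "g3": {"log2_trans": -0.5, "tfs": {"TF_B"}, "modules": {1}},
}
by_regulator, by_module = trans_pair_differences(toy)
```

Pairs that share neither a regulator nor a module, or that share both, fall into neither group and are skipped.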

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only the P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as "hybrid"), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as "RM"), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as "BY"). In each of these comparisons, expression levels are considered as "similar" if their difference is not statistically significant or less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1; CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in hybrid and the parental strains are classified as "conserved." Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

"Additive": The expression level in the hybrid lies in between the levels of the parental strains.

"BY dominant": The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

"RM dominant": The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

"Underdominant": The expression level is significantly lower in the hybrid than in both parent strains.

"Overdominant": The expression level is significantly higher in the hybrid than in both parent strains.
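The decision rules listed above can be written as a small function. This is a sketch of the logic as described in the text, not the authors' R implementation: the function name is hypothetical, significance flags are assumed to come from the exact binomial tests after FDR control, expression levels are assumed to be positive normalized values, and the "ambiguous" fall-through label is an addition for cases the six categories do not cover.

```python
def classify_inheritance(hybrid, by, rm, sig_by_rm, sig_by_hy, sig_rm_hy, fold=1.25):
    """Classify the inheritance mode of one gene's expression level.

    hybrid, by, rm: normalized expression levels (assumed > 0).
    sig_*: True when the corresponding comparison is statistically significant.
    Two levels count as different only if the test is significant AND
    they differ by at least `fold`.
    """
    def differs(a, b, significant):
        return significant and max(a, b) / min(a, b) >= fold

    parents_differ = differs(by, rm, sig_by_rm)
    differs_from_by = differs(hybrid, by, sig_by_hy)
    differs_from_rm = differs(hybrid, rm, sig_rm_hy)

    if not (parents_differ or differs_from_by or differs_from_rm):
        return "conserved"
    if differs_from_by and differs_from_rm:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"          # hybrid lies between the two parents
    if differs_from_rm:
        return "BY dominant"       # similar to BY, different from RM
    if differs_from_by:
        return "RM dominant"       # similar to RM, different from BY
    return "ambiguous"             # parents differ, hybrid similar to both

# Toy expression levels, invented for illustration.
examples = {
    "conserved": classify_inheritance(100, 100, 100, False, False, False),
    "additive": classify_inheritance(150, 100, 200, True, True, True),
    "BY dominant": classify_inheritance(100, 100, 200, True, False, True),
    "RM dominant": classify_inheritance(200, 100, 200, True, True, False),
    "overdominant": classify_inheritance(300, 100, 200, True, True, True),
    "underdominant": classify_inheritance(50, 100, 200, True, True, True),
}
```

Each toy case is constructed so that the returned label matches its dictionary key.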


Sekhon JS 2011 Multivariate and propensity score matching softwarewith automated balance optimization the matching package for RJ Stat Softw 42(7)1ndash52

Stark C Breitkreutz BJ Reguly T Boucher L Breitkreutz A Tyers M 2006BioGRID a general repository for interaction datasets Nucleic AcidsRes 34(Database issue)D535ndashD539

Takahasi KR Matsuo T Takano-Shimizu-Kouno T 2011 Twotypes of cis-trans compensation in the evolution of tran-scriptional regulation Proc Natl Acad Sci U S A 108(37)15276ndash15281

Teixeira MC Monteiro P Jain P Tenreiro S Fernandes AR Mira NPAlenquer M Freitas AT Oliveira AL Sa-Correia I 2006 TheYEASTRACT database a tool for the analysis of transcription regu-latory associations in Saccharomyces cerevisiae Nucleic Acids Res34(Database issue)D446ndashD451

Tirosh I Barkai N 2008 Two strategies for gene regulation by promoternucleosomes Genome Res 18(7)1084ndash1091

Tirosh I Reikhav S Levy AA Barkai N 2009 A yeast hybrid providesinsight into the evolution of gene expression regulation Science324(5927)659ndash662

Tirosh I Sigal N Barkai N 2010 Divergence of nucleosome positioningbetween two closely related yeast species genetic basis and func-tional consequences Mol Syst Biol 6365

Tirosh I Weinberger A Bezalel D Kaganovich M Barkai N 2008 On therelation between promoter divergence and gene expression evolu-tion Mol Syst Biol 4159

Tsankov AM Thompson DA Socha A Regev A Rando OJ 2010 Therole of nucleosome positioning in the evolution of gene regulationPLoS Biol 8(7)e1000414

Wittkopp PJ 2005 Genomic sources of regulatory variation in cis and intrans Cell Mol Life Sci 62(16)1779ndash1783

Wittkopp PJ Haerum BK Clark AG 2004 Evolutionary changes in cisand trans gene regulation Nature 430(6995)85ndash88

Wittkopp PJ Haerum BK Clark AG 2008a Independent effects of cis-and trans-regulatory variation on gene expression in Drosophilamelanogaster Genetics 178(3)1831ndash1835

Wittkopp PJ Haerum BK Clark AG 2008b Regulatory changes under-lying expression differences within and between Drosophila speciesNat Genet 40(3)346ndash350

Yvert G Brem RB Whittle J Akey JM Foss E Smith EN Mackelprang RKruglyak L 2003 Trans-acting regulatory variation in Saccharomycescerevisiae and the role of transcription factors Nat Genet35(1)57ndash64

2133

Selective Constraints on Gene Regulatory Changes doi101093molbevmst114 MBE at A

cademia SinicaL

ife Science Library on A

ugust 22 2013httpm

beoxfordjournalsorgD

ownloaded from

Transcriptome Sequencing

To sequence the transcriptomes of hybrid and cocultured yeast strains, RNA was extracted using the Hot Acidic Phenol method (Kohrer and Domdey 1991) and subjected to mRNA-seq library preparation using the Illumina TruSeq mRNA-seq Sample Prep kit with some modifications. Briefly, 5 µg of total RNA from each sample was used to purify polyA-RNA, and the mRNA was fragmented by heat at 94 °C for 8 min. Double-stranded cDNA was synthesized by random priming, end repaired, and ligated to the Y-shaped TruSeq adaptors. Samples were cleaned with AMPure beads (Agencourt) and split into two halves, which were independently assembled into two polymerase chain reaction (PCR) reactions, one using the PCR reagents provided in the Illumina kit and the other using the KAPA PCR reagent (KAPA HiFi HotStart ReadyMix). All reactions went through 12 cycles of PCR amplification and were cleaned with AMPure beads to remove primer dimers. The purified products were quantified by Qubit (Invitrogen) and BioAnalyzer 2100 High Sensitivity DNA Assay (Agilent). The library profiles showed a wide spectrum of fragment sizes ranging from 220 to 500 bp, with a peak at approximately 285 bp. Each of the four libraries (two libraries amplified from each of the coculture and hybrid experiments) was put in one lane of PE sequencing on Illumina GA IIx (IGA-IIx) in the High Throughput Sequencing Core Facility of Academia Sinica, Taiwan. The sequencing data were processed by CASAVA 1.8.2 to generate pass-filtered reads for downstream analyses.

Whole-Genome Sequencing

To quantify the relative cell numbers of the two yeast strains in coculture, genomic DNA (gDNA) from the cocultured yeast sample was sequenced (Emerson et al. 2010). The gDNA was extracted using the Qiagen Q100 Genomic Purification Kit (Qiagen), and 1 µg of the gDNA was sonicated to fragments of approximately 300–400 bp. Using the Illumina Paired-End (PE) DNA Sample Prep Kit, the fragments were end-repaired, A-tailed, and ligated to the PE adaptors. To control the precise fragment size for downstream mapping, the ligation product was fractionated on agarose gel, and fragments ranging from 400 to 500 bp in length were gel purified and amplified using the KAPA PCR kit (same as mentioned earlier). The library was cleaned using AMPure beads and assayed on Qubit and BioAnalyzer 2100. The library showed a narrow distribution with a peak at approximately 437 bp and was sequenced in the same manner as the mRNA-seq libraries.

Mapping Reads to the Reference Genomes

To map sequencing reads to the reference genomes and obtain genome-specific reads, the software tool ASAP (“Allele-Specific Alignment Pipeline”) was downloaded from the Bioinformatics Group at the Babraham Institute (http://www.bioinformatics.bbsrc.ac.uk/projects/, last accessed September 2012). To determine whether a given sequence matches one of the two reference genomes specifically, it performs alignments against both sequences in parallel using the Bowtie program (Langmead et al. 2009). In our analysis, the seed length was set to 40, and the maximum number of mismatches permitted in the seed was set to 2.

The BY reference genome was downloaded from the SGD project “Saccharomyces Genome Database” (http://downloads.yeastgenome.org/sequence/S288C_reference/genome_releases/, last accessed April 2008). The RM reference genome was downloaded from the Saccharomyces cerevisiae RM11-1a Sequencing Project, Broad Institute (http://www.broadinstitute.org/annotation/genome/saccharomyces_cerevisiae/, last accessed April 2008). In addition, 893 error sites against our strains (309 from the BY strain and 584 from the RM strain), detected previously (Emerson et al. 2010), were corrected in our updated reference genomes.

From the two channels of cDNA IGA-IIx sequencing for each of the two samples, we obtained in total 60,909,895 and 52,151,477 raw reads from the hybrid sample and the coculture sample, respectively. Genes that are known to be mating-type specific or have been found to be differentially expressed between mating types or between haploids and diploids were excluded from all further analyses (Galitski et al. 1999; Tirosh et al. 2009).

Assigning Gene Expression Differences to Cis- or Trans-Regulatory Changes

The ASE ratios and their cis/trans contributions were estimated as previously described (Emerson et al. 2010). To account for the difference between cell numbers of BY and RM in coculture, we estimated a normalization parameter based on the gDNA ratio of the two strains in the coculture experiment.

We calculated the cis-regulatory component ($e_{cis}$) of gene expression differences as the ratio of the reads mapped to the RM genome and those mapped to the BY genome in the hybrid sample: $e_{cis} = e_{Hy} = RM_{Hy}/BY_{Hy}$, where $RM_{Hy}$ is the expression level of the RM allele in the hybrid, $BY_{Hy}$ is the expression level of the BY allele in the hybrid, and $e_{Hy}$ is the ASE ratio in the hybrid. The expression difference between the two parental strains can be attributed to both cis and trans effects (supplementary fig. S1, Supplementary Material online). We assume that cis and trans effects are multiplicative (additive if the logarithm of the ASE ratios is considered). Thus, $e_{Co} = e_{trans} \cdot e_{cis}$ and $\log_2(e_{Co}) = \log_2(e_{trans}) + \log_2(e_{cis})$, or $e_{trans} = e_{Co}/e_{cis} = (RM_{Co}/BY_{Co})/(RM_{Hy}/BY_{Hy})$, where $RM_{Co}$ is the expression level of the RM allele in the coculture, $BY_{Co}$ is the expression level of the BY allele in the coculture, and $e_{Co}$ is the ASE ratio in the coculture. We obtain the P value of a hypothesis test using the likelihood ratio test with one degree of freedom under the null hypotheses of $e_{cis} = 1$ and $e_{trans} = 1$. The FDR cutoff for each test was set to 2.5% to give a combined FDR of approximately 5% (Benjamini and Hochberg 1995). Under the above formulation, genes are sorted into five categories using R (v. 2.14.1, CRAN) with the methodology described by McManus et al. (2010).
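The decomposition above can be illustrated with a minimal sketch. The function name, the argument names, and the way the gDNA-based normalization parameter is applied (dividing the coculture ratio by the RM/BY gDNA ratio) are assumptions made for illustration only; the text states that such a parameter was estimated but not how it enters the calculation.

```python
import math

def cis_trans_components(rm_hy, by_hy, rm_co, by_co, gdna_rm_over_by=1.0):
    """Return (log2 e_cis, log2 e_trans, log2 e_Co) for one gene.

    rm_hy, by_hy: allele-specific read counts in the hybrid (must be > 0).
    rm_co, by_co: allele-specific read counts in the coculture (must be > 0).
    gdna_rm_over_by: hypothetical normalization parameter, the RM/BY gDNA
        ratio in the coculture (assumed here to divide the coculture ratio).
    """
    e_cis = rm_hy / by_hy                        # e_cis = e_Hy = RM_Hy / BY_Hy
    e_co = (rm_co / by_co) / gdna_rm_over_by     # cell-number-corrected e_Co
    e_trans = e_co / e_cis                       # e_trans = e_Co / e_cis
    return math.log2(e_cis), math.log2(e_trans), math.log2(e_co)

# Toy counts: a 2-fold cis effect and a 2-fold trans effect give a 4-fold
# total difference between the parental strains.
log_cis, log_trans, log_co = cis_trans_components(200, 100, 400, 100)
```

On the log scale the two components add up to the total coculture difference, which is the multiplicative assumption stated in the text.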


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number of its SNPs between RM and BY divided by the length of the transcribed region in bp. The promoter region was defined as 500 bp upstream of the transcription start site. Promoter regions with a putative loss or gain of a transcription factor binding site (D. Wang, unpublished data) were excluded from this analysis. Gene classification into those with TATA box-containing promoter regions and TATA-less genes was taken from Basehoar et al. (2004). Gene sets of DPN genes (genes with a well-defined nucleosome-free region close to the transcription start site) and OPN genes (genes without such a region) were defined as in Tirosh et al. (2008). Enrichment in specific biological processes or cellular components in the GO annotation was analyzed using the FunSpec analysis tool with Bonferroni correction for multiple testing (Robinson et al. 2002). It uses the hypergeometric distribution to calculate the probability (P value) that the intersection of a given gene list with any given functional category occurs by chance.
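As an illustration of the hypergeometric calculation described above, the following sketch computes the probability of observing at least a given overlap between a gene list and a functional category. It is a generic upper-tail hypergeometric test with a Bonferroni adjustment, not FunSpec's own code; the function names and the toy numbers are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_total, n_category, n_list, n_overlap):
    """P(overlap >= n_overlap) when n_list genes are drawn without
    replacement from n_total genes, n_category of which are annotated
    to the functional category."""
    upper = min(n_category, n_list)
    favourable = sum(
        comb(n_category, k) * comb(n_total - n_category, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_total, n_list)

def bonferroni(p_value, n_categories_tested):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_categories_tested)

# Toy example: 10 genes in total, 3 in the category, a list of 4 genes,
# all 3 category genes fall in the list.
p_raw = hypergeom_enrichment_p(10, 3, 4, 3)
p_adj = bonferroni(p_raw, 20)
```

Summing the upper tail rather than a single point probability is the usual convention for enrichment tests, because any overlap at least as large as the observed one counts as equally or more extreme.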

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcription factors and their target genes from the YEASTRACT database (Teixeira et al. 2006). Gene expression modules, defined as groups of coexpressed genes under several conditions (Ihmels et al. 2002), were downloaded from the Barkai lab website (http://barkai-serv.weizmann.ac.il/Modules/page/details.html, level 10; last accessed May 2013). We only considered genes with significant trans effects that are present in one of the coexpression modules and have at least one known transcription factor regulating the gene. Genes were also required to have consistent expression differences between our old and our new data set for hybrid and coculture (either the BY or the RM allele must be higher in both data sets). Among these genes, we compared the values of $\log_2(e_{trans})$ for all possible gene pairs that either share a common regulator but belong to different modules, or belong to the same module(s) but are not known to be regulated by any identical transcription factor.
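The pairing scheme in this paragraph can be sketched as follows. The data layout, the function name, the use of the absolute difference in log2 trans effect as the per-pair comparison, and the reading of "same module(s)" as "sharing at least one module" are all assumptions made for illustration; the text does not specify these details.

```python
from itertools import combinations

def pair_differences(genes):
    """Split gene pairs into the two classes described in the text.

    genes: dict mapping a gene name to a tuple
        (log2_e_trans, set_of_regulators, set_of_modules).
    Returns two lists of (gene_a, gene_b, |difference in log2 e_trans|):
      - pairs sharing a regulator but no module,
      - pairs sharing a module but no known regulator.
    """
    shared_tf_only, shared_module_only = [], []
    for (a, (ta, regs_a, mods_a)), (b, (tb, regs_b, mods_b)) in combinations(genes.items(), 2):
        diff = abs(ta - tb)
        common_tf = bool(regs_a & regs_b)
        common_module = bool(mods_a & mods_b)
        if common_tf and not common_module:
            shared_tf_only.append((a, b, diff))
        elif common_module and not common_tf:
            shared_module_only.append((a, b, diff))
    return shared_tf_only, shared_module_only

# Hypothetical toy input: gene names, regulators, and modules are made up.
toy = {
    "G1": (1.0, {"TF_A"}, {"M1"}),
    "G2": (0.5, {"TF_A"}, {"M2"}),
    "G3": (-0.2, {"TF_B"}, {"M1"}),
}
tf_pairs, module_pairs = pair_differences(toy)
```

Pairs that share both a regulator and a module, or neither, fall into no class here, which keeps the two groups cleanly separated.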

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in different samples, we calculate the RNA ratios for BY and RM by dividing the total number of mapped RNA reads for each of the two strains by that of the hybrid, after removing outliers with extreme ratios (i.e., values below the 2.5% quantile or above the 97.5% quantile). The total expression for each gene is normalized by dividing the number of mapped mRNA reads by the respective RNA ratio; for example, the mRNA reads of each gene for RM are divided by the RNA ratio RM/Hy. The parental and hybrid data sets are analyzed for evidence of differential expression using the exact binomial test. We set the FDR cutoff to 1.696%, and only P values below this threshold are considered significant, so that the probability of a false positive in one of three comparisons (discussed later) is 5% (Benjamini and Hochberg 1995). We determine the mode of expression level inheritance for a gene by comparing the following three expression levels: 1) the total expression level in the hybrid (referred to as “hybrid”), 2) the expression level of the gene in the parental RM strain (measured as the expression level of the RM allele in coculture, referred to as “RM”), and 3) the expression level of this gene in the parental BY strain (measured as the expression level of the BY allele in coculture, referred to as “BY”). In each of these comparisons, expression levels are considered “similar” if their difference is not statistically significant or is less than 1.25-fold. Genes are categorized into six different inheritance modes using R (v. 2.14.1, CRAN) according to the classification of McManus et al. (2010): conserved, additive, BY dominant, RM dominant, overdominant, and underdominant (supplementary fig. S3, Supplementary Material online). Genes with similar total expression levels in the hybrid and the parental strains are classified as “conserved.” Genes with more than 25% expression difference and with a significant exact binomial test in at least one of the three comparisons (BY-RM, BY-hybrid, and RM-hybrid) are classified as nonconserved and further assigned to one of the five nonconserved categories:

“Additive”: The expression level in the hybrid lies in between the levels of the parental strains.

“BY dominant”: The expression level of the hybrid is similar to the parental BY strain but significantly different from RM.

“RM dominant”: The expression level of the hybrid is similar to the parental RM strain but significantly different from BY.

“Underdominant”: The expression level is significantly lower in the hybrid than in both parent strains.

“Overdominant”: The expression level is significantly higher in the hybrid than in both parent strains.
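The classification rules listed above can be written as a small decision function. This is a sketch under stated assumptions, not the R code used in the study: the function names are hypothetical, the significance flags are taken as given (the exact binomial test itself is not reproduced), and the "ambiguous" fallback for genes that fit none of the listed definitions is an addition made here. The helper `per_test_cutoff` shows one way a per-comparison threshold can be derived so that the chance of at least one false positive in three independent comparisons is 5%; it gives a value of about 1.7%, close to the cutoff quoted in the text.

```python
def per_test_cutoff(overall=0.05, n_tests=3):
    """Per-comparison threshold such that 1 - (1 - cutoff)**n_tests == overall
    (assumes independent comparisons)."""
    return 1 - (1 - overall) ** (1 / n_tests)

def similar(x, y, significant, fold=1.25):
    """Two expression levels are 'similar' if the difference is not
    significant or is smaller than the fold-change threshold."""
    lo, hi = min(x, y), max(x, y)
    if lo == 0:
        ratio = float("inf") if hi > 0 else 1.0
    else:
        ratio = hi / lo
    return (not significant) or ratio < fold

def inheritance_mode(hybrid, by, rm, sig_by_rm, sig_by_hy, sig_rm_hy):
    """Classify one gene from normalized expression levels and the three
    significance flags (BY vs RM, BY vs hybrid, RM vs hybrid)."""
    by_rm = similar(by, rm, sig_by_rm)
    by_hy = similar(by, hybrid, sig_by_hy)
    rm_hy = similar(rm, hybrid, sig_rm_hy)
    if by_rm and by_hy and rm_hy:
        return "conserved"
    if not by_hy and not rm_hy:
        if hybrid > max(by, rm):
            return "overdominant"
        if hybrid < min(by, rm):
            return "underdominant"
        return "additive"
    if by_hy and not rm_hy:
        return "BY dominant"
    if rm_hy and not by_hy:
        return "RM dominant"
    return "ambiguous"  # fallback added for this sketch only

cutoff = per_test_cutoff()
mode = inheritance_mode(150, 100, 200, True, True, True)
```

For example, a hybrid level of 150 that differs significantly from both a BY level of 100 and an RM level of 200 lies between the parents and is therefore classified as additive.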

Selective Constraint Analysis

The cis and trans components of within-species differences (labeled here as $p_{cis}$ and $p_{trans}$ to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components of the divergence between S. cerevisiae and S. paradoxus (labeled here as $d_{cis}$ and $d_{trans}$ to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio $p/(p + d)$ for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set’s median) against no known interaction partner; 2) sequence divergence, measured as the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution between S. cerevisiae and S. paradoxus (genes split at the data set’s median of 0.0925); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio $p_{cis}/(p_{cis} + d_{cis})$ (or $p_{trans}/(p_{trans} + d_{trans})$) was then compared between categories with strong selective constraint proxies (low substitution rate ratio, high connectivity, and high essentiality) and weak expected selective constraint (high substitution rate ratio, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
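A minimal sketch of the comparison described above follows. Taking the components as absolute magnitudes before forming the ratio is an assumption made here, as are the function names and the toy values. The second function computes only the Mann–Whitney U statistic, which is equivalent to the Wilcoxon rank-sum statistic; it does not compute a P value and does not reproduce the bootstrap Kolmogorov–Smirnov test.

```python
def polymorphism_fraction(p, d):
    """Ratio p / (p + d) for one gene, using absolute magnitudes of the
    polymorphism (p) and divergence (d) components. Raises ValueError
    when both components are zero, because the ratio is undefined."""
    p, d = abs(p), abs(d)
    if p + d == 0:
        raise ValueError("ratio undefined when p and d are both zero")
    return p / (p + d)

def mann_whitney_u(x, y):
    """U statistic for sample x against sample y: the number of (xi, yj)
    pairs with xi > yj, counting ties as one half."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)

# Hypothetical toy values of p/(p + d) for weakly and strongly constrained genes.
weak = [0.8, 0.7, 0.9]
strong = [0.2, 0.3, 0.75]
u_weak = mann_whitney_u(weak, strong)
frac = polymorphism_fraction(0.6, -0.2)
```

A U value close to the number of pairs (here 9) indicates that the ratios in the weakly constrained group tend to exceed those in the strongly constrained group, which is the direction of the excess of polymorphism over divergence discussed for the trans component.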

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell. 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast. 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature. 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science. 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics. 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science. 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science. 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science. 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics. 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene. 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast. 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics. 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One. 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics. 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the Matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science. 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature. 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics. 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.


Sequence Analysis and GO Term Enrichment

The SNP density of each gene was calculated as the number ofits SNPs between RM and BY divided by the length of thetranscribed region in bp The promoter region was defined as500 bp upstream of the transcription start site Promoterregions with a putative loss or gain of a transcription factorbinding site (D Wang unpublished data) were excluded fromthis analysis Gene classification into those with TATA boxcontaining promoter regions and TATA-less genes was takenfrom Basehoar et al (2004) Gene sets of DPN genes (geneswith a well-defined nucleosome-free region close to the tran-scription start site) and OPN genes (genes without such aregion) were defined as in Tirosh et al (2008) Enrichment inspecific biological processes or cellular components in the GOannotation was analyzed using the FunSpec analysis tool withBonferroni correction for multiple testing (Robinson et al2002) It uses the hypergeometric distribution to calculatethe probability (P value) that the intersection of a givengene list with any given functional category occurs by chance

Transcription Factors and Gene Expression Modules

We obtained regulatory associations between transcriptionfactors and their target genes from the YEASTRACT database(Teixeira et al 2006) Gene expression modules as groups ofcoexpressed genes under several conditions (Ihmels et al2002) were downloaded from the Barkai lab website(httpbarkai-servweizmannacilModulespagedetailshtmllevel 10 last accessed May 2013) We only considered geneswith significant trans effects present in one of the co-expres-sion modules and with at least one known transcriptionfactor regulating the gene Genes were also required tohave consistent expression differences between our old andour new data set for hybrid and coculture (either the BY orthe RM allele must be higher in both data sets) Among thesegenes we compared the values for log2(etrans) of all possiblegene pairs which either share a common regulator but belongto different modules or which belong to the same module(s)but are not known to be regulated by any identical transcrip-tion factor

Inheritance Mode Classification

To account for the unequal total amounts of mRNA in dif-ferent samples we calculate the RNA ratios for BY and RM bydividing the total number of mapped RNA reads for each ofthe two strains by that of the hybrid after removing outlierswith extreme ratios (ie values below the 25 quantile orabove the 975 quantile) The total expression for each geneis normalized by dividing the number of mapped mRNAreads by the respective RNA ratio for example the mRNAreads of each gene for RM are divided by the RNA ratioRMHy The parental and hybrid data sets are analyzed forevidence of differential expression using the exact binomialtest We set the FDR cutoff to 1696 and only the P valuesbelow this threshold are considered significant so that theprobability of a false positive in one of three comparisons(discussed later) is 5 (Benjamini and Hochberg 1995)We determine the mode of expression level inheritance for

a gene by comparing the following three expression levels 1)the total expression level in the hybrid (referred to as ldquohy-bridrdquo) 2) the expression level of the gene in the parental RMstrain (measured as the expression level of the RM allele incoculture referred to as ldquoRMrdquo) and 3) the expression level ofthis gene in the parental BY strain (measured as the expres-sion level of the BY allele in coculture referred to as ldquoBYrdquo) Ineach of these comparisons expression levels are considered asldquosimilarrdquo if their difference is not statistically significant or lessthan 125-fold Genes are categorized into six different inher-itance modes using R (v 2141 CRAN) according to the clas-sification of McManus et al (2010) conserved additive BYdominant RM dominant overdominant and underdomi-nant (supplementary fig S3 Supplementary Materialonline) Genes with similar total expression levels in hybridand the parental strains are classified as ldquoconservedrdquo Geneswith more than 25 expression difference and with a signif-icant exact binomial test in at least one of the three compar-isons (BY-RM BY-hybrid and RM-hybrid) are classified asnonconserved and further assigned to one of the five noncon-served categories

ldquoAdditiverdquo The expression level in the hybrid lies in be-tween the levels of the parental strains

ldquoBY dominantrdquo The expression level of the hybrid is sim-ilar to the parental BY strain but significantly differentfrom RM

ldquoRM dominantrdquo The expression level of the hybrid issimilar to the parental RM strain but significantly dif-ferent from BY

ldquoUnderdominantrdquo The expression level is significantlylower in the hybrid than in both parent strains

ldquoOverdominantrdquo The expression level is significantlyhigher in the hybrid than in both parent strains

Selective Constraint Analysis

The cis and trans components in within-species differences (labeled here as pcis and ptrans to represent polymorphism) were derived from our own data set as previously described (Emerson et al. 2010). The cis and trans components for the divergence between S. cerevisiae and S. paradoxus (labeled here as dcis and dtrans to represent divergence) were obtained from Tirosh et al. (2009). To analyze different constraints in trans and cis changes, we compared the ratio p/(p + d) for the genes in two different categories of expected weak or strong selective constraint. Three such comparisons were executed, with gene groupings according to three different characteristics: 1) the connectivity in PPI networks (Stark et al. 2006; Collins et al. 2007), contrasting “high connectivity” (more than four known interaction partners, which is the data set’s median) against no known interaction partner; 2) sequence divergence, measured as the ratio of the rate of nonsynonymous substitution to the rate of synonymous substitution between S. cerevisiae and S. paradoxus (genes below the data set’s median of 0.0925 vs. the remaining genes); and 3) essentiality (essential genes, which are lethal when knocked out, vs. nonessential genes, defined as those genes exhibiting a fitness of more than 0.85 in knock-out experiments) (Deutschbauer et al. 2005). The ratio pcis/(pcis + dcis) (or ptrans/[ptrans + dtrans]) was then compared between categories with strong selective constraint proxies (low nonsynonymous-to-synonymous rate ratio, high connectivity, and high essentiality) and weak expected selective constraint (high nonsynonymous-to-synonymous rate ratio, no known interaction partner, and low essentiality) using the Wilcoxon rank-sum test. Distributions of the values for strongly constrained and weakly constrained categories were tested for equality using the bootstrap version of the Kolmogorov–Smirnov test in R (Sekhon 2011).
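As a rough illustration of the comparison described above, the following Python sketch computes the ratio p/(p + d) per gene and compares two gene groups with a rank-sum test. It is not the authors' R code. It assumes that p and d are non-negative magnitudes of the polymorphism and divergence components, uses a normal approximation without tie or continuity correction for the p-value, and runs on invented toy numbers; the function names are hypothetical. The bootstrap Kolmogorov–Smirnov step is not reproduced.

```python
from math import erfc, sqrt


def polymorphism_fraction(p, d):
    """Return p / (p + d), or None when both components are zero.
    p and d are assumed to be non-negative magnitudes."""
    total = p + d
    return None if total == 0 else p / total


def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney) test.
    Returns the U statistic for x and a two-sided p-value from the
    normal approximation (no tie or continuity correction)."""
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, erfc(abs(z) / sqrt(2))


# Invented (p, d) pairs for two hypothetical gene groups.
strong_pairs = [(0.2, 0.8), (0.3, 0.9), (0.1, 0.6)]
weak_pairs = [(0.6, 0.4), (0.7, 0.3), (0.8, 0.2)]

strong = [polymorphism_fraction(p, d) for p, d in strong_pairs]
weak = [polymorphism_fraction(p, d) for p, d in weak_pairs]

u_stat, p_value = rank_sum_test(strong, weak)

if __name__ == "__main__":
    print("U =", u_stat, "p =", round(p_value, 4))
```

With group sizes as small as in this toy input the normal approximation is crude; an exact test or an established statistics library would be preferable for real data.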

Supplementary Material

Supplementary tables S1–S9 and figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by Academia Sinica, Taiwan, and National Science Council (NSC) grants (99-2628-B-001-009-MY3 and 99-2321-B-001-041-MY2). The authors thank Fu-Jung Yu for experimental assistance, Daryi Wang for providing unpublished data, and Krishna B. Swamy and Nathan Barham for thoughtful discussion of the manuscript. They also thank two anonymous reviewers for constructive criticisms and valuable suggestions.

References

Basehoar AD, Zanton SJ, Pugh BF. 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell. 116(5):699–709.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 57(1):289–300.

Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD. 1998. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast. 14(2):115–132.

Brem RB, Kruglyak L. 2005. The landscape of genetic complexity across 5700 gene expression traits in yeast. Proc Natl Acad Sci U S A. 102(5):1572–1577.

Brem RB, Storey JD, Whittle J, Kruglyak L. 2005. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature. 436(7051):701–703.

Brem RB, Yvert G, Clinton R, Kruglyak L. 2002. Genetic dissection of transcriptional regulation in budding yeast. Science. 296(5568):752–755.

Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics. 12:635.

Chang YW, Robert Liu FG, Yu N, Sung HM, Yang P, Wang D, Huang CJ, Shih MC, Li WH. 2008. Roles of cis- and trans-changes in the regulatory evolution of genes in the gluconeogenic pathway in yeast. Mol Biol Evol. 25(9):1863–1875.

Chen K, van Nimwegen E, Rajewsky N, Siegal ML. 2010. Correlating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae. Genome Biol Evol. 2:697–707.

Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ. 2007. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 6(3):439–450.

Cowles CR, Hirschhorn JN, Altshuler D, Lander ES. 2002. Detection of regulatory variation in mouse genes. Nat Genet. 32(3):432–437.

Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet. 37(5):544–548.

Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 169(4):1915–1925.

Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, Lu HH, Lu MY, Wu SH, Li WH. 2010. Natural selection on cis and trans regulation in yeasts. Genome Res. 20(6):826–836.

Emerson JJ, Li WH. 2010. The genetic basis of evolutionary change in gene expression levels. Philos Trans R Soc Lond B Biol Sci. 365(1552):2581–2590.

Field Y, Fondufe-Mittendorf Y, Moore IK, Mieczkowski P, Kaplan N, Lubling Y, Lieb JD, Widom J, Segal E. 2009. Gene expression divergence in yeast is coupled to evolution of DNA-encoded nucleosome organization. Nat Genet. 41(4):438–445.

Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR. 1999. Ploidy regulation of gene expression. Science. 285(5425):251–254.

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. 2002. Revealing modular organization in the yeast transcriptional network. Nat Genet. 31(4):370–377.

Khan Z, Bloom JS, Amini S, Singh M, Perlman DH, Caudy AA, Kruglyak L. 2012. Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol Syst Biol. 8:602.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science. 188(4184):107–116.

Kohrer K, Domdey H. 1991. Preparation of high molecular weight RNA. Methods Enzymol. 194:398–405.

Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. 2007. Genetic properties influencing the evolvability of gene expression. Science. 317(5834):118–121.

Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics. 171(4):1813–1822.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.

Lemos B, Araripe LO, Fontanillas P, Hartl DL. 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci U S A. 105(38):14471–14476.

Li CM, Tzeng JN, Sung HM. 2012. Effects of cis and trans regulatory variations on the expression divergence of heat shock response genes between yeast strains. Gene. 506(1):93–97.

Magwene PM, Kayikci O, Granek JA, Reininga JM, Scholl Z, Murray D. 2011. Outcrossing, mitotic recombination, and life-history trade-offs shape genome evolution in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 108(5):1987–1992.

McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20(6):816–825.

Mortimer RK, Romano P, Suzzi G, Polsinelli M. 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast. 10(12):1543–1552.

Muller CA, Nieduszynski CA. 2012. Conservation of replication timing reveals global and local regulation of replication origin activity. Genome Res. 22(10):1953–1962.

Ohno S. 1972. Argument for genetic simplicity of man and other mammals. J Hum Evol. 1(6):651–662.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics. 3:35.

Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet. 7(11):862–872.

Ronald J, Akey JM. 2007. The evolution of gene expression QTL in Saccharomyces cerevisiae. PLoS One. 2(7):e678.

Rosin D, Hornung G, Tirosh I, Gispan A, Barkai N. 2012. Promoter nucleosome organization shapes the evolution of gene expression. PLoS Genet. 8(3):e1002579.


Rossouw D, Jacobson D, Bauer FF. 2012. Transcriptional regulation and the diversification of metabolism in wine yeast strains. Genetics. 190(1):251–261.

Sekhon JS. 2011. Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw. 42(7):1–52.

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34(Database issue):D535–D539.

Takahasi KR, Matsuo T, Takano-Shimizu-Kouno T. 2011. Two types of cis-trans compensation in the evolution of transcriptional regulation. Proc Natl Acad Sci U S A. 108(37):15276–15281.

Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, Sa-Correia I. 2006. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34(Database issue):D446–D451.

Tirosh I, Barkai N. 2008. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 18(7):1084–1091.

Tirosh I, Reikhav S, Levy AA, Barkai N. 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science. 324(5927):659–662.

Tirosh I, Sigal N, Barkai N. 2010. Divergence of nucleosome positioning between two closely related yeast species: genetic basis and functional consequences. Mol Syst Biol. 6:365.

Tirosh I, Weinberger A, Bezalel D, Kaganovich M, Barkai N. 2008. On the relation between promoter divergence and gene expression evolution. Mol Syst Biol. 4:159.

Tsankov AM, Thompson DA, Socha A, Regev A, Rando OJ. 2010. The role of nucleosome positioning in the evolution of gene regulation. PLoS Biol. 8(7):e1000414.

Wittkopp PJ. 2005. Genomic sources of regulatory variation in cis and in trans. Cell Mol Life Sci. 62(16):1779–1783.

Wittkopp PJ, Haerum BK, Clark AG. 2004. Evolutionary changes in cis and trans gene regulation. Nature. 430(6995):85–88.

Wittkopp PJ, Haerum BK, Clark AG. 2008a. Independent effects of cis- and trans-regulatory variation on gene expression in Drosophila melanogaster. Genetics. 178(3):1831–1835.

Wittkopp PJ, Haerum BK, Clark AG. 2008b. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet. 40(3):346–350.

Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. 2003. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 35(1):57–64.
