Integrating genotypic and expression data in a segregating mouse population to identify 5-lipoxygenase as a susceptibility gene for obesity and bone traits

Margarete Mehrabian 1, Hooman Allayee 2, Jirina Stockton 1, Pek Yee Lum 3, Thomas A Drake 4, Lawrence W Castellani 1, Michael Suh 1, Christopher Armour 3, Stephen Edwards 3, John Lamb 3, Aldons J Lusis 1,5–7 & Eric E Schadt 3

Forward genetic approaches to identify genes involved in complex traits such as common human diseases have met with limited success. Fine mapping of linkage regions and validation of positional candidates are time-consuming and not always successful. Here we detail a hybrid procedure to map loci involved in complex traits that leverages the strengths of forward and reverse genetic approaches. By integrating genotypic and expression data in a segregating mouse population, we show how clusters of expression quantitative trait loci linking to regions of the genome accurately reflect the underlying perturbation to the transcriptional network induced by DNA variations in genes that control the complex traits. By matching patterns of gene expression in a segregating population with expression responses induced by single-gene perturbation experiments, we show how genes controlling clusters of expression and clinical quantitative trait loci can be mapped directly. We demonstrate the utility of this approach by identifying 5-lipoxygenase as underlying previously identified quantitative trait loci in an F2 cross between strains C57BL/6J and DBA/2J and showing that it has pleiotropic effects on body fat, lipid levels and bone density.

One of the primary goals of forward and reverse genetic approaches is to uncover genes that control biological traits of interest. In forward genetic approaches, human or animal populations are studied to identify genomic regions that cosegregate with traits of interest. Identified regions are finely mapped by genotyping additional markers in larger populations until, ideally, the region is narrowed to a single gene. On the other hand, reverse genetic approaches characterize a gene functionally by perturbing it in an experimental system, thus elucidating its effect on the traits of interest. Forward genetic approaches have been successful for identifying genes underlying mendelian traits, but less so for analyzing common human diseases involving multiple genes and interactions with environmental factors. Reverse genetic approaches provide a straightforward way to assess gene function, but such investigations typically take place in an oversimplified context where interactions between a gene of interest and genetic background are eliminated and where compensatory changes that occur through development can confound the interpretation of results. Furthermore, functional consequences of changes in gene activity are often asymmetric. For example, knockout mice lacking glutathione peroxidase 1 have no obesity- or diabetes-associated phenotype compared with controls, whereas transgenic mice overexpressing glutathione peroxidase 1 do [1].

Several groups recently proposed a combined genetics–gene expression approach to elucidate the genetics of complex traits [2–12]. Molecular profiling of traits that are genetically controlled by loci that also control complex traits (e.g., disease) and that are significantly correlated with such traits can begin to elucidate the portion of the transcriptional network that underlies the phenotypic trait. Patterns of gene expression can be used to infer gene function, given the coregulation of genes of unknown function with genes of known function [13–17]. By matching patterns of gene expression in a segregating population with the expression signature in single-gene perturbation experiments, we show how genes responsible for clusters of expression and clinically relevant quantitative trait loci (QTLs) can be mapped directly.

To illustrate this approach, we focus on mapping a gene underlying QTLs for various traits related to cardiovascular and metabolic diseases in a previously reported BXD intercross [9,18,19]. Specifically, we identified a locus on mouse chromosome 6 with pleiotropic effects on adiposity, plasma lipoprotein levels and bone density [18]. Using an integrative genomics approach involving sequence comparisons, the genetics of gene expression, pattern matching between single-gene perturbation experiments and expression QTL (eQTL) data, and the clinical characterization of mouse models, we show that the pleiotropic metabolic effects of the chromosome 6 locus in the BXD cross can be attributed, at least in part, to the gene encoding 5-lipoxygenase (5-LO; Alox5).

Received 15 February; accepted 21 June; published online 2 October 2005; doi:10.1038/ng1619

1 Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095-1679, USA. 2 Department of Preventive Medicine and Institute for Genetic Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California 90089-9075, USA. 3 Rosetta Inpharmatics, 401 Terry Ave. North, Seattle, Washington 98109, USA. 4 Departments of Pathology and Laboratory Medicine, 5 Microbiology, Immunology and Molecular Genetics and 6 Human Genetics and 7 Molecular Biology Institute, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095-1679, USA. Correspondence should be addressed to E.E.S. ([email protected]).

1224 VOLUME 37 | NUMBER 11 | NOVEMBER 2005 NATURE GENETICS. ARTICLES. © 2005 Nature Publishing Group http://www.nature.com/naturegenetics

RESULTS

Examination of trait and genetic variation in the BXD cross

In the BXD cross, female F2 mice homozygous with respect to the DBA/2J (DBA) allele at the Alox5 locus had higher fat mass, leptin levels, bone density and very-low-density/low-density lipoprotein (VLDL/LDL) levels compared with mice homozygous with respect to the C57BL/6J (B6) allele (Table 1), demonstrating the underlying genetic basis for linkage to this locus on chromosome 6 (Supplementary Fig. 1 online). Because precise positioning of QTLs in a moderately sized cross such as ours is problematic [20], the interval supporting the linkage is large, extending from 85 Mb to 125 Mb on chromosome 6, and contains 331 known genes [21].

We treated gene expression levels from livers of BXD mice as quantitative traits in a standard QTL analysis, as described previously [9]. Of the 23,574 genes on the BXD microarrays used in this analysis, 20,107 genes had eQTLs dispersed throughout the genome with lod scores >2 (P ≈ 0.01). Notably, 1,991 (nearly 10%) of these genes had eQTLs over the 40-Mb linkage region on chromosome 6 described above (Supplementary Table 1 online), whereas only 236 (~1%) would have been expected by chance. This large number of eQTLs at this location, even after normalizing for the number of genes in the region, is extremely significant (P < 1.0 × 10^-16), suggesting that this locus is a hot spot for gene expression activity, as others have noted [12]. We excluded the possibility that this eQTL cluster was a response to the obesity state by constructing a diet-induced obesity (DIO) gene expression signature and showing that the 1,991 genes linked to the chromosome 6 locus were not enriched for genes in this DIO signature (Supplementary Methods online).
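The correspondence between a lod score of 2 and P ≈ 0.01 quoted above can be reproduced with a short calculation. The sketch below assumes, as is common for an F2 intercross but not stated in the text, that each single-locus test has 2 degrees of freedom (additive and dominance effects). Under that assumption the likelihood-ratio statistic 2 ln(10) × lod is compared against a chi-square distribution with 2 d.f., whose tail probability has the closed form exp(−x/2). The function name is illustrative only.

```python
import math


def lod_to_pvalue_2df(lod: float) -> float:
    """Pointwise P value for a lod score, assuming a 2-d.f. chi-square test.

    The likelihood-ratio statistic is x = 2 * ln(10) * lod, and the
    chi-square survival function with 2 d.f. is exp(-x / 2), so the
    P value reduces to 10 ** (-lod).
    """
    if lod < 0:
        raise ValueError("lod score must be non-negative")
    x = 2.0 * math.log(10.0) * lod
    return math.exp(-x / 2.0)


if __name__ == "__main__":
    # A few lod values that appear in the text.
    for lod in (2.0, 3.0, 4.6):
        print(f"lod = {lod:4.1f} -> pointwise P = {lod_to_pvalue_2df(lod):.2e}")
```

These are pointwise values for a single test; they say nothing about genome-wide significance, which would require a correction for the number of loci and traits scanned.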

The genes linked to chromosome 6 were enriched for genes correlated with the metabolic phenotypes that also linked to this locus (e.g., the omental fat mass (OFM) trait; Fig. 1). With respect to the 23,574 genes represented on the microarray, 28% had expression levels correlated with OFM at the 0.05 significance level (Fig. 1b). To estimate empirically the null distribution for these correlations, we permuted the set of 23,574 genes 100 times so that the correlation structure among these genes was preserved. The false discovery rate for correlation between the permuted genes and OFM was 9% (Fig. 1a), indicating that the liver transcriptional network in this case was associated with fat mass traits. Of the 1,991 genes with eQTLs over the chromosome 6 locus, however, 1,177 (59%) were correlated with OFM (Fig. 1c), a significant increase compared with the 28% detected over the entire set of genes (Fisher's exact test, P < 1.0 × 10^-16).
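The permutation procedure described above can be sketched on simulated data. This is one plausible reading of it, not the authors' implementation: the trait values are shuffled across animals while each animal's expression profile is left intact, so gene-to-gene correlations are preserved. The significance call uses the Fisher z approximation at roughly the 0.05 level, and the simulated data, function names and parameters are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, z_crit=1.96):
    """Two-sided test at roughly the 0.05 level via the Fisher z-transform."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    return abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit


def fraction_correlated(trait, genes):
    """Fraction of expression traits significantly correlated with the trait."""
    n = len(trait)
    hits = sum(is_significant(pearson(trait, g), n) for g in genes)
    return hits / len(genes)


def permutation_null(trait, genes, n_perm=100, seed=0):
    """Mean fraction called significant after shuffling the trait across animals."""
    rng = random.Random(seed)
    shuffled = list(trait)
    fractions = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the trait-gene link, keeps gene-gene structure
        fractions.append(fraction_correlated(shuffled, genes))
    return sum(fractions) / n_perm


def simulate(n_mice=40, n_genes=50, seed=1):
    """Hypothetical data: half the genes track the trait, half are noise."""
    rng = random.Random(seed)
    trait = [rng.gauss(0, 1) for _ in range(n_mice)]
    genes = []
    for j in range(n_genes):
        if j < n_genes // 2:
            genes.append([t + rng.gauss(0, 0.5) for t in trait])
        else:
            genes.append([rng.gauss(0, 1) for _ in range(n_mice)])
    return trait, genes


if __name__ == "__main__":
    trait, genes = simulate()
    print("observed fraction:", fraction_correlated(trait, genes))
    print("permuted fraction:", permutation_null(trait, genes))
```

Comparing the observed fraction with the permuted fraction mirrors the 28% versus 9% comparison in the text, although the numbers produced here reflect only the simulated inputs.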

Using the Gene Ontology Biological Process categories, we next tested whether there was enrichment in the set of 1,991 genes linked to the chromosome 6 locus for biological pathways known to be associated with obesity-related traits. Table 2 lists the top 11 pathways over-represented in the set of 1,991 genes. Two of these pathways were fatty acid and amino acid metabolism, processes that are central to energy storage and expenditure. These results represent a degree of pathway coherence that directly implicates a substantial subset of the 1,991 genes linking to chromosome 6 in obesity-related processes.

Refining positional candidates in the chromosome 6 locus

The cluster of eQTL and clinical QTL (cQTL) activity on chromosome 6 suggests that one or more DNA differences in this region between B6 and DBA are the ultimate cause of the QTLs. Because B6 and DBA shared a common ancestor, they will share regions of the genome that are largely identical by descent (IBD) or, more precisely, identical subspecies by descent. Given the availability of B6 and DBA genomic sequences [22,23], it is possible to define precisely the IBD regions between B6 and DBA by examining SNP frequencies. Experimental support that low-frequency SNP regions between these two strains are IBD comes from a recent report showing that nearly 97% of genes with cis-acting eQTLs detected in the BXD cross fell in non-IBD regions between B6 and DBA (P ≤ 10^-300 for enrichment) [24]. This observation supports the ideas that most common variation between strains originates in non-IBD regions and that the SNP activity observed in IBD regions may only rarely give rise to trait variation.

We determined the IBD status of the 40-Mb interval on chromosome 6 using previously described methods [24] (Fig. 2). Of the 331 known genes in this interval, 172 fell in non-IBD regions and 159 were in IBD regions. Genes falling in IBD regions are not considered strong positional candidates for the QTLs. For example, peroxisome proliferator activated receptor gamma (Pparg) falls in an IBD region (Fig. 2). This transcription factor is involved in adipocyte differentiation and insulin sensitivity and is the target of pharmaceutical agonists, such as rosiglitazone, that are used to treat type 2 diabetes. Without knowledge of the IBD status, this gene would have been favored as a positional candidate for the BXD metabolic trait QTLs.
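The IBD classification can be illustrated with a simplified sketch. It assumes fixed, non-overlapping 25-kb windows and the threshold of five SNPs per window given in the Figure 2 legend, and it treats SNP-dense windows as non-IBD, consistent with Pparg (2 SNPs) falling in an IBD region and Alox5 (66 SNPs) in a non-IBD region. The published method [24] uses smoothed windows and may differ in detail; all names and coordinates below are hypothetical.

```python
from collections import Counter

WINDOW_BP = 25_000       # window size taken from the Figure 2 legend
MIN_SNPS_NON_IBD = 5     # SNP-count threshold taken from the Figure 2 legend


def non_ibd_windows(snp_positions, window_bp=WINDOW_BP, min_snps=MIN_SNPS_NON_IBD):
    """Return the indices of windows whose SNP count meets the threshold."""
    counts = Counter(pos // window_bp for pos in snp_positions)
    return {w for w, c in counts.items() if c >= min_snps}


def gene_is_non_ibd(start, end, non_ibd, window_bp=WINDOW_BP):
    """Call a gene non-IBD if any window it overlaps is non-IBD."""
    first, last = start // window_bp, end // window_bp
    return any(w in non_ibd for w in range(first, last + 1))


if __name__ == "__main__":
    # Hypothetical SNP coordinates (bp): five SNPs cluster in the first window.
    snps = [1_000, 2_000, 3_000, 4_000, 5_000, 60_000, 130_000]
    dense = non_ibd_windows(snps)
    print("non-IBD windows:", sorted(dense))
    print("gene A non-IBD:", gene_is_non_ibd(10_000, 30_000, dense))
    print("gene B non-IBD:", gene_is_non_ibd(55_000, 140_000, dense))
```

Calling a gene non-IBD when any overlapping window is SNP-dense is a deliberately permissive choice for this sketch; a stricter rule could require a minimum fraction of the gene to lie in non-IBD windows.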

Table 1 Comparison of clinical traits between BXD F2 and Alox5–/– mice

| Trait | BXD F2 cross: DBA homozygotes, mean | N | BXD F2 cross: B6 homozygotes, mean | N | P | Alox5 mice: Alox5–/–, mean | N | Alox5 mice: Alox5+/+, mean | N | P |
|---|---|---|---|---|---|---|---|---|---|---|
| Total fat pad mass (g) | 3.90 ± 0.50* | 26 | 1.59 ± 0.32* | 19 | 0.000036 | 3.29 ± 0.59* | 5 | 0.86 ± 0.12* | 7 | 0.0001 |
| Omental fat pad mass (g) | 0.40 ± 0.04* | 26 | 0.21 ± 0.04* | 18 | 0.00057 | 0.44 ± 0.06* | 5 | 0.19 ± 0.02* | 7 | 0.0001 |
| Plasma leptin levels (pg/ml)/percent body fat | 12.5/0.6 | 26 | 9.0/0.8 | 19 | 0.0021 | 50.1/10.3 | 5 | 11.0/8.7 | 7 | 0.007 |
| Bone mineral density (mg/ccm) | 300 ± 7.2* | 39 | 275 ± 7.5* | 36 | 0.0021 | 580 ± 7.1* | 5 | 560 ± 6.5* | 7 | 0.03 |
| VLDL/LDL cholesterol (mg/dL) | 13.3 ± 1.1* | 37 | 10.6 ± 0.9* | 35 | 0.033 | 39 ± 2.5* | 5 | 33 ± 1.9* | 7 | 0.008 |
| HDL cholesterol (mg/dL) | 63.3 ± 1.5* | 30 | 54.8 ± 1.5* | 34 | 0.0005 | 86.8 ± 4.5* | 5 | 64.7 ± 2.1* | 7 | 0.0006 |

*Mean ± s.e.m. All mice phenotyped were female. P values represent the significance of the t statistic under the null hypothesis that the difference in mean trait values between the two groups is equal to 0.


Using DNA sequence data from multiple sources [22,23], we estimated that 82.2% of the B6 sequence in the chromosome 6 locus was supported by DBA sequence and that 90.7% of all SNPs identified in transcribed regions of protein-coding genes (coding sequence, putative untranslated regions and acceptor and donor sites) were similarly supported by DBA sequence (Supplementary Methods). Of the 172 genes identified as falling in non-IBD regions, 154 (90%) were represented on the microarray. This set of 154 candidate genes can be further narrowed by requiring the gene to be expressed, to be correlated with the metabolic traits of interest and to have cis-acting eQTL behavior in the BXD cross, as previously proposed by others [2,5,7–11,25,26]. Of the 154 genes, 44 were expressed and gave rise to cis-acting eQTLs with lod scores >2.0. These 44 genes represent the highest-confidence set of candidate gene expression traits for the metabolic phenotypes in this data set, where variations in the gene itself could mediate changes in transcript abundances, which in turn mediate changes in the metabolic traits. Of these 44 candidate genes, only 3 were correlated with the metabolic phenotypes at the 0.01 significance level: Ogg1, Anxa4 and Anubl1. Ogg1 and Anxa4 are supported by two strong cis-acting eQTLs in the chromosome 6 locus (lod scores of 37.4 and 27.4, respectively). But the genetic and gene expression data strongly support the idea that the correlation between the metabolic traits and expression traits for these two genes is an artifact of two closely linked QTLs being in gametic phase disequilibrium, as discussed previously [24] and supported by application of a statistical test to assess such relationships [26]. The third gene, Anubl1, had transcript abundances that were not correlated with the chromosome 6 QTL genotypes, conditional on the metabolic phenotypes, and the correlation between the metabolic phenotypes and QTL genotypes conditional on Anubl1 expression levels was markedly different from zero, supporting the possibility that Anubl1 was reactive to the metabolic phenotypes [26]. Therefore, application of a QTL-mapping strategy [2,8–12] did not identify any causal candidate genes with liver expression associated with cis-acting eQTLs in the chromosome 6 locus.
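The conditional-correlation reasoning applied to Anubl1 can be illustrated with partial correlations on simulated data. This is a simplified stand-in for the statistical test cited above [26], whose details are not given here. The simulation assumes a reactive chain in which genotype L drives the clinical trait T, which in turn drives expression E; under that model the correlation between L and E vanishes once T is accounted for, whereas the correlation between L and T survives conditioning on E. Variable names and effect sizes are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_reactive(n=2000, seed=0):
    """Simulate genotype L -> trait T -> expression E (expression is reactive)."""
    rng = random.Random(seed)
    genotype = [rng.choice((0, 1, 1, 2)) for _ in range(n)]  # F2-like 1:2:1 ratio
    trait = [g + rng.gauss(0, 1) for g in genotype]
    expression = [t + rng.gauss(0, 1) for t in trait]
    return genotype, trait, expression


if __name__ == "__main__":
    L, T, E = simulate_reactive()
    print("r(L, E | T) =", round(partial_corr(L, E, T), 3))  # expected near zero
    print("r(L, T | E) =", round(partial_corr(L, T, E), 3))  # expected clearly nonzero
```

If expression were instead causal for the trait (L to E to T), the two partial correlations would swap roles, which is what makes the pair of conditional tests informative about direction.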

Failure to identify genes using this approach could be due to a number of factors, including polymorphisms in the gene(s) underlying the QTL that affect protein function but not expression and polymorphisms in the relevant genes that affect expression but not in liver tissue. Therefore, we focused on the 32 genes located in non-IBD regions of the chromosome 6 locus that had SNPs leading to codon changes. We searched for SNPs in these 32 genes known to result in amino acid changes and found that a missense mutation in Alox5 altered enzymatic activity (Supplementary Fig. 2 online). Alox5 is also physically centered at the peak of the joint lod score curve for the composite trait [18], making this gene a good positional candidate (Supplementary Fig. 1). In total, 66 SNPs were polymorphic between B6 and DBA in the 51-kb region encompassing Alox5, including 63 intronic SNPs, 1 missense mutation and 2 SNPs in the 3′ untranslated region. The single Alox5 missense mutation in DBA is identical to a V646I substitution that was also identified in strain CAST/Ei [27] and, in recent studies, markedly decreased levels and activity of 5-LO [28]. This suggests that DBA is similar to CAST/Ei (and Alox5–/– mice) in this phenotypic regard. Our search did not identify any other SNPs with known effects on protein function in the other 31 genes.

Associating Alox5 and metabolic traits by pattern matching

By profiling livers from Alox5 knockout (Alox5–/–) mice, we tested whether perturbations of the transcriptional network induced by complete inactivation of Alox5 overlapped with expression traits linked to the chromosome 6 locus. Overlap of these patterns of expression would support the possibility that Alox5 is a susceptibility gene for the metabolic traits (Fig. 3). We identified 444 genes whose expression was different between Alox5–/– and control B6 mice (Supplementary Table 2 online). Of these 444 genes, 104 (23.4%) had eQTLs directly over Alox5 in the BXD cross (Fig. 2 and Supplementary Table 3 online). Because only 44 (~10%) genes in the Alox5–/– signature would be expected to give eQTLs over Alox5 by chance, the ~2.5-fold enrichment is highly significant (P = 3.3 × 10^-17).
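The enrichment described above can be examined with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test on the corresponding 2 × 2 table. The sketch below is an assumption about how such a P value might be obtained, not a statement of the exact procedure the authors used. It takes the 20,107 genes with eQTLs as the universe, the 1,991 genes linked to the Alox5 locus as the marked set, and the 444 signature genes as the draw. Exact integer arithmetic avoids floating-point underflow in the tail.

```python
from math import comb


def hypergeom_upper_tail(universe, linked, signature, overlap):
    """P(X >= overlap) when `signature` genes are drawn without replacement
    from `universe` genes, of which `linked` carry the property of interest."""
    total = comb(universe, signature)
    top = min(signature, linked)
    tail = sum(
        comb(linked, i) * comb(universe - linked, signature - i)
        for i in range(overlap, top + 1)
    )
    return tail / total  # exact big-integer ratio, converted to float once


if __name__ == "__main__":
    # Counts taken from the text; the choice of universe is an assumption.
    universe, linked, signature, overlap = 20_107, 1_991, 444, 104
    expected = signature * linked / universe
    p = hypergeom_upper_tail(universe, linked, signature, overlap)
    print(f"expected overlap = {expected:.1f}")
    print(f"fold enrichment  = {overlap / expected:.2f}")
    print(f"P(X >= {overlap})    = {p:.2e}")
```

The same function could be applied to the 87 signature genes that are both linked to the locus and correlated with OFM, using 1,177 as the marked set, although the value obtained depends on the universe chosen.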

Next we examined the correlation of OFM with the 444 genes in the Alox5–/– signature and found that 58% were correlated (Fig. 1d), a considerable increase over the 28% expected by chance. Of the 104 genes that fell in the Alox5–/– signature set and were linked to Alox5 in the BXD cross, however, 84% (87 genes) were correlated with OFM (Fig. 1e). Overall, only 1,177 (5.9%) of the genes in the BXD data set were linked to the chromosome 6 locus and correlated with OFM, compared with 87 (19.6%) genes in the Alox5–/– signature (Fig. 2). This nearly fourfold enrichment is highly significant (P = 7.3 × 10^-24). We obtained similar results for total fat mass, VLDL/LDL cholesterol, leptin levels and measures of bone density (data not shown).

[Figure 1: five histogram panels, a–e. x axis, absolute value of the correlation; y axis, number of genes. Percentage of genes with significant correlation in each panel: (a) permuted data sets, 9%; (b) all genes on array, 28%; (c) genes linked to Alox5 locus, 59%; (d) Alox5–/– signature genes, 58%; (e) Alox5–/– genes linked to Alox5 locus, 84%.]

Figure 1 Histograms of correlation coefficients computed between OFM and gene expression levels in five different gene sets. The Pearson correlation coefficient was computed between OFM and every gene expression trait in the set of 23,574 genes on the BXD microarray. After taking the absolute value of the correlation coefficients computed for each set, the histograms were plotted with the number of genes on the y axis and the correlation coefficients on the x axis. Highlighted in each plot is the percentage of genes in the given set that had significant correlation coefficients at the 0.05 level. The different distributions represented in each panel were obtained from (a) the permutation procedure (represents the null distribution); (b) the set of 23,574 gene expression traits represented in the BXD data set; (c) the set of 1,991 genes whose eQTLs give lod scores >2 and map to the 18-cM Alox5 interval; (d) the set of 444 genes in the Alox5–/– liver expression signature; and (e) the set of 104 genes from the Alox5–/– liver expression signature whose eQTLs give lod scores >2 in the BXD cross and map to the 18-cM Alox5 interval.

These results provide compelling evidence that Alox5 is one of the genes underlying the metabolic QTLs in the BXD cross. We further tested whether the same level of pathway coherence observed in the set of genes linked to the chromosome 6 locus was also present in the Alox5–/– signature. Table 2 lists the top 11 Gene Ontology Biological Process categories over-represented in the set of genes in the Alox5–/– signature, of which 8 are the same as those over-represented in the chromosome 6 eQTL cluster. Therefore, pathways affected by complete inactivation of 5-LO are the same pathways affected by the 'soft' perturbation induced by the chromosome 6 QTL, which includes a mutation in Alox5 that decreases enzymatic activity.

129 sequence in Alox5–/– mice cannot explain the QTL effects

Alox5–/– mice were originally generated using 129 embryonic stem cells injected into B6 blastocysts, resulting in B6/129 chimeric mice that were then backcrossed to B6 for more than 15 generations. We identified ~14 Mb of 129 genomic sequence flanking the Alox5 locus in Alox5–/– mice (Supplementary Fig. 3 online). Part of this 129 sequence was also IBD with the DBA sequence (Supplementary Fig. 3), potentially confounding interpretation of the phenotypic and gene expression similarities between Alox5–/– and BXD mice with respect to the Alox5 locus. For such confusion to occur, three conditions must hold at a given locus of interest: (i) 129 and DBA must be IBD; (ii) 129 and B6 must not be IBD; and (iii) B6 and DBA must not be IBD. Less than 3.3 Mb of the 14-Mb 129 sequence flanking the Alox5 locus met all three conditions (Supplementary Fig. 3). Of the 331 genes in the 40-Mb chromosome 6 interval, 32 overlapped the 129 regions (Supplementary Fig. 3). None of the corresponding gene expression traits in this set gave rise to cis-acting eQTLs that correlated with the metabolic traits in BXD mice. Of the eight genes in this set that carry SNPs leading to amino acid changes, only those in Alox5 are known to lead to changes in protein activity.
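The three conditions above amount to a simple Boolean filter over genomic windows. The sketch below shows one way to tally the sequence that meets all three, assuming each window has already been assigned pairwise IBD calls among 129, B6 and DBA. The `Window` type, its field names and the example windows are hypothetical and are not taken from Supplementary Fig. 3.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """A stretch of 129-derived sequence with pairwise IBD calls (hypothetical)."""
    length_mb: float
    ibd_129_dba: bool
    ibd_129_b6: bool
    ibd_b6_dba: bool


def is_confounding(w: Window) -> bool:
    """True when (i) 129 and DBA are IBD, (ii) 129 and B6 are not IBD,
    and (iii) B6 and DBA are not IBD."""
    return w.ibd_129_dba and not w.ibd_129_b6 and not w.ibd_b6_dba


def confounded_length_mb(windows) -> float:
    """Total length, in Mb, of the windows that satisfy all three conditions."""
    return sum(w.length_mb for w in windows if is_confounding(w))


if __name__ == "__main__":
    # Made-up windows for illustration only.
    windows = [
        Window(2.0, ibd_129_dba=True, ibd_129_b6=False, ibd_b6_dba=False),
        Window(5.0, ibd_129_dba=True, ibd_129_b6=True, ibd_b6_dba=True),
        Window(1.0, ibd_129_dba=False, ibd_129_b6=False, ibd_b6_dba=False),
    ]
    print("potentially confounding sequence (Mb):", confounded_length_mb(windows))
```

Only windows in which the knockout mice carry a DBA-like haplotype that differs from B6 can mimic the BXD contrast, which is why all three conditions are required together.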

This bioinformatic analysis cannot absolutely exclude the possibility that genes falling in the 129 regions flanking the Alox5 locus contribute to the Alox5–/– liver expression signature or the chromosome 6 eQTL hot spot. To confirm that Alox5 is at least partially responsible for the Alox5–/– expression signature and chromosome 6 QTL, we artificially perturbed the 5-LO pathway in primary human monocytes (Supplementary Methods). 5-LO catalyzes one of the initial steps in the production of leukotrienes from arachidonic acid. LTB4, a powerful chemoattractant, is one of the primary products of this 5-LO–catalyzed reaction [29]. Therefore, we cultured primary human monocytes in medium containing LTB4 to perturb the transcriptional network specific to the 5-LO pathway. Genes whose expression was significantly different between perturbed and control cells comprised the LTB4 expression signature. These genes were significantly enriched in the Alox5–/– liver gene expression signature (P = 0.0000078) and in the set of genes with liver expression values in BXD mice linked to the Alox5 locus and correlated with the metabolic traits (P = 0.00011). This overlap confirms that the Alox5–/– liver gene expression signature and the set of genes linked to the chromosome 6 locus in BXD mice are an important component of the 5-LO pathway that is associated with the metabolic traits.

Table 2 Gene Ontology Biological Process categories represented in the set of genes with eQTLs that link to Alox5 in the BXD cross.

| Category | BXD F2 cross: Rank | P | Bonferroni-corrected P | Alox5–/– mice: Rank | P | Bonferroni-corrected P | Rosiglitazone-treated mice: Rank | P | Bonferroni-corrected P |
|---|---|---|---|---|---|---|---|---|---|
| Electron transport | 1 | 3.04 × 10^-12 | 5.56 × 10^-9 | 1 | <10^-15 | <10^-15 | 4 | <10^-15 | <10^-15 |
| Amino acid metabolism | 2 | 1.91 × 10^-11 | 3.48 × 10^-8 | 3 | <10^-15 | <10^-15 | 18 | 1.50 × 10^-6 | 0.0028 |
| Amine metabolism | 3 | 5.70 × 10^-11 | 1.04 × 10^-7 | 5 | 4.08 × 10^-11 | 2.40 × 10^-8 | 7 | 2.24 × 10^-9 | 4.21 × 10^-6 |
| Amino acid and derivative metabolism | 4 | 7.22 × 10^-11 | 1.32 × 10^-7 | 2 | <10^-15 | <10^-15 | 15 | 9.18 × 10^-7 | 0.0017 |
| Energy derivation by oxidation of organic compounds | 5 | 7.51 × 10^-11 | 1.37 × 10^-7 | 20 | 5.28 × 10^-7 | 0.00097 | 17 | 1.49 × 10^-6 | 0.0028 |
| Fatty acid metabolism | 6 | 8.04 × 10^-11 | 1.47 × 10^-7 | 4 | 1.31 × 10^-11 | 2.40 × 10^-8 | 2 | 1.41 × 10^-11 | 2.65 × 10^-8 |
| Energy pathways | 7 | 8.76 × 10^-11 | 1.6 × 10^-7 | 17 | 1.71 × 10^-7 | 0.00031 | 5 | 2.56 × 10^-10 | 4.80 × 10^-7 |
| Main pathways of carbohydrate metabolism | 8 | 9.80 × 10^-11 | 1.79 × 10^-7 | 25 | 7.36 × 10^-6 | 0.013 | 9 | 6.47 × 10^-8 | 0.00012 |
| Antiapoptosis | 9 | 1.03 × 10^-8 | 1.87 × 10^-5 | 10 | 6.14 × 10^-9 | 1.12 × 10^-5 | NS | NA | NA |
| Negative regulation of apoptosis | 10 | 2.57 × 10^-8 | 4.70 × 10^-5 | 8 | 2.01 × 10^-9 | 3.68 × 10^-6 | NS | NA | NA |
| Negative regulation of programmed cell death | 11 | 2.79 × 10^-8 | 5.10 × 10^-5 | 9 | 2.08 × 10^-9 | 3.80 × 10^-6 | NS | NA | NA |
| Amine catabolism | NS | NA | NA | 6 | 8.75 × 10^-10 | 1.60 × 10^-6 | NS | NA | NA |
| Amino acid catabolism | NS | NA | NA | 7 | 1.89 × 10^-9 | 3.46 × 10^-6 | NS | NA | NA |
| Cofactor metabolism | 29 | 7.34 × 10^-6 | 0.013 | 11 | 7.65 × 10^-9 | 3.46 × 10^-6 | 11 | 1.47 × 10^-7 | 0.00028 |
| Fatty acid β-oxidation | NS | NA | NA | NS | NA | NA | 1 | <10^-15 | <10^-15 |
| Fatty acid oxidation | NS | NA | NA | NS | NA | NA | 3 | 4.43 × 10^-11 | 8.29 × 10^-8 |
| Lipid biosynthesis | 17 | 2.21 × 10^-6 | 0.0041 | NS | NA | NA | 6 | 1.49 × 10^-9 | 2.68 × 10^-6 |
| Alcohol metabolism | 13 | 4.05 × 10^-7 | 0.00076 | 24 | 2.50 × 10^-6 | 0.0047 | 8 | 2.62 × 10^-8 | 4.92 × 10^-5 |
| VLC fatty acid metabolism | NS | NA | NA | NS | NA | NA | 10 | 1.40 × 10^-7 | 0.00026 |

Of the 11 most significant Gene Ontology Biological Process categories represented in the set of genes with eQTLs that link to Alox5 in the BXD cross, 8 are represented from the expression signature of Alox5–/– mice and 5 from that of the rosiglitazone-treated mice. P values represent the significance of the Fisher's exact test statistic under the null hypothesis that the frequency of Gene Ontology Biological Process genes for the functional category indicated is the same between a reference set of 30,102 genes and the set of genes comprising the perturbation signature. NA, not applicable; NS, not significant.
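The relationship between the raw and Bonferroni-corrected P values in Table 2 can be sketched as a multiplication by the number of categories tested, capped at 1. The number of tests is not stated in the text; the value of 1,829 used below is only inferred from the ratio of corrected to raw P in the first BXD row and should be treated as an assumption. Other rows and columns imply slightly different ratios, presumably because of rounding or different category counts per signature.

```python
def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected P value: raw P times the number of tests, capped at 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    N_TESTS_ASSUMED = 1_829  # inferred from Table 2, not reported by the authors
    raw = {
        "Electron transport": 3.04e-12,
        "Fatty acid metabolism": 8.04e-11,
    }
    for category, p in raw.items():
        print(f"{category}: raw P = {p:.2e}, "
              f"corrected P = {bonferroni(p, N_TESTS_ASSUMED):.2e}")
```

The cap at 1 matters only for weakly significant categories; for the top-ranked categories in Table 2 the correction simply shifts the P value by roughly three orders of magnitude.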


Chromosome 6 eQTLs not enriched for Pparg-responsive genes

Pparg is transcribed in diverse tissues and was expressed in B6 and DBA livers (Supplementary Fig. 4 online). Pparg expression in liver was approximately twofold higher in DBA mice than in B6 mice. Given previous eQTL results on the BXD cross [9], Pparg has a complicated eQTL signature, consisting of four eQTLs with lod scores ≥3.0 and nine eQTLs with lod scores ≥1.5, with the most significant eQTL mapping to chromosome 10 at 24 cM with a lod score of 4.6. Despite this strong eQTL signature for Pparg, none of the variation in Pparg expression in the BXD cross can be explained by the chromosome 6 locus, indicating that the expression variation observed in this cross can be attributed to trans-acting genetic factors and other nongenetic factors. We also note there are only two informative SNPs (both in introns) between B6 and DBA in the 130,212-bp genomic sequence containing the Pparg gene in the Celera Mouse Genome Database. Between a full-length cDNA clone for the adipose-specific isoform of Pparg derived from DBA and the complete coverage of the two most 5′ exons of the liver-specific isoform of Pparg in DBA sequence in the Celera Mouse Genome Database, there is 100% coverage of the coding sequence of Pparg by DBA sequence, which perfectly matches the corresponding B6 sequence (from build 33 of the National Center for Biotechnology Information mouse genome assembly).

To assess further whether Pparg could partially explain this linkage of the metabolic phenotypes to chromosome 6, we identified liver-specific gene expression signatures in three independent Pparg perturbation experiments: wild-type B6 mice treated with rosiglitazone (595 genes; signature set E1), Ppara–/– B6 mice treated with rosiglitazone (457 genes; signature set E2) and Ppara-null B6 mice overexpressing Pparg in the liver [30] (313 genes; signature set E3; Supplementary Table 4 online). Genes in set E1 significantly overlapped genes in set E2 (P = 4.97 × 10^-90), indicating that a robust, Pparg-specific expression signature in the liver was induced by treatment with rosiglitazone. In addition, both signature sets E1 and E2 significantly overlapped set E3 (P = 3.60 × 10^-21 and 1.04 × 10^-18, respectively). Taken together, these data indicate that a robust Pparg-specific signature is induced in the livers of mice treated with rosiglitazone.

Despite overlaps among the three independent Pparg perturbation signatures, none of these signatures was enriched for genes with

[Figure 2 graphic. Upper panels: SNP count (smoothed 25-kb windows) along chromosome 6 (85–125 Mb) and an expanded view (116–120 Mb), with Pparg (2 SNPs) and Alox5 (66 SNPs) labeled. Lower panel: of 20,107 genes considered, 1,991 (1,177) are linked to the Alox5 locus; 104 (87) of the 444 genes in the Alox5 signature overlap this set (enrichment P = 3.3 × 10⁻¹⁷ (7.3 × 10⁻²⁴)), and 56 of the 595 genes in the rosiglitazone signature overlap it (enrichment P = 0.68).]

Figure 2 Enrichment of genes linked to the Alox5 locus in the BXD set for genes in the Alox5–/– perturbation signature. The upper panels represent a frequency plot in the genomic region supporting the Alox5 locus for SNPs that are polymorphic between B6 and DBA. The horizontal line represents a threshold of five SNPs per 25-kb region. Regions with fewer than five SNPs in a 25-kb interval were designated IBD between B6 and DBA, as previously described24. Highlighted in the upper panel are Pparg and Alox5 residing in IBD and non-IBD regions, respectively. Shown in the lower panel are overlaps between genes linked to the Alox5 locus and the Alox5 and Pparg perturbation signatures. Of the 20,107 genes with eQTLs with lod scores >2.0, 1,991 (9.9%) were in an 18-cM window encompassing Alox5. Restricting attention to the 444 genes from the Alox5–/– signature, 104 (23.4%) genes have eQTLs with lod scores >2.0 in the 18-cM interval. The Alox5–/– signature is more enriched for genes linked to the Alox5 locus in the BXD set and correlated with OFM (shown in parentheses). Of the 20,107 genes with eQTLs with lod scores >2.0, 1,177 (5.9%) are correlated with OFM at the 0.05 significance level and map to the 18-cM Alox5 interval. Restricting attention to the 444 genes from the Alox5–/– signature, 87 genes (19.6%) have eQTLs with lod scores >2.0 that map to the interval and correlate with OFM at the 0.05 significance level.
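The SNP-density rule described in the Figure 2 caption can be sketched in a few lines. This is only an illustration under stated assumptions: it uses non-overlapping 25-kb bins rather than the smoothed windows of the figure, it treats windows that fall below the five-SNP threshold as IBD (the reading consistent with Pparg, at 2 SNPs, lying in an IBD region), and the function name and SNP positions are hypothetical.

```python
from collections import Counter

WINDOW_BP = 25_000   # window size taken from the Figure 2 caption
SNP_THRESHOLD = 5    # five SNPs per 25-kb window


def classify_windows(snp_positions, start, end,
                     window=WINDOW_BP, threshold=SNP_THRESHOLD):
    """Count SNPs in fixed windows over [start, end) and flag sparse ones.

    Returns a list of (window_start, snp_count, is_ibd) tuples, where
    is_ibd is True when the window holds fewer than `threshold` SNPs.
    """
    counts = Counter(
        (pos - start) // window
        for pos in snp_positions
        if start <= pos < end
    )
    n_windows = -(-(end - start) // window)  # ceiling division
    return [
        (start + i * window, counts.get(i, 0), counts.get(i, 0) < threshold)
        for i in range(n_windows)
    ]


# Hypothetical positions: a SNP-dense first window and a sparse second one.
example = classify_windows(
    [100, 900, 5_000, 12_000, 20_000, 24_999, 30_000], start=0, end=50_000
)
```

In this toy input the first window holds six SNPs and is called non-IBD, whereas the second holds one SNP and is called IBD.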

1228 VOLUME 37 | NUMBER 11 | NOVEMBER 2005 NATURE GENETICS

ARTICLES © 2005 Nature Publishing Group http://www.nature.com/naturegenetics

eQTLs that linked to the Alox5 locus. For example, using the same type of analysis described for the Alox5–/– expression signature, only 56 of the 595 genes represented in signature set E1 had expression values linked to the chromosome 6 locus, which is not significantly different from the 59 expected by chance (Fisher's exact test, P value = 0.68). The P values for enrichment for signature sets E2 and E3 were 0.67 and 0.99, respectively. These data, combined with the fact that Pparg resides in an IBD region of chromosome 6, do not support the idea that Pparg is the gene underlying the chromosome 6 QTLs in the BXD cross.
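The overlap arithmetic behind these enrichment tests can be reproduced approximately with a hypergeometric tail, which corresponds to a one-sided Fisher's exact test. The sketch below is not the authors' code; it assumes that every signature gene is among the 20,107 genes considered and that the test is one-sided, neither of which is stated in the text, and the function names are hypothetical.

```python
from math import comb


def expected_overlap(total, linked, signature):
    """Overlap expected by chance when `signature` genes are drawn from
    `total` genes, of which `linked` map to the locus."""
    return signature * linked / total


def hypergeom_upper_tail(total, linked, signature, overlap):
    """P(X >= overlap) under the hypergeometric null (exact integer math)."""
    denom = comb(total, signature)
    upper = min(linked, signature)
    numer = sum(
        comb(linked, k) * comb(total - linked, signature - k)
        for k in range(overlap, upper + 1)
    )
    return numer / denom


TOTAL, LINKED = 20_107, 1_991  # genes considered; genes linked to the Alox5 locus

# Rosiglitazone signature E1: 56 of 595 genes linked to the locus.
e1_expected = expected_overlap(TOTAL, LINKED, 595)
e1_p = hypergeom_upper_tail(TOTAL, LINKED, 595, 56)

# Alox5 knockout signature: 104 of 444 genes linked to the locus.
alox5_p = hypergeom_upper_tail(TOTAL, LINKED, 444, 104)
```

Under these assumptions the expected overlap for E1 is about 59 genes, as quoted in the text. The tail probabilities need not match the published P values exactly, because the gene universe and test sidedness used by the authors may differ.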

Characterization of Alox5–/– mice for QTL traits in the BXD cross

The expression data provide strong evidence that Alox5, at least in part, explains the variation in the metabolic phenotypes linked to the chromosome 6 locus. If this is the case, Alox5–/– mice should have greater fat mass, bone density, leptin levels and cholesterol levels than wild-type mice. On a chow diet, Alox5–/– females had 32% body fat, as measured by nuclear magnetic resonance, compared with 16% for wild-type mice (Table 1). Notably, this difference was not isolated to one fat pad, as all four depots had greater mass. We observed similar differences on the HFC diet (data not shown). With respect to leptin levels, Alox5–/– mice had greater plasma levels than wild-type mice, and this difference was much greater than that observed in mice from the BXD cross (Table 1). To determine whether these differences resulted from behavioral differences, we measured daily food intake over a 6-d period. Food consumption was not different between Alox5–/– and wild-type mice (4.0 ± 0.27 g d⁻¹ versus 4.3 ± 0.05 g d⁻¹, respectively), and the mice had similar activity levels when observed during the light cycle.

Given the QTLs for plasma lipids and bone density in the BXD cross, we phenotyped Alox5–/– mice for these traits as well. On a chow diet, Alox5–/– mice had elevated total, high-density lipoprotein (HDL) and LDL/VLDL cholesterol levels compared with wild-type mice

[Figure 3 graphic. Panel a: linkage map of chromosome 6 markers (D6Mit200, D6Mit198, D6Mit25, D6Mit44, D6Mit149, D6Mit102, D6Mit16, D6Mit50) with gene G1 highlighted in an F1/F2 cross; tissue is profiled and genes with an eQTL in the linkage region are identified. Panel b: a knockout for the gene of interest and a wild-type reference mouse are obtained, tissue is profiled and the perturbation signature is identified. The perturbation signature is then mapped to the eQTL signature and the significance of the overlap is assessed.]

Figure 3 Intersecting perturbation signatures in gene expression data to map genes for complex traits. (a) QTLs for complex traits in an F2 population are mapped. A genetic locus at G1 is highlighted as linked to a trait of interest in the F2 population. A SNP in G1 is highlighted as the causal variant underlying the complex trait: the green allele leads to decreased G1 activity, whereas the red allele leads to increased G1 activity. Tissues relevant to the complex trait are monitored using microarrays. Expression traits found to be genetically linked to the G1 locus are determined. The network to the left highlights expression traits linked to the G1 locus (blue nodes), in addition to expression traits interacting with genes linked to the G1 locus (white nodes), with G1 denoted by the red node. (b) Tissues from mice genetically modified with respect to G1 are profiled. Genes that are differentially regulated between the perturbed and unperturbed system are identified. Highlighted to the left is the portion of the transcriptional network that is observed to change when gene G1 is perturbed (knocked out, in this case). This perturbation signature is then compared with the eQTL signature defined in a. If expression traits controlled by the G1 locus are enriched for expression traits that are differentially regulated as described in b (blue nodes), then this matched pattern of expression provides direct experimental support that G1 is the gene underlying the linkage to the complex trait in the F2 population.


(Table 1). Similarly, both femoral bone mineral density and bone mineral content were higher in Alox5–/– mice than in wild-type mice (Table 1). For each of these traits in the Alox5–/– mice, the differences were consistent with the trends observed in the BXD cross, where F2 mice homozygous with respect to the DBA allele at the Alox5 locus had greater body fat, leptin levels, lipid levels and bone density than did B6 homozygotes (Table 1).

Pparg falls in a region that is IBD between the B6 and 129 strains of mice, where there is only a single informative SNP between these strains in the 130-kb region spanning the Pparg gene. Furthermore, between a full-length Pparg cDNA sequence derived from 129S1/SvImJ and the 129S1/SvImJ sequence data represented in the Celera Mouse Genome Database, the coding sequence of the liver-specific isoform of Pparg is completely supported by 129 sequence, and there are no sequence differences between this sequence and the B6 sequence (National Center for Biotechnology Information build 33 of the mouse genome). This observation is important, given that the Alox5–/– mouse retains a sizeable stretch of 129 sequence flanking the Alox5 gene, which includes Pparg. If Pparg activity varied between B6 and 129 and this variation was due to the cis effects of polymorphisms in Pparg, this could potentially explain part of the phenotypic variation noted in the Alox5–/– mice, given the role of Pparg in obesity and diabetes phenotypes.

Reconstructing gene networks

We recently developed a new gene network reconstruction algorithm that was applied to a set of 1,088 of the most transcriptionally active genes in the BXD liver data31. The resulting network could predict causal relationships among genes in the network31. Of the 104 genes in the Alox5–/– signature and linking to the Alox5 locus in the BXD cross, 31 intersected with the set of 1,088 genes used to reconstruct the BXD liver gene network, and 25 of these 31 genes were connected to many different nodes in the network (i.e., not singleton nodes or isolated clusters). Figure 4 highlights the portion of the BXD liver gene network31 that contains these 25 genes, in addition to genes in the network within a path length of 1 of any of these 25 genes. All genes in this subnetwork are within a path length of 4 of one another. Therefore, of the 1,088 genes comprising the BXD liver gene

[Figure 4 graphic: network diagram of the subnetwork, with nodes labeled by gene symbol and two highlighted groups, the Mt2 cluster and the Spp1 cluster.]

Figure 4 Genetic subnetwork of genes in the Alox5–/– signature and linked to the Alox5 locus in the BXD cross. Subnetwork derived from the BXD liver gene network previously described31. The light purple nodes represent genes that are in the Alox5–/– liver signature and linked to the Alox5 locus in the BXD cross. The white nodes represent genes in the BXD liver network31 that were within a path length of 1 of the light purple nodes. Two subclusters in the subnetwork containing the genes Spp1 and Mt2, which affect bone density and obesity traits, respectively, are highlighted.


network31, the set of 25 genes central to the Alox5–/– signature genes and linked to the Alox5 locus in the BXD cross falls in the same region of the network, and the path between any two of these genes consists of no more than four links.
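The two network operations used here, extracting everything within a path length of 1 of a seed set and checking the longest shortest path among the seeds, can be illustrated with a breadth-first search. This is a minimal sketch, not the reconstruction algorithm of ref. 31: it treats edges as undirected, and the node names and edges are hypothetical placeholders rather than genes from the BXD liver network.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest path length (number of links) from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def neighborhood_subnetwork(adjacency, seeds, radius=1):
    """Keep the seeds plus every node within `radius` links of any seed."""
    keep = set()
    for seed in seeds:
        keep.update(n for n, d in bfs_distances(adjacency, seed).items() if d <= radius)
    return {n: sorted(m for m in adjacency.get(n, ()) if m in keep) for n in sorted(keep)}


def max_path_length(adjacency, nodes):
    """Longest shortest path between any two of `nodes`; None if disconnected."""
    nodes = list(nodes)
    longest = 0
    for a in nodes:
        dist = bfs_distances(adjacency, a)
        for b in nodes:
            if b not in dist:
                return None
            longest = max(longest, dist[b])
    return longest


# Hypothetical undirected toy network: seeds s1..s3 joined through a, b, c;
# x and y form an isolated pair that the neighborhood step should drop.
toy = {
    "s1": ["a"], "a": ["s1", "s2"], "s2": ["a", "b"], "b": ["s2", "s3"],
    "s3": ["b", "c"], "c": ["s3"], "x": ["y"], "y": ["x"],
}
seeds = ["s1", "s2", "s3"]
sub = neighborhood_subnetwork(toy, seeds, radius=1)
longest = max_path_length(sub, seeds)
```

In the toy network the isolated pair is excluded from the subnetwork, and the longest shortest path between two seeds is four links.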

Several groupings of the genes in this portion of the network are highlighted in Figure 4. The first is a cluster of genes influenced by the gene osteopontin (Spp1). Spp1 knockout mice are resistant to ovariectomy-induced bone resorption, which is a model for postmenopausal osteoporosis32. The second gene cluster contains metallothionein 2 (Mt2) as a key component of the network substructure. Mt2-null mice are more sensitive to oxidative stress and have greater fat mass and plasma leptin levels than wild-type controls33. The Alox5–/– gene expression signature shows that Spp1 and Mt2 are both downregulated in Alox5–/– mice relative to wild-type controls. In agreement with these single-gene perturbation experiments, the eQTLs for Spp1 and Mt2 at the Alox5 locus in the BXD cross show downregulation of both genes in mice that are homozygous with respect to the DBA allele of Alox5 relative to mice that are homozygous with respect to the B6 allele.

These data provide direct experimental evidence that perturbations to Alox5 (i.e., those reducing 5-LO activity) lead to variations in other genes that have a causal role in the same metabolic phenotypes that are associated with reduced 5-LO activity, namely bone density and obesity-related measures. Furthermore, the pattern of transcriptional downregulation of these genes induced by reducing 5-LO activity is consistent with the phenotypes achieved in the corresponding knockout mice. Zfp90, a gene we recently identified as being involved in obesity26, is also represented in the network in Figure 4.

DISCUSSION

Using an integrative genomics approach involving genotypic, gene expression and clinical trait data in segregating mouse populations and genetically targeted mice, we show that 5-LO influences various metabolic parameters, such as adiposity, lipoprotein levels and bone density. Multiple lines of experimental evidence support these conclusions. First, QTLs for these traits map to Alox5 in a cross between B6 and DBA. Second, the Alox5 locus is enriched for eQTLs with corresponding expression levels that are correlated and enriched for pathways associated with the metabolic traits. Third, the DBA allele of the Alox5 gene carries a mutation that reduces enzymatic activity. Fourth, there is enrichment and overlap between the liver gene expression signature of Alox5–/– mice and those genes that have eQTLs over Alox5 in the BXD cross. Fifth, genes in the Alox5–/– liver signature that link to the Alox5 locus in the BXD cross were correlated and enriched for pathways associated with the metabolic traits. Sixth, Alox5–/– mice have the same differences in QTL traits as F2 mice homozygous with respect to the DBA allele of Alox5, which substantially decreases 5-LO levels and activity. Seventh, an independent 5-LO perturbation in human monocytes produced an expression signature that overlapped with the Alox5–/– liver gene expression signature and the set of genes linked to the chromosome 6 locus, indicating that changes in 5-LO activity and not allelic differences between B6 and 129 accounted for the overlap between the Alox5–/– liver gene expression signature and the set of genes linked to the chromosome 6 locus. Eighth, there was no comparable significant enrichment in the expression signature obtained by overexpression and pharmacological manipulation of Pparg, a favored positional candidate gene that influences metabolic processes including adiposity, insulin resistance and bone density34. Finally, genetic networks constructed from the genes in the Alox5 signature that also link to the Alox5 locus in the BXD cross involve those known to affect the metabolic traits of interest. Taken together, these data provide compelling evidence that Alox5 and its role in inflammatory processes influence a variety of physiological pathways related to human metabolic disorders.

Previous studies used microarray approaches in mice and humans to identify gene expression patterns that redefine a complex disease trait, to identify subtypes of a given disease, to elucidate the complex genetic networks of causal and reactive expression changes and to enhance the ability to identify the key drivers of disease9,31,35,36. In this study, we proposed an important variation to this approach. Given that patterns of expression underlie biological processes associated with complex traits13–17,37, we propose mapping susceptibility genes by comparing the perturbed transcriptional network in segregating populations with that in single-gene perturbation experiments of positional candidate genes (Fig. 3). This method addresses the weaknesses of the forward and reverse genetic approaches by leveraging their complementary strengths. Specifically, the segregating population is used to study the complex traits that manifest themselves in this setting and that are associated with complex networks of naturally occurring variation. Single-gene perturbation experiments are used to define the effects on the transcriptional network, where achieving physiological relevance is not required, but instead the perturbation signature is mapped back into the segregating population and interpreted in the context where the complex traits had initially been observed.

Our results provide an example for this paradigm by identifying Alox5 as controlling various metabolic phenotypes. The most plausible explanation for this conclusion is that variations in Alox5 in the BXD cross give rise to transcriptional network perturbations similar to those associated with 5-LO deficiency. Because strain DBA has an amino acid substitution in Alox5 that decreases enzyme levels and activity, the results of the liver microarray experiments in the Alox5–/– and BXD mice are entirely consistent. Therefore, even without directly observing changes in 5-LO expression in the BXD cross (a valid probe for Alox5 was not present on the microarray), the expression signature that results from perturbations to this gene is enough to map Alox5 as at least one of the genes underlying the expression signatures and clinical traits linked to the Alox5 locus. Supplementary Figure 4 shows that Alox5 is expressed in a number of tissues and is expressed in liver and adipose tissues in the B6 strain (similar expression patterns were observed in DBA; data not shown). At the same time, we were able to exclude Pparg as a strong causal candidate in this particular cross, given that this gene is IBD between B6 and DBA and that multiple, independent direct perturbations of this gene did not result in an expression response that overlapped with the set of genes controlled by the Alox5 locus in the BXD cross. Although the gold standard for mapping genes for QTLs remains the construction of genetically modified animals carrying the strain-specific allele associated with the trait of interest, this time-consuming method can potentially be circumvented by the more rapid matching of perturbation signatures with eQTL or cQTL hot spots. Furthermore, the construction of BAC transgenics or other modified animals carrying the strain-specific allele causing the clinical traits is still not definitive, given that the strain-specific allele may cause regulatory changes in flanking genes as opposed to changes in the gene in which the mutation occurs. With the method presented here, the connection to the putative causal gene is potentially stronger, given that the connection is based on hundreds or even thousands of molecular traits that underlie the clinical trait of interest and that vary with perturbations to the putative causal gene.

We recently identified C3ar1 as a causal gene for OFM in the BXD cross used here to map Alox5 (ref. 26). Using the approach described herein, however, C3ar1 was not identified as a fat mass susceptibility gene in the chromosome 6 linkage region for several reasons. First, as with Pparg, C3ar1 falls in a region that is IBD between the B6 and DBA strains, with no polymorphic SNPs identified in the Celera Mouse Genome Database between these two strains. In addition, a genomic sequence derived from 129/SvJ containing the entire C3ar1 coding sequence matched the B6 genomic sequence 100%. Given that 129/SvJ is predicted to carry the same haplotype as B6 and DBA in this region, we can conclude there are no sequence variants in this gene between B6 and DBA. Second, C3ar1 is slightly more than 7 Mb distal to Alox5 and is not supported in the 99% confidence interval for the chromosome 6 fat mass linkage. Finally, we detected C3ar1 as causal in the BXD cross because the liver expression of this gene was controlled by multiple genetic loci (trans-acting eQTL), where DNA variants in other genes lead to changes in expression of C3ar1, and where we predicted and later validated that changes in C3ar1 expression lead to changes in fat mass.

The power of this approach to prioritize and identify QTL genes in a more direct fashion can be extrapolated to other complex diseases. The data and methods presented here also suggest that screening of positional candidate genes using more accessible cell-based systems, such as RNA interference, may lead to accelerated identification of QTL genes. While investigators might still need to construct a congenic, knockout or transgenic animal to validate the phenotypic association directly, this approach could greatly accelerate identification of the primary drivers of disease beyond what has been achieved to date. In addition, these data fit very naturally into methods currently under development that integrate genotypic, expression and clinical data to reconstruct more reliable genetic networks underlying complex disease traits26,31.

METHODS

Animal models. We bred Alox5–/– mice on a B6 background in-house from homozygous parental breeders, which were backcrossed to B6 for more than ten generations. We either bred control B6 mice in-house or purchased them from the Jackson Laboratories. We housed mice four to a cage at 25 °C on a 10-h dark/14-h light cycle and maintained them on either a chow diet (Purina) or an HFC diet containing 15% fat, 1.25% cholesterol and 0.5% cholic acid (Harlan-Teklad). The Alox5–/– mice used in the experiments were of both sexes and age-matched between 4 and 7 months of age.

We housed Ppara–/– mice on a B6 background and wild-type B6 mice used in the rosiglitazone treatment and DIO experiments in microisolator cages (Labproducts) on a 12-h light/12-h dark cycle. To establish an obesity expression signature in liver, we housed twenty 8-week-old male B6 mice individually and divided them into lean and DIO groups. Lean mice were fed regular mouse chow (13% kcal from fat, 3.41 kcal g⁻¹; Harlan Teklad). We maintained DIO mice for 6 weeks on a high-fat diet (59.4% kcal from fat, 24.5% kcal from carbohydrate, 16.2% kcal from protein, 5.29 kcal g⁻¹; Bio-Serv). We collected liver tissues for gene expression profiling after the 6-week high-fat feeding cycle.

The BXD F2 mouse population and associated liver gene expression data used in this study were previously described9,18. An F2 population consisting of 111 mice was constructed from two inbred strains of mice, B6 and DBA. Only female mice were maintained in this population. Mice were on a rodent chow diet up to 12 months of age and then switched to an atherogenic high-fat, high-cholesterol diet for another 4 months. At 16 months of age, the mice were killed and their livers extracted for gene expression profiling. The mice were genotyped for 139 microsatellite markers uniformly distributed over the mouse genome to allow for the genetic mapping of the gene expression and disease traits.

All procedures were done in accordance with the current National Research Council Guide for the Care and Use of Laboratory Animals and were approved by the University of California Los Angeles Animal Research Committee (for the BXD and Alox5–/– mice) and by the Merck Research Laboratories-Rahway Institutional Animal Care and Use Committee (for the rosiglitazone-treated and DIO mice).

Treatment of mice with rosiglitazone. We treated male B6 mice (n = 9 per treatment group, 9–11 weeks old) daily with either rosiglitazone (100 mg per kg body weight) or vehicle (0.25% methylcellulose) by oral gavage for 7 d. We fed mice standard rodent chow (Harlan Teklad) for at least 1 week before study initiation and weighed them daily during treatment. Mice were killed 6 h after the last treatment, and livers were removed for RNA isolation and microarray analysis. We used the same experimental paradigm for male Ppara–/– mice (n = 4 per treatment group) treated with either rosiglitazone (100 mg per kg body weight) or vehicle (0.25% methylcellulose) by oral gavage for 7 d.

Probe selection for mouse gene expression arrays. The mouse microarray used for the BXD cross was previously described9. The mouse microarray used here is an updated version, containing 23,574 noncontrol oligonucleotide probes for mouse genes and 2,186 control oligos. We extracted full-length mouse sequences from Unigene clusters (build 168, February 2004) and combined them with RefSeq mouse sequences from Release 3 (January 2004) and RIKEN full-length sequences (version fantom1.01). We clustered this collection of full-length sequences and selected one representative sequence per cluster. To complete the array, we selected 3′ expressed-sequence tags from Unigene clusters that did not cluster with any full-length sequence from Unigene, RefSeq or RIKEN. To select a probe for each gene sequence, we used a series of filtering steps, taking into account repeat sequences, binding energies, base composition, distance from the 3′ end, sequence complexity and potential cross-hybridization interactions38. For each gene, every potential 60-bp sequence was examined and the 60-bp oligonucleotide that best satisfied the criteria was printed on the microarray. All microarrays used in this study were manufactured by Agilent Technologies, Inc.
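The idea of examining every candidate 60-bp window and keeping the best one can be sketched as follows. This is a toy illustration only: the actual filters and weights are those of ref. 38 and are not given here, so the scoring function (GC fraction near 0.5, proximity to the 3′ end, a crude homopolymer filter), the `max_run` cutoff and all function names are hypothetical, and binding energy, repeat masking and cross-hybridization are not modeled.

```python
PROBE_LEN = 60  # oligonucleotide length stated in the text


def gc_fraction(seq):
    """Fraction of G and C bases in `seq`."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def pick_probe(transcript, probe_len=PROBE_LEN, max_run=6):
    """Return (start, probe) for the best-scoring window, or None.

    Hypothetical score (lower is better): deviation of GC fraction from
    0.5 plus the relative distance of the window from the 3' end.
    """
    best = None
    for start in range(len(transcript) - probe_len + 1):
        window = transcript[start:start + probe_len]
        if longest_homopolymer(window) > max_run:
            continue  # crude stand-in for a sequence-complexity filter
        dist_3prime = len(transcript) - (start + probe_len)
        score = abs(gc_fraction(window) - 0.5) + dist_3prime / len(transcript)
        if best is None or score < best[0]:
            best = (score, start, window)
    return None if best is None else (best[1], best[2])


# Hypothetical 120-nt sequence with uniform base composition.
choice = pick_probe("ACGT" * 30)
```

With uniform base composition, the toy score is driven entirely by distance from the 3′ end, so the selected window is the last 60 bases.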

Preparation of labeled cDNA and hybridizations to microarrays. After killing mice, we removed livers from Alox5–/–, rosiglitazone-treated B6, control and DIO mice, immediately flash-froze them in liquid nitrogen and stored them at –80 °C. We purified total RNA from 25-mg portions using an RNeasy Mini kit in accordance with the manufacturer's instructions (Qiagen). We prepared liver cDNA in the same fashion as for the F2 mice in the BXD cross, as described previously9. We hybridized fluorescently labeled cRNA (5 µg) from each Alox5–/–, rosiglitazone-treated B6, control and DIO mouse against different pools of RNAs. There were RNA pools specific to each experiment set (the Alox5–/– set, the wild-type rosiglitazone treatment set, the Ppara–/– rosiglitazone treatment set and the DIO set) constructed from equal aliquots of RNA from the control mice in each experiment set. We constructed five fluor-reverse pairs for the Alox5–/– experiment set, with RNA from five individual Alox5–/– mice hybridized against the pool of five wild-type mice. We constructed three fluor-reverse pairs for the wild-type rosiglitazone experiment set, with RNA from three pools of three rosiglitazone-treated mice hybridized against the pool of RNA from nine vehicle-treated mice. We constructed four fluor-reverse pairs for the Ppara–/– rosiglitazone-treated set, with RNA from four Ppara–/– rosiglitazone-treated mice hybridized against the pool of RNA from four vehicle-treated Ppara–/– mice. We constructed ten fluor-reverse pairs for the DIO set, with RNA from ten B6 mice on a high-fat diet competitively hybridized against a pool of RNA from ten B6 mice on a chow diet.

We removed livers from the BXD mice for expression profiling and treated them as described above. RNAs from each BXD mouse were hybridized against a pool of RNAs constructed from equal aliquots of RNA from each mouse.

Analysis of expression data. We processed array images as previously described to obtain background noise, single-channel intensity and associated measurement error estimates9. We quantified expression changes between two samples as log10 (expression ratio), where the expression ratio was taken to be the ratio between normalized, background-corrected intensity values for the two channels (red and green) for each spot on the array. We applied an error model for the log ratio as previously described to quantify the significance of expression changes between two samples39.
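The log10 expression ratio for a single spot can be written out directly. The sketch below is illustrative and does not reproduce the normalization or the error model of ref. 39; the `floor` argument (used to avoid taking the logarithm of a non-positive value) and the simple dye-swap averaging are assumptions, and the function names are hypothetical.

```python
import math


def log10_ratio(red, green, red_bg=0.0, green_bg=0.0, floor=1.0):
    """log10 of the background-corrected red/green intensity ratio for one spot.

    `floor` is an assumed lower bound on corrected intensities so that the
    logarithm is always defined.
    """
    red_corrected = max(red - red_bg, floor)
    green_corrected = max(green - green_bg, floor)
    return math.log10(red_corrected / green_corrected)


def fluor_reverse_average(forward, reverse):
    """Combine a fluor-reverse (dye-swap) pair of log ratios.

    The reversed array measures the inverse ratio, so its sign is flipped
    before averaging (an assumed, simplified combination rule).
    """
    return (forward - reverse) / 2.0


# Hypothetical spot: tenfold higher signal in the red channel.
forward = log10_ratio(1100.0, 200.0, red_bg=100.0, green_bg=100.0)
reverse = log10_ratio(200.0, 1100.0, red_bg=100.0, green_bg=100.0)
combined = fluor_reverse_average(forward, reverse)
```

Averaging the dye-swapped pair in this way cancels dye-specific bias that affects both arrays equally, which is the usual motivation for fluor-reverse designs.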

Plasma measurements. Mice were fasted overnight and bled retro-orbitally 2–3 h into the light cycle under isoflurane anesthesia. We carried out enzymatic assays for total cholesterol, HDL cholesterol and triglycerides as described previously40 and analyzed leptin levels using a murine leptin ELISA kit (R&D Systems). All measurements were done in duplicate or triplicate and in accordance with the manufacturers' instructions.

Body composition. We determined whole body fat, fluids and lean tissue mass of isoflurane-anesthetized mice using a Bruker Optics Minispec NMR analyzer in accordance with the manufacturer's recommendations. After killing the mice, we dissected out individual fat depots (retroperitoneal, epididymal, subcutaneous and omental) and weighed them separately.

Food intake. We caged Alox5–/– and wild-type B6 mice individually in minimal bedding and fed them 6 g of chow per day. Every 24 h for 6 d, we carefully removed and weighed the remaining uneaten food to determine differences in food intake.

Bone density. For the Alox5–/– and wild-type controls, we determined bone density of individual mouse femurs (right side) by a peripheral quantitative computed tomographic small bone scanner, using a STRATEC XCT 960M unit (Norland Medical Instruments). Femurs were scanned full-length at 2-mm intervals with a resolution of 0.100 mm per voxel, yielding eight 1-mm-thick cross-sections representing eight axial levels of the femur. We selected the center-most scan (based on image morphology) or the mean of two scans sharing the center position for data analyses41. For BXD mice, we determined bone density measures as previously described18.

Statistical analyses. We carried out QTL analysis of clinical and expression data in the BXD cross as described previously9. We predicted genotypes in BXD F2 mice at the Alox5 locus by imputing the genotype probability distribution using markers flanking this locus and selecting the genotype that provided the largest contribution to the lod score at the Alox5 location for the OFM trait. We determined differences in measured variables between F2 mice predicted to be homozygous with respect to the DBA allele and F2 mice predicted to be homozygous with respect to the B6 allele using a standard t-test (SPLUS 6.1). Similarly, we determined differences in measured variables between Alox5–/– and control mice, between B6 wild-type rosiglitazone-treated mice and controls, between B6 Ppara–/– rosiglitazone-treated mice and controls, and between B6 mice on a high-fat diet and B6 mice on a chow diet using a standard t-test (SPLUS 6.1).
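A two-group comparison of this kind can be sketched with the Python standard library. The example assumes that "standard t-test" means the pooled-variance (Student) form, which is not stated in the text, and it returns only the t statistic and degrees of freedom rather than a P value. The genotype labels and trait values are hypothetical and do not come from the BXD data.

```python
from math import sqrt
from statistics import mean, variance


def students_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical records: (predicted genotype at the locus, trait value).
# "DD" = homozygous DBA, "BB" = homozygous B6, "BD" = heterozygous.
records = [("DD", 3.0), ("BB", 1.0), ("BD", 2.5), ("DD", 4.0),
           ("BB", 2.0), ("DD", 5.0), ("BB", 3.0)]

dba_hom = [value for genotype, value in records if genotype == "DD"]
b6_hom = [value for genotype, value in records if genotype == "BB"]
t_stat, dof = students_t(dba_hom, b6_hom)
```

Heterozygotes are dropped before the comparison, mirroring the contrast between the two homozygous classes described in the text.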

Accession codes. GenBank: adipose-specific isoform of Pparg derived from DBA, AY208184; full-length Pparg cDNA sequence derived from 129S1/SvImJ, AY243585; genomic sequence derived from 129/SvJ containing the entire C3ar1 coding sequence, U77461. GEO: probe-related data and gene expression data, GSE2008.

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTS

We thank the Rosetta Gene Expression Lab for microarray work; J. Berger, K. Wong, J. Thompson, E. Tan and E. Muise for sharing the Pparg expression data; J.G. Menke for sharing the LTB4 data; and J. Zhu for discussion on network analysis. This work was supported in part by grants from the US National Institutes of Health (A.J.L.).

COMPETING INTERESTS STATEMENT

The authors declare that they have no competing financial interests.

Published online at http://www.nature.com/naturegenetics/

Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/

1. McClung, J.P. et al. Development of insulin resistance and obesity in mice over-expressing cellular glutathione peroxidase. Proc. Natl. Acad. Sci. USA 101, 8852–8857 (2004).

2. Brem, R.B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic dissection of transcriptionalregulation in budding yeast. Science 296, 752–755 (2002).

3. Grupe, A. et al. In silico mapping of complex disease-related traits in mice. Science292, 1915–1918 (2001).

4. Karp, C.L. et al. Identification of complement factor 5 as a susceptibility locus forexperimental allergic asthma. Nat. Immunol. 1, 221–226 (2000).

5. Klose, J. et al. Genetic analysis of the mouse brain proteome. Nat. Genet. 30, 385–393 (2002).

6. Liao, G. et al. In silico genetics: identification of a functional element regulating H2-Ealpha gene expression. Science 306, 690–695 (2004).

7. Monks, S.A. et al. Genetic inheritance of gene expression in human cell lines. Am. J. Hum. Genet. 75, 1094–1105 (2004).

8. Morley, M. et al. Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743–747 (2004).

9. Schadt, E.E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).

10. Bystrykh, L. et al. Uncovering regulatory pathways that affect hematopoietic stem cell function using ‘genetical genomics’. Nat. Genet. 37, 225–232 (2005).

11. Chesler, E.J. et al. Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat. Genet. 37, 233–242 (2005).

12. Hubner, N. et al. Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat. Genet. 37, 243–253 (2005).

13. Chu, S. et al. The transcriptional program of sporulation in budding yeast. Science 282, 699–705 (1998).

14. Eisen, M.B., Spellman, P.T., Brown, P.O. & Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95, 14863–14868 (1998).

15. Hughes, T.R. et al. Functional discovery via a compendium of expression profiles. Cell 102, 109–126 (2000).

16. Mootha, V.K. et al. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267–273 (2003).

17. Toth, A. et al. Functional genomics identifies monopolin: a kinetochore protein required for segregation of homologs during meiosis I. Cell 103, 1155–1168 (2000).

18. Drake, T.A. et al. Genetic loci determining bone density in mice with diet-induced atherosclerosis. Physiol. Genomics 5, 205–215 (2001).

19. Villa-Colinayo, V., Shi, W., Araujo, J. & Lusis, A.J. Genetics of atherosclerosis: the search for genes acting at the level of the vessel wall. Curr. Atheroscler. Rep. 2, 380–389 (2000).

20. Darvasi, A. Experimental strategies for the genetic dissection of complex traits in animal models. Nat. Genet. 18, 19–24 (1998).

21. Hubbard, T. et al. The Ensembl genome database project. Nucleic Acids Res. 30, 38–41 (2002).

22. Mural, R.J. et al. A comparison of whole-genome shotgun-derived mouse chromosome 16 and the human genome. Science 296, 1661–1671 (2002).

23. Waterston, R.H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).

24. Doss, S., Schadt, E.E., Drake, T.A. & Lusis, A.J. Cis-acting expression quantitative trait loci in mice. Genome Res. 15, 681–691 (2005).

25. Klein, R.F. et al. Regulation of bone mass in mice by the lipoxygenase gene Alox15. Science 303, 229–232 (2004).

26. Schadt, E.E. et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nat. Genet. 37, 710–717 (2005).

27. Mehrabian, M. et al. Identification of 5-lipoxygenase as a major gene contributing to atherosclerosis susceptibility in mice. Circ. Res. 91, 120–126 (2002).

28. Kuhn, H., Anton, M., Gerth, C. & Habenicht, A. Amino acid differences in the deduced 5-lipoxygenase sequence of CAST atherosclerosis-resistance mice confer impaired activity when introduced into the human ortholog. Arterioscler. Thromb. Vasc. Biol. 23, 1072–1076 (2003).

29. Mehrabian, M. & Allayee, H. 5-lipoxygenase and atherosclerosis. Curr. Opin. Lipidol. 14, 447–457 (2003).

30. Yu, S. et al. Adipocyte-specific gene expression and adipogenic steatosis in the mouse liver due to peroxisome proliferator-activated receptor gamma1 (PPARgamma1) overexpression. J. Biol. Chem. 278, 498–505 (2003).

31. Zhu, J. et al. An integrative genomics approach to the reconstruction of gene networks in segregating populations. Cytogenet. Genome Res. 105, 363–374 (2004).

32. Yoshitake, H., Rittling, S.R., Denhardt, D.T. & Noda, M. Osteopontin-deficient mice are resistant to ovariectomy-induced bone resorption. Proc. Natl. Acad. Sci. USA 96, 8156–8160 (1999).

33. Beattie, J.H. et al. Obesity and hyperleptinemia in metallothionein (-I and -II) null mice. Proc. Natl. Acad. Sci. USA 95, 358–363 (1998).

34. Pei, L. & Tontonoz, P. Fat’s loss is bone’s gain. J. Clin. Invest. 113, 805–806 (2004).

35. Golub, T.R. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537 (1999).

36. Ramaswamy, S., Ross, K.N., Lander, E.S. & Golub, T.R. A molecular signature of metastasis in primary solid tumors. Nat. Genet. 33, 49–54 (2003).

37. Zhang, W. et al. The functional landscape of mouse gene expression. J. Biol. 3, 21 (2004).

38. Hughes, T.R. et al. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 19, 342–347 (2001).

39. He, Y.D. et al. Microarray standard data set and figures of merit for comparing data processing methods and experiment designs. Bioinformatics 19, 956–965 (2003).

40. Mehrabian, M. et al. Genetic locus in mice that blocks development of atherosclerosis despite extreme hyperlipidemia. Circ. Res. 89, 125–130 (2001).

41. Parhami, F. et al. Atherogenic high-fat diet reduces bone mineralization in mice. J. Bone Miner. Res. 16, 182–188 (2001).

NATURE GENETICS VOLUME 37 | NUMBER 11 | NOVEMBER 2005 1233