Genetic mapping of allometric scaling laws

10
Genetic mapping of allometric scaling laws FEI LONG 1 , YING QING CHEN 2 , JAMES M. CHEVERUD 3 AND RONGLING WU 1 * 1 Department of Statistics, University of Florida, Gainesville, FL 32611, USA 2 Program in Biostatistics, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA 3 Department of Anatomy and Neurobiology, Washington University School of Medicine, St Louis, MO 631110, USA (Received 26 July 2005 and in revised form 4 December 2005 ) Summary Many biological processes, from cellular metabolism to population dynamics, are characterized by particular allometric scaling relationships between rate and size (power laws). A statistical model for mapping specific quantitative trait loci (QTLs) that are responsible for allometric scaling laws has been developed. We present an improved model for allometric mapping of QTLs based on a more general allometry equation. This improved model includes two steps : (1) use model II regression analysis to estimate the parameters underlying universal allometric scaling laws, and (2) substitute the estimated allometric parameters in the mixture-based mapping model to obtain the estimation of QTL position and effects. This model has been validated by a real example for a mouse F 2 progeny, in which two QTLs were detected on different chromosomes that determine the allometric relationship between growth rate and body weight. 1. Introduction Among a vast range of species from microorganisms to the largest mammals, many biological variables seem to bear a specific quarter-power scaling re- lationship to overall body size (Fig. 1A). For ex- ample, various biological times (e.g. lifespan and the time between heartbeats) scale with body mass to the 1/4 power, and resting metabolic rate scales with body mass to the 3/4 power (McMahon, 1973 ; Calder III, 1984; Schmidt-Nielsen, 1984; Enquist et al., 1998, 1999; Brown & West, 2000). Several at- tempts have been made recently to derive such general allometric scaling laws based on maximum efficiency (West et al., 1997, 1999 a ; Banavar et al., 1999), which has been regarded as the fundamental design principle for biological systems. Andresen et al. (2002) argued that maximum efficiency built on evolutionary (Bonner & Horn, 2000) as well as thermodynamic grounds (Dewey & Donne, 1998) may suffer from in- ternal inconsistencies when it is used to explain scaling laws. Allometric scaling laws can be described math- ematically by a power function B=aM b , where B is a biological variable, M is the body weight, a is the constant and b is the scaling exponent. Allometric relationships result from the regulation of scale and proportion in living organisms and are thought to have a genetic component (Wu et al., 2003). Wu et al. (2002) presented a basic statistical framework for mapping quantitative trait loci (QTLs) responsible for universal quarter-power scaling laws of structure and function with the entire body size. A key issue for QTL mapping of allometry is how to explain the genetic value of a biological variable in terms of that of body weight under the scaling relationship. Wu et al.’s model takes advantage of a linear relationship between B and M after the power function is log- transformed and, therefore, is statistically straight- forward to derive. When more complicated power functions, for which the log-transformation cannot lead to a linear relationship, are needed to describe allometric scaling laws, a different model based on Taylor’s approximation is developed (Ma et al., 2003). Although Taylor’s approximation can establish a general model for allometric mapping, its actual * Corresponding author. Tel: +1 (352) 3923806. Fax: +1 (352) 3928555. e-mail : [email protected]fl.edu Genet. Res., Camb. (2006), 87, pp. 207–216. f 2006 Cambridge University Press 207 doi:10.1017/S0016672306008172 Printed in the United Kingdom

Transcript of Genetic mapping of allometric scaling laws

Genetic mapping of allometric scaling laws

FEI LONG1, YING QING CHEN2, JAMES M. CHEVERUD3AND RONGLING WU1*

1Department of Statistics, University of Florida, Gainesville, FL 32611, USA2Program in Biostatistics, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA3Department of Anatomy and Neurobiology, Washington University School of Medicine, St Louis, MO 631110, USA

(Received 26 July 2005 and in revised form 4 December 2005 )

Summary

Many biological processes, from cellular metabolism to population dynamics, are characterized byparticular allometric scaling relationships between rate and size (power laws). A statistical model formapping specific quantitative trait loci (QTLs) that are responsible for allometric scaling laws hasbeen developed. We present an improved model for allometric mapping of QTLs based on a moregeneral allometry equation. This improved model includes two steps: (1) use model II regressionanalysis to estimate the parameters underlying universal allometric scaling laws, and (2) substitutethe estimated allometric parameters in the mixture-based mapping model to obtain the estimationof QTL position and effects. This model has been validated by a real example for a mouse F2

progeny, in which two QTLs were detected on different chromosomes that determine the allometricrelationship between growth rate and body weight.

1. Introduction

Among a vast range of species from microorganismsto the largest mammals, many biological variablesseem to bear a specific quarter-power scaling re-lationship to overall body size (Fig. 1A). For ex-ample, various biological times (e.g. lifespan and thetime between heartbeats) scale with body mass tothe 1/4 power, and resting metabolic rate scales withbody mass to the 3/4 power (McMahon, 1973;Calder III, 1984; Schmidt-Nielsen, 1984; Enquistet al., 1998, 1999; Brown & West, 2000). Several at-tempts have been made recently to derive such generalallometric scaling laws based on maximum efficiency(West et al., 1997, 1999a ; Banavar et al., 1999), whichhas been regarded as the fundamental design principlefor biological systems. Andresen et al. (2002) arguedthat maximum efficiency built on evolutionary(Bonner & Horn, 2000) as well as thermodynamicgrounds (Dewey & Donne, 1998) may suffer from in-ternal inconsistencies when it is used to explain scalinglaws.

Allometric scaling laws can be described math-ematically by a power function B=aMb, where B isa biological variable, M is the body weight, a is theconstant and b is the scaling exponent. Allometricrelationships result from the regulation of scale andproportion in living organisms and are thought tohave a genetic component (Wu et al., 2003). Wu et al.(2002) presented a basic statistical framework formapping quantitative trait loci (QTLs) responsible foruniversal quarter-power scaling laws of structure andfunction with the entire body size. A key issue forQTL mapping of allometry is how to explain thegenetic value of a biological variable in terms of thatof body weight under the scaling relationship. Wuet al.’s model takes advantage of a linear relationshipbetween B and M after the power function is log-transformed and, therefore, is statistically straight-forward to derive. When more complicated powerfunctions, for which the log-transformation cannotlead to a linear relationship, are needed to describeallometric scaling laws, a different model based onTaylor’s approximation is developed (Ma et al.,2003).

Although Taylor’s approximation can establisha general model for allometric mapping, its actual

* Corresponding author. Tel: +1 (352) 3923806. Fax:+1 (352) 3928555. e-mail : [email protected]

Genet. Res., Camb. (2006), 87, pp. 207–216. f 2006 Cambridge University Press 207doi:10.1017/S0016672306008172 Printed in the United Kingdom

application presents a significant problem. The ex-pression of the B mean as a function of the Mmean depends on the order with which Taylor’sseries is expanded. Although the expansions ofhigher order can theoretically result in a more precisedescription of the B–M relationship than those oflower order, the former tend to be computationallymore expensive than the latter. Because the pre-cision of the model and computational efficiency areequally important, it would be difficult to make acompromise between these two aspects for a practicalproblem.

In this article, we develop an ad-hoc model forgenetic mapping of general allometric scaling laws,

aimed at the simultaneous improvement of the pre-cision of the model and computational efficiency. Ourmodel includes two different steps. First, model IIregression analysis based on a loss function is used tofit observed bivariate data using a known allometricequation (Ebert & Russell, 1994). According to allo-metric scaling laws, differences among differentorganisms can be described by the same power func-tion. Thus, it is reasonable to assume that differentgenotypes at an underlying QTL share a commonpower function which can be described by the sameset of parameters. Second, the estimated parametersfor the power function are substituted in a QTLmapping framework built on the finite mixture model.Instead of the estimation of all parameters, we needto estimate only the parameters describing the QTLeffects on the dependent variable (i.e. body weight)and QTL position. A reduced number of modelparameters being estimated increases the precision ofparameter estimation. We derived the observed in-formation matrix to investigate the precision of ourstatistical model. A worked example for mouse bodyweight growth illustrates the usefulness of model. Inthis example, we have identified two QTLs on differ-ent mouse chromosomes that regulate the allometricscaling relationship between growth rate and bodyweight.

2. Linear model for allometry mapping

Assume that our mapping population is an F2 pro-geny of size N founded by two inbred lines. In this F2

population, we measure one biological variable, B,and body weight, M, for each individual and con-struct a genetic map based on polymorphic markers.Suppose these two measured traits B and M are re-lated by a power equation,

B=aMb: (1)

The estimation of the constant parameter a and ex-ponential power b in this equation is obtained by alinear regression following the log-transformation, i.e.

ln(B)=a+bln(M)

or

y=a+bx, (2)

where y=ln(B) and x=ln(M). An interval mappingapproach has been developed to locate QTLs thataffect allometric scaling laws based on this log-transformed linear regression function (Wu et al.,2002). We will modify this approach by making ap-propriate transformations to minimize the number ofunknown parameters being estimated. An in Wu et al.(2002), two common mechanisms for trait corre-lations, pleiotropic and linkage, will be considered.

A

B

0 2 4 6

Body mass

8 10 12 140

1

2

3

4

5

6

7

8

9

10

11

Tot

al m

etab

olic

rat

e

100

Mouse

Dove

Chicken Cat

GoatDogMen

10

1

0·1

0·01 0·1 1 10Body mass (kg)

Tot

al m

etab

olic

rat

e (w

alts

)

100 1000 10,000

1000

Woman

CowHorse

Elephant

Fig. 1. Allometric scaling laws across different species(A) (downloaded from http://biology.unm.edu/jhbrown/Research/Scaling/Scaling.htm) and within species (B).Three different genotypes in (B) are assumed, coded byr, # and +. The log-transformed genotypic values forthe three genotypes are distributed along a straight line.If the coordinates of the three genotypic means do notoverlap, this means that there exists a gene to affect theallometric law.

F. Long et al. 208

(i) Pleiotropic model

The pleiotropic model proposes that a common QTLaffects variation in both log-transformed traits x andy. At a putative QTL, the F2 population can be sortedinto three genotype groups, QQ, Qq and qq, coded by2, 1 and 0, respectively. These three groups form threeclusters in the coordinate of x and y. If the allometricchange of trait y with respect to trait x, as describedby equation (1), results from the pleiotropic effect ofthis QTL, then the three points, each representing apair of the expected mean values of the two traits inone genotype group, should be significantly differentfrom each other but should be on the same line de-scribed by the log-transformed allometry equation(2) (Fig. 1B). With such a linear relationship, theexpected mean value of the transformed trait y can bepredicted exactly from the mean value of the trans-formed trait x.

The differences in x or y among the three QTL-genotypic means reflect the magnitude of the geneticeffects of the QTL on the corresponding trait. Denotea and d as the additive and dominant effects of theQTL on x. Thus, the phenotypic values of the two log-transformed traits for individual i are expressed bylinear statistical models,

xi=m+jia+fid+exi ,

yi=a+bxi+eyi=a+b( m+jia+fid )+bexi +e y

i ,

where m is the overall mean for x ; ji and fi arethe dummy variables indicating the QTL genotype ofindividual i, with ji denoted as 1 for QQ, 0 for Qqand x1 for qq and fi denoted as 1 for Qq and 0 forQQ or qq ; ei

xyN(0, sx2 ) and ei

yyN(0, sy2) are the error

terms for traits x and y, respectively, which arecorrelated among individuals with correlation coef-ficients R. Because the genetic effects are fixed effects,the variances for traits x and y and their correlationcan be expressed, respectively, as

v2x=s2x,

v2y=b2s2x+s2

y,

r=bsx+Rsyffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffib2s2

x+s2y

q :

Let mjx and mj

y be the mean values of x and y for QTLgenotypes j ( j=2, 1, 0), respectively. According toequation (2), the relationship between the means ofthe two transformed traits can be modelled by

myj=a+bmx

j , (3)

for QTL genotype j.Suppose that the QTL is bracketed by two flanking

markers M1 and M2 with recombination frequency of

r. The recombination frequencies between M1 and theQTL and between the QTL and M2 are r1 and r2, re-spectively. The QTL position can be specified using r1or r2. The conditional probability of a QTL genotypegiven each of the nine two-marker genotypes can bederived and expressed as a function of r, r1 and r2. Weuse pj|i to denote such a conditional probability forindividual i to carry QTL genotype j.

A conventional allometry mapping model, ad-vocated by Wu et al. (2002), estimates allometrycoefficients, QTL effects, QTL position and residual(co)variance, arrayed by a unknown vector V=( m, a,d, a, b, r1, sx

2 , sy2, R)T. The likelihood of the sample of

bi-variate measurements can be expressed by a mix-ture model as

L(Vjx, y)=Yni=1

g2

j=0pjji fj (xi, yi)� �

(4)

where the two-dimensional normal density, fj (xi, yi),is expressed as

fj(xi, yi)=1

2pvxvyffiffiffiffiffiffiffiffiffiffiffiffi1xr2

p exp

(x

1

2(1xr2)

rxixmx

j

vx

� �2

x2r(xixmx

j )(yixm yj )

vxvy+

yixm yj

vy

� �2" #)

:

The maximum likelihood estimators (MLEs) of theunknown vector V can be obtained by differentiatingthe log-likelihood function (equation (4)) with respectto each unknown, setting the derivatives equal to zeroand solving the log-likelihood equation. By definingYjji

=pjji fj (xi, yi)

g2j0=0 pij k fj k(xi, yi)

� � , (5)

which could be thought of as a posterior probabilitythat individual i has QTL genotype j, the EM algor-ithm is implemented to solve the likelihood function.The posterior probabilities,

Qjji, are calculated for

each individual and each QTL genotype in the E stepand they are then used to obtain the MLEs of eachparameter contained in V in the M step which arederived from the log-likelihood equations. For de-tailed iterative EM steps, refer to Wu et al. (2002).

(ii) Linked QTL model

The allometric scaling of organisms may also be af-fected by two QTLs that are genetically linked on thesame chromosome, one exerting an effect on trait xand the other on trait y. Under such a linked-QTLmodel, two putative QTLs may be located withinthe same marker interval or in different marker in-tervals. We consider a special linkage model in whicheach of the two linked QTL affects a different traitx and y.

Genetic mapping of allometric scaling laws 209

Consider two linked QTLs of which one (P) affectstrait x and the other (Q) affects trait y. Let j1 denote aQTL genotype PP, Pp and pp, coded as 2, 1 and 0,respectively. Similarly, we use j2 to denote QTL geno-types QQ, Qq and qq, coded as 2, 1 and 0. If the twoQTLs are located in different marker intervals, twodifferent pairs of flanking markers should be usedsimultaneously to specify the likelihood of the data.Denote by H1 the matrix for the conditional prob-ability of QTL genotypes at P given the markerinterval M1–M2. Similarly, the matrix for the con-ditional probability of QTL genotypes at Q giventhe marker interval N1–N2 is denoted by H2. Theconditional probability of joint QTL genotypes atP and Q given the two marker intervals can be ex-pressed as

H1 �H2,

where � is the Kronecker product. If two linkedQTLs are located within the same marker interval, theconditional probability of joint QTL genotype giventhe marker genotypes is derived.

Under our linkage model, the additive and domi-nant effects of QTL P on x are denoted by a and d,whereas the additive and dominant effects of QTL Qand y are denoted as a+ba and a+bd, respectively.The likelihood of the data under the linked-QTLmodel can be expressed as

L(V)jx, y)=Yni=1

g2

j1=0g2

j2=0pj1j2ji fj1j2(xi, yi)� �

(6)

where the vector V contains the same unknowns as inthe pleiotropic QTL model, except with one moreparameters describing the position of the secondQTL, pj1 j2ji is the conditional probability of the jointQTL genotype j1 j2 given a specific individual ithat carries a known marker genotype and fj1 j2 (xi, yi)is the joint normal distribution of x and y for a jointQTL genotype. Similarly, the EM algorithm can beused to estimate the unknowns under the linkagemodel.

(iii) Precision analysis

After the point estimates of parameters have beenobtained by the EM algorithm, it is necessary to de-rive the variance–covariance matrix and evaluate thestandard errors of the estimates. Because the EM al-gorithm does not automatically provide the estimatesof the asymptotic variance–covariance matrix forparameters, an additional procedure has been devel-oped to estimate this matrix (and thereby standarderrors) when the EM algorithm is used (Louis, 1982;Meng & Rubin, 1991). Meng & Rubin (1991) pro-posed a so-called supplemented EM algorithm orSEM algorithm to estimate the asymptotic variance–

covariance matrices. However, in this study, Louis’(1982) approach is used to calculate the standarderrors for the MLEs of QTL parameters for allometrymapping.

(iv) Hypothesis tests

Several hypotheses about the QTL affecting quarter-power scalings of organisms can be formulated for ourmodel. These hypotheses include: (i) there is a QTL ina linkage group that affects two traits, (ii) this signifi-cant QTL is pleiotropic with effect on both traits, or itactually presents two linked QTLs (one affecting eachtrait), and (iii) under a best-fitting model, the QTLdetected affects the two traits in a quarter-powerscaling. In addition, we can ask whether the normal-ization constant a is a characteristic of species orpopulations (Niklas, 1994). This can also be testedto find out whether the mapping population usedconforms to a general scaling pattern. However, arecent survey by Niklas & Enquist (2001) suggestedthat all plants have similar allometric exponents andnormalization constants and, therefore, comply witha single allometric formula.

The evidence for QTL on traits x and y can be testedby hypothesizing a single QTL versus no QTL on alinkage group, i.e.

H0: a=d=0

H1: At least one of these equalities above

does not hold:

8><>: (7)

The H0 states that no QTL affects trait x (the reducedmodel), whereas the H1 proposes that there is such aQTL (the full model). The test statistic for testing thehypotheses (7) is calculated as the log-likelihood ratioof the reduced to the full model :

LR=�2[lnL(eVVjx, y)xlnL( bVVjx, y)], (8)

where eVV and bVV denote the MLEs of the unknownparameters under H0 and H1, respectively. Underthe null hypothesis, the LR is asymptotically x2-distributed with two degrees of freedom. An empiricalapproach for determining the critical threshold isbased on permutation tests. By repeatedly shufflingthe relationships between marker genotypes andphenotypes, a series of maximum log-likelihood ratiosare calculated, from the distribution of which thecritical threshold is determined.

To test whether this detected QTL affects allometricscaling laws, we need to perform two additional testsfor the significance of the exponential power b. Thefirst is to test b=0 versus bl0, which is associatedwith the pleiotropic effect of this QTL on trait y.The second is to test b=k versus blk, where k is amultiplier of a quarter such as 1/4 or 3/4. Only after

F. Long et al. 210

both the null hypotheses above about x and y arerejected, is the underlying QTL suggested to bepleiotropic for allometric scaling laws.

If a marker interval is detected to carry two QTLseach affecting a different trait, it is important to testwhether the correlation between the two traits is dueto the pleiotropic effect of the same QTL or the link-age between the two QTLs. Let two QTLs have pos-itions symbolized by p(1) for the QTL associated withtrait x and p(2) for the QTL associated with trait y.Whether or not these two QTLs are actually the samecan be tested by formulating the hypothesesp(1)=p(2) versus p(1)lp(2). If the null hypothesis isaccepted, this means that the pleiotropic effect ofQTLs has a more important contribution to traitcorrelation than the linkage. But its rejection, i.e. theexistence of the two QTLs, may result from two possi-bilities : (1) each QTL affects a different trait and (2)each QTL affects two traits simultaneously. These twopossibilities can be further tested by formulating twoalternative hypotheses. If the first possibility is true,the linkage is more important than the pleiotropy inaffecting the trait correlation. If the second possibilityis true, both the pleiotropy and linkage are important.

(v) An improved mapping model

In the previous sections, we described a conventionalapproach for allometry mapping. In this section, animproved model is proposed to increase the compu-tational efficiency of allometry mapping. This im-proved model is based on a two-step estimationprocess. In step 1, the power parameters that governallometric scaling laws are estimated by a regressionmodel. In step 2, by substituting the estimated powerparameters in the mapping model constructed by amixture model, the effects and position of the under-lying QTL are estimated using the EM algorithm. Asshown in equations (1) and (2), the log-transform-ation makes two allometrically related traits linearlyrelated and, at this time, the power parameters can beestimated, using a least squares approach, as

�bb=ngn

i=1xiyix�gn

i=1xi

��gn

i=1yi

�ngn

i=1x2ix

�gn

i=1xi

�2 ,

�aa=1

ngn

i=1yixb g

n

i=1xi

� �:

By viewing �aa and �bb as the constants, we define anew variable

zi=yix�bbxix�aa, (9)

which is normally distributed as N(0, sz2). It can

be seen that xi and zi are independent. The jointdistribution of xi and zi for the three different QTL

genotypes can be written as

f2(xi, zi)=f2(xi)*f(zi)

=1

2psxsz

r exp x1

2

(xixmxa)2

s2x

+z2is2z

� �,

f1(xi, zi)=f1(xi)*f(zi)

=1

2psxsz

r exp x1

2

(xixmxd)2

s2x

+z2is2z

� �,

f0(xi, zi)=f0(xi)*f(zi)

=1

2psxsz

r exp x1

2

(xixm+a)2

s2x

+z2is2z

� �:

The likelihood of the unknown parameters given theobserved trait x and the newly defined variable z canbe written under the pleiotropic (equation (4)) orlinked QTL model (equation (6)). The maximizationof this likelihood with respect to the unknownparameters leads to the following log-likelihoodequations for the pleiotropic model, expressed as afunction of the posterior probabilities (equation (5)) :

m=1

2

gn

i=1xi

Q0ji

gn

i=1

Q0ji

+gn

i=1xi

Q2ji

gn

i=1

Q2ji

" #

a=gn

i=1xi

Q1ji

gn

i=1

Q1ji

x1

2

gn

i=1xi

Q0ji

gn

i=1

Q0ji

+gn

i=1xi

Q2ji

gn

i=1

Q2ji

" #

d=1

2

gn

i=1xi

Q0ji

gn

i=1

Q0ji

xgn

i=1xi

Q2ji

gn

i=1

Q2ji

" #

s2x=

1

ngn

i=1

Y2ji

(xixmxa)2+Y1ji

(xixmxd)2

"

+Y0ji

(xi+m+a)2

#

s2z=

1

ngn

i=1(yix�aax�bbxi)

2:

The EM algorithm is implemented to estimate theseparameters. Compared with the traditional model,this improved model estimates fewer parametersand, thus, is expected to provide more power to detectallometric QTLs.

3. Non-linear model for allometry mapping

Although the simple allometry equation (1) has beenused to model allometric scaling relationships, it islimited for the precise description of many importantbiological phenomena. For example, this equation,which forces two variables to pass through the origin,i.e. when x=0 then y=0, cannot describe the relation-ship between two developmentally asynchronous fea-tures, such as reproductive timing and body weight.

To accurately describe the scaling relationship be-tween any biological traits, we need to extend the

Genetic mapping of allometric scaling laws 211

simple allometry equation to accommodate generalbiological phenomena. We propose a two-step pro-cedure to estimate the position and effects for a QTLthat affects more general allometric scaling laws.

(i) Step 1: Modelling general allometry equations

A number of mathematical equations have beenproposed by earlier biologists to describe generalallometric scaling relationships. Among them, threerepresentative allometry equations are:

y=axb+c (Robb, 1929)

a(xxc)b (Reeve & Huxley, 1945)

a(xxc)b+d (Lumer, 1937),

8><>: (10)

where Lumer’s four-parameter equation can beviewed as the most general. Ebert & Russell (1994)

introduced a model II non-linear regression analysisto estimate the parameters contained in the allometryequations (10). Unlike model I regression that mini-mizes the squared distance between the coordinate ofa data pair on the y-axis and the function, model IIregression minimizes the area connecting the co-ordinates of the data pair and the function and thusdeals with variation in both traits. Model II regressionanalysis assumes an equal error variance for bothtraits and is a special case of more general error-in-variables models (Seber & Wild, 1989).

Consider an individual i from a mapping popu-lation of size N. The observations of this individualfor traits xi and yi present a point (A) with the co-ordinate (xi, yi) as shown in Fig. 2. To detect anallometry curve that has a minimum deviation to thiscoordinate, we define four more points B, C,D and E,whose coordinates are expressed as

B= xi,a(xixd)b+c)� �

C=(zi, yi)

D=(xi, 0)

E=(zi, 0),

where

zi=

yixc

a

� �1b

Robb equation

yia

� �1b+c Reeve–Huxley equation

yixc

a

� �1b+d Lumer equation:

8>>>>>>><>>>>>>>:The area of the rectangle ADEC is

Ai(ADEC)=yijzixxij:

The area under the curve, BDEC, is

0

A

B

C

D E

Fig. 2. The diagram for calculating the area under a curveusing the loss function approach. Adapted from Ebert &Russell (1994).

Ai(BDEC)

=

Z zi

xi

(axbi +d)dxi Robb equationZ zi

xi

a(xixc)bdxi Reeve–Huxley equationZ zi

xi

a(xixc)b+d� �

dxi Lumer equation

8>>>>>>>><>>>>>>>>:

=

a

b+1xd+1i x

yixd

a

� �b+1b

����������+d xix

yixd

a

� �1b

���������� Robb equation

a

b+1(xixc)b+1x

yia

� �b+1b

���� ���� Reeve–Huxley equation

a

b+1(xixc)b+1x

yixd

a

� �b+1b

xc

����������+d xix

yixd

a

� �b+1b

xc

���������� Lumer equation:

8>>>>>>>>>><>>>>>>>>>>:

(11)

F. Long et al. 212

We define the absolute value of the difference be-tween Ai (ADEC) and Ai (BDEC), i.e. area

Ai(ABC)=Ai(ADEC)xAi(BDEC),

as the loss function for individual i. The loss functionfor all individuals is expressed as

A(ABC)= gn

i=1Ai(ABC): (12)

The estimates of a, b, c and d fitting the allometryequation can be obtained by minimizing the lossfunction defined by areas (12). It is impossible to de-rive their analytical solutions for this non-linearfunction. However, their numerical solutions can beobtained by using the simplex algorithm (Nelder &Mead, 1965). The advantage of the simplex algorithmis that it is derivative-free and easy to implement withcurrent software, such as Matlab. A reason for cau-tion with this algorithm is the possibility of obtaininglocal optimal solutions for the loss function (12). Bycarefully selecting the initial values, however, thisproblem can be the minimum if there exist globaloptimal solutions. If no global optimal solutions exist,we can take the minimum in the space of these par-ameters.

For a practical data set, it is essential to determinethe best allometry equation that can describe theallometric scaling relationship. The criterion for thedetermination can be based on the values of lossfunction summed over all individuals. For the samevalue of the loss function, an allometry equation withfewer parameters is better than those with moreparameters.

(ii) Step 2: The mapping process

As pointed out above, the parameters, a, b, c and d,for more general allometry equations are regarded asuniversal parameters and they should be the sameamong different QTL genotypes. After these par-ameters have been estimated frommodel II non-linearregression by minimizing the loss function (12), wesubstitute these estimates, indexed by �aa, �bb, �cc, and �dd, tothe likelihood functions described by equation (4) or(6). Before doing so, we take a simple change of moreallometry equations for individual i :

ln(yix�cc)=�aa+�bb ln(xi) Robb equation

ln(yi)=�aa+�bb ln(xix�cc) Reeve–Huxley equation

ln(yix�dd)=�aa+�bb ln(xix�cc) Lumer equation

8><>:or

ln(yki)=�aa+�bb ln(xi) Robb equation

ln(yi)=�aa+�bb ln(xik) Reeve–Huxley equation

ln(yki)=�aa+�bb ln(xik) Lumer equation

8><>: (13)

where ‘ k ’ denotes the transformation of the two traits.Because �aa, �bb, �cc, and �dd can be regarded as knownconstants obtained from model II regression analysis,xik and yik can be calculated and display the samestatistical distribution as raw data xi and yi. Thus, thenew relationships described by equation (13) areidentical to a linear log-transformed allometry equa-tion (2). By using the improved model as describedabove, we estimate the remaining parameters VR=(m, a, d, sx

2 , sz2, r1)

T. The existence of the underlyingQTL for more general allometry equations can betested by formulating the hypotheses H0 : a=d=0versus H1 : at least one equality does not hold.

4. Results

The model proposed here is used to map age-depen-dent QTLs in a model system: the mouse. Cheverudet al. (1996) constructed a linkage map composed of19 chromosomes based on 75 microsatellite markersin 535 F2 progeny population derived from twostrains, Large and Small. The F2 hybrids wereweighted at 10 weekly intervals starting at age 7 days.The raw weights were corrected for the effects of eachcovariate due to dam, litter size at birth, parity andsex. The growth rate at each time interval [t, t+1] wascalculated for each mouse my subtracting bodyweight at time t from body weight from time t+1.The mean growth rate across all the time intervalswas then calculated for each mouse. Since we did notobserve a marked trend that the variance increaseswith the mean, the model II non-linear regression isused to estimate the parameters for the equation thatspecifies the allometric relationship between growthrate and body weight (Niklas, 1994).

The four allometry equations (1) and (10) were usedto fit the relationship between the growth rate andbody weight. By comparing the values of loss functionamong these equations, Robb’s equation was foundto be the most parsimonious. Based on Robb’s equa-tion, we estimated the three parameters underlyingthe allometry equation ~aa=0�024, ~bb=0�68 and~cc=x0�016. As shown by their sampling errors, theseparameters estimates have reasonable precision(Table 1). It is interesting to find that the estimated bvalue is not significantly different from 0.75, whichsupports the three-quarter law for a biological processto scale with body weight (West et al., 1997).

These estimated parameters from model II re-gression by minimizing the loss function (12) weresubstituted in the mapping model build by a log-transformed linear regression model as shown byequation (13). Through such a substitution, we onlyneed to estimate the remaining parameters includingthe QTL position, QTL effects for body weight, andthe residual (co)variance matrix between growth rateand body weight. We scan all the 19 chromosomes for

Genetic mapping of allometric scaling laws 213

the existence of QTLs affecting the allometric scalingrelationship between growth rate and body weight.Fig. 3 gives the profile of the log-likelihood ratio (LR)

test statistics for claiming the existence of QTLsacross the entire mouse genome. There are two peaksfor the LR profile: one (16.47) between markersD6Nds5 and D6Mit15 on chromosome 6 and theother (22.46) between D7Nds1 and D7Mit17 onchromosome 7. These two LR values are well beyondthe genome-wide critical threshold (13.1) at the 0.001significance level determined on the basis of permu-tation tests. These tests suggest the existence of twoQTLs that are located at the positions correspondingto the peaks of the profile.

The genetic effects of these two QTLs on bodyweight and other model parameters were estimated inTable 1. In general, the estimates of these parametersin this mouse example from our model exhibit goodprecision. The QTL detected on chromosome 6 affectsbody weight in a partially dominant manner, whereasthe QTL on chromosome 7 displays a strong domi-nant or overdominant effect on body weight. Ourdetection is broadly consistent with simple intervalmapping analysis of the same material by Cheverudet al. (1996). Of 16 chromosomes observed to carry theQTLs for body weight, the QTLs on chromosomes 6

0

5

10

15

20L

R1 23 4 5 6

0

5

10

15

20

LR

7 8 9 10 11

0

5

10

15

20

LR

12 13 14 15 16 17 18 19

20 cM

0·1% cutoff

0·1% cutoff

0·1% cutoff

2

Fig. 3. The profiles of the log-likelihood ratios (LRs) between the full and reduced (no QTL) model for allometric scalingof mean growth rate to body weight across the entire genome using the linkage map constructed from microsatellitemarkers (Cheverud et al., 1996). The genomic positions corresponding to the peaks of the curves are the maximumlikelihood estimators of the QTL positions. The genome-wide threshold values for claiming the existence of a QTL areshown as the horizontal lines. Tick marks on the x-axis represent the positions of markers on the linkage group, the namesof which are given by Cheverud et al. (1996).

Table 1. Estimates of genetic effects at two detectedQTLs on chromosome 6 and 7 and the residual (co)variance matrix between two allometrically relatedtraits x and y in the F2 mouse progeny

Parameter Chromosome 6 Chromosome 7

m 3.91 3.90a 0.35 0.25d 0.19 0.26s2x 0.006 0.006

s2y 0.007 0.007

R 0.62 0.62

a 0.024b 0.68c x0.02

The maximum likelihood estimates (MLE) of parametersare symbolized by hats, and the estimates by model II re-gression are symbolized by breves.

F. Long et al. 214

and 7 were detected with larger LOD scores, explain-ing larger percentages of the observed variation, thanthose on the other chromosomes.

5. Discussion

Even though understanding the regulation ofallometry would have broad implications for fur-thering our knowledge of developmental ontogeny,regeneration, population growth and evolutionaryprocesses (West et al., 1997, 1999a, b), it is unfortunatethat no general genetic model exists to mechanisticallyexplain scaling laws. R. Wu and co-workers have, forthe first time, incorporated allometric rules in a QTLmapping framework (Wu et al., 2002; Ma et al.,2003). Their models were validated by a real examplefrom forest trees in which a QTL was detected togovern the allometric relationship of third-year stemheight with third-year stem biomass. The result sug-gested that the QTL detected from the model is onespecifically responsible for the allometric scaling. Inthis article, we proposed an improved model formapping specific QTLs that are responsible forallometric scaling laws based on more generalallometry equations.

Compared with previous models (Wu et al., 2002;Ma et al., 2003), our model is advantageous in severalrespects. First, it significantly reduces the number ofparameters to be estimated, thus increasing compu-tational efficiency. The derivation of our model isbased on the fundamental principle for allometricscaling with which a particular biological variable (B)scales as one-quarter or three-quarters of body weight(M) across an incredible range of species (Fig. 1A).Reduced from interspecific to intraspecific allometricscaling, these quarter-power laws can be seen acrossdifferent genotypes (Fig. 1B). If there is a particularQTL that affects allometric scaling laws, the means ofdifferent QTL genotypes should have significantly dif-ferent coordinates for traits B and M but they shouldbe located on a common straight line. As a result ofthis, we can first estimate the allometry parametersbased on all genotypes using a conventional statisticalmethod, such as least squares analysis, and then sub-stitute these allometry parameters in a maximum-likelihood-based QTL mapping framework built by amixture model. With the estimates of allometryparameters, we further make a simple transformationto remove the covariance between the response vari-able and body weight. A reduced number of theparameters about the QTL effects, QTL position andresidual variances between the two traits are esti-mated by implementing the EM algorithm.

Second, our model take advantage of the log-linearproperty of the power equation as used in Wu et al.(2002). For more general allometry equations (10),this property is not applicable, in which Taylor series

of different orders were used to approximate the genotypic means of the response variable based on bodyweight (Ma et al., 2003). Whereas a lower-order ap-proximation may lead to system errors, a higher-orderapproximation demands substantial computationalload. The model proposed in this article does notrely on Taylor’s approximation. Third, our modeldivides the whole estimation procedure into two stepsand, therefore, can be readily extended to any com-plicated allometry equations involving multivariatevariables.

Our model has been tested by an example of amouse F2 progeny. Using this model, we detected twoQTLs that govern allometric scaling laws betweengrowth rate and body weight. These two QTLs de-tected on chromosomes 6 and 7were in agreement withthe results from interval mapping of single growthtraits (Cheverud et al., 1996). It has been shown thatthese two chromosomes are more likely to carry moresignificant QTLs than other chromosomes.

There are several areas in which our model can bemodified. First, for model II non-linear regressionused to estimate the allometry parameters (Ebert &Russell, 1994), we assume that error variances for traitsx and y are equal. Although this may be reasonable incertain situations, incorporation of specified errorvariances of x and y is necessary for general allometryissues. Second, to better characterize the geneticarchitecture of allometric scaling laws, we shouldinclude modelling and analysis of epistatically inter-acting QTLs. Growing evidence has been observedfor the role of epistasis in organ development(Cheverud, 2000; Wolf et al., 2000). Third, for sim-plicity, the example used to test our model dealswith the allometric scaling between mean growth rateand body weight. However, growth rate is an age-dependent trait. The integration of age-specificallometric relationships in the QTL mapping frame-work will provide great insights into the genetic mech-anisms for the developmental control of allometricscaling laws.

We thank three anonymous referees for their constructivecomments on this manuscript. This work is supported byNIH grant DK055736 to J.M.C. and a grant from theNational Science Foundation of China (09 95671) and NSFgrant (0540745) to R.W.

References

Andresen, B., Shiner, J. S. & Uehlinger, D. E. (2002).Allometric scaling and maximum efficiency in physio-logical eigen time. Proceedings of the National Academy ofSciences of the USA 99, 5822–5824.

Banavar, J. R., Maritan, A. & Rinaldo, A. (1999). Size andform in efficient transportation networks. Nature 399,130–132.

Bonner, J. T. & Horn, H. S. (2000). Allometry and naturalselection. In Scaling in Biology (ed. J. H. Brown & G. B.

Genetic mapping of allometric scaling laws 215

West), pp. 25–35. Oxford: Santa Fe Institute and OxfordUniversity Press.

Brown, J. H. &West, G. B. (eds.) (2000). Scaling in Biology.Oxford: Oxford University Press.

Calder III, W. A. (1984). Size, Function and Life History.Cambridge, MA: Harvard University Press.

Cheverud, J. M. (2000). Detecting epistasis among quanti-tative trait loci. In Epistasis and the Evolutionary Process(ed. J. B.Wolf, E. D. Brodie III &M. J.Wade), pp. 58–71.New York: Oxford University Press.

Cheverud, J. M., Routman, E. J., Duarte, F. A. M., vanSwinderen, B., Cothran, K. & Perel, C. (1996).Quantitative trait loci for murine growth. Genetics 142,1305–1319.

Churchill, G. A. & Doerge, R. W. (1994). Empiricalthreshold values for quantitative trait mapping. Genetics138, 963–971.

Dewey, T. G. & Donne, M. D. (1998). Non-equilibriumthermodynamics of molecular evolution. Journal ofTheoretical Biology 193, 593–599.

Ebert, T. A. & Russell, M. P. (1994). Allometry and ModelII nonlinear regression. Journal of Theoretical Biology168, 367–372.

Enquist, B. J., Brown, J. H. & West, G. B. (1998).Allometric scaling of plant energetics and populationdensity. Nature 395, 163–165.

Enquist, B. J., West, G. B., Charnov, E. L. & Brown,J. H. (1999). Allometric scaling of production andlife-history variation in vascular plants. Nature 401,907–911.

Louis, T. A. (1982). Finding the observed informationmatrix when using the EM algorithm. Journal of theRoyal Statistical Society Series B 44, 226–233.

Lumer, H. (1937). The consequences of sigmoid growth forrelative growth functions. Growth 1, 140–154.

Ma, C. X., Casella, G., Littell, R. C., Khuri, A. I. &Wu, R. L. (2003). Exponential mapping of quantitativetrait loci governing allometric relationships in organisms.Journal of Mathematical Biology 47, 313–324.

McMahon, T. (1973). Size and shape in biology. Science179, 1201–1204.

Meng, X. & Rubin, D. (1992). Using EM to obtain asymp-totic variance-covariance matrices : The SEM algorithm.

Journal of the American Statistical Association 86,899–909.

Nelder, J. A. & Mead, R. (1965). A simplex methodfor function minimization. Computer Journal 7,308–313.

Niklas, K. L. (1994). Plant Allometry: The Scaling of Formand Process. Chicago, IL: University of Chicago.

Niklas, K. J. & Enquist, B. J. (2001). Invariant scaling re-lationships for interspecific plant biomass productionrates and body size. Proceedings of the National Academyof Sciences of the USA 98, 2922–2927.

Reeve, E. C. R. & Huxley, J. S. (1945). Some problems inthe study of allometric growth. In Essays on ‘Growth andForm ’ presented to D’Arcy Wentworth Thompson (ed.W. E. Le Gros Clark & P. B. Medawar), pp. 121–156.Oxford: Clarendon Press.

Robb, R. C. (1929). On the nature of heredity size-limitation. II. The growth of parts in relation to thewhole.British Journal of Experimental Biology 6, 311–324.

Schmidt-Nielsen, K. (1984). Why Is Animal Size SoImportant? Cambridge: Cambridge University Press.

Seber, G. A. F. & Wild, C. J. (1989). Nonlinear Regression.New York: Wiley.

West, G. B., Brown, J. H. & Enquist, B. J. (1997). A generalmodel for the origin of allometric scaling laws in biology.Science 276, 122–126.

West, G. B., Brown, J. H. & Enquist, B. J. (1999a). Thefourth dimension of life : fractal geometry and allometricscaling of organisms. Science 284, 1677–1679.

West, G. B., Brown, J. H. & Enquist, B. J. (1999b). A gen-eral model for the structure and allometry of plant vas-cular systems. Nature 400, 664–667.

Wolf, J. B., Brodie III, E. D. & Wade, M. J. (2000).Epistasis and the Evolutionary Process. Oxford: OxfordUniversity Press.

Wu, R. L., Ma, C. X., Littel, R. C. & Casella, G. (2002). Astatistical model for the genetic origin of allometricscaling laws in biology. Journal of Theoretical Biology219, 121–135.

Wu, R. L., Ma, C. X., Lou, X. Y. & Casella, G. (2003).Molecular dissection of allometry, ontogeny, and plas-ticity: a genomic view of developmental biology.BioScience 53, 1041–1047.

F. Long et al. 216