Epistasis and environmental heterogeneity in the speciation process

Ecological Modelling 221 (2010) 2546–2554
Contents lists available at ScienceDirect
Journal homepage: www.elsevier.com/locate/ecolmodel
doi:10.1016/j.ecolmodel.2010.07.023
0304-3800/$ – see front matter © 2010 Elsevier B.V. All rights reserved.

Paulo R.A. Campos (a,*), Viviane M. de Oliveira (b), Alexandre Rosas (c)

(a) Departamento de Física, Universidade Federal Rural de Pernambuco, 52171-900, Dois Irmãos, Recife, PE, Brazil
(b) Departamento de Estatística e Informática, Universidade Federal Rural de Pernambuco, 52171-900, Dois Irmãos, Recife, PE, Brazil
(c) Departamento de Física, Universidade Federal da Paraíba, Caixa Postal 5008, 58059-900, João Pessoa, PB, Brazil
(*) Corresponding author. E-mail address: [email protected] (P.R.A. Campos).

Article history: Received 2 June 2010; received in revised form 16 July 2010; accepted 21 July 2010; available online 21 August 2010.

Keywords: Speciation; Structured populations; Adaptation; Species-area relationship; NK model; Spatial heterogeneity

Abstract

The study of the mechanisms driving the evolution of natural populations is a central issue in evolutionary biology. In this context, the question of genetic divergence and speciation in ecosystems is one of the most intriguing and complex challenges. The current manuscript investigates this problem by means of extensive computer simulations. Here we introduce a spatial model of evolution which assumes viability selection, mutations and random mating. The population evolves in a heterogeneous environment where the habitat diversity is an input of the model. As a model for the fitness landscape, a variation of the NK model is assumed. We present results for the time evolution of fitness and genetic divergence and address the species-area relationship. The dependence of the quantities and scalings of the problem on epistasis is investigated. Depending on the mutation rate, the species-area relationship displays either a two-phase or a three-phase scenario. In any case, for intermediate and large system sizes the relationship is well described by a power law, with exponents within the range of values observed in real ecological niches. All the quantities show a strong dependence on the epistasis parameter. According to our simulations, a lower level of genetic divergence is needed to ensure sexual isolation when interactions among genes are strong.

1. Introduction

The diversity of species on Earth is a striking characteristic that has attracted increasing interest and investigation in recent decades (see for example Rosenzweig, 1995; Drakare et al., 2006 and references therein). Nonetheless, the processes of generation and maintenance of species and biodiversity patterns are still a matter of debate (Chesson, 2000; Chave et al., 2002). In particular, the question of why the number of species varies from place to place and over time arises recurrently and is an important issue in this field. The growing interest in understanding the role of biodiversity in ecosystems aims at predicting the consequences of the loss of species due to human and environmental disturbances (Begon et al., 2006).

Interactions among species can be important in determining the number of species within a community (Molofsky et al., 1999). Furthermore, the spatial structure of the habitat and the spatial distribution of species richness play an important role in ecosystems. Populations are actually expected to vary spatially to some extent (Nielsen et al., 2009; Holland et al., 1991; Parsek and Singh, 2003; Bell and Reboud, 1997). More importantly, spatial heterogeneity has often been suggested as an explanation for the high levels of genetic variation found in natural populations (Kassen, 2002), which has a direct consequence on the speed at which populations evolve and adapt to new environmental conditions.

A fundamental issue in ecology is how many species occur within a given area. The spatial structure of species distributions has long been described in terms of the species-area relationship (Arrhenius, 1921; Connor and McCoy, 1979; Lomolino, 2000; Plotkin et al., 2000; Lennon et al., 2002). The species-area relationship has been the focus of several empirical and theoretical studies (Plotkin et al., 2000; Lennon et al., 2002; de Aguiar et al., 2009; Pigolotti and Cencini, 2009), which have produced a broad spectrum of possible outcomes and considerable disagreement about the processes involved in the generation of species diversity (Lennon et al., 2002). It is nevertheless well established that the greater the area, the greater the biodiversity (Williams, 1964; Connor and McCoy, 1979), especially because larger areas contain a larger variety of habitats (Kohn and Walsh, 1994). A substantial fraction of the investigations claims a power-law dependence of the species diversity (S) on area (A), i.e. S ∼ A^z, as first suggested by Arrhenius (1921). Some empirical studies reinforce the power-law form of the species-area relationship and additionally suggest that the exponent z is smaller than 0.72 (Begon et al., 2006; Rosenzweig, 1995). However, several other investigations show that for larger areas the species-area relationship displays a downward curvature and so the power-law




overestimates the actual diversity of species at this scale. To take this effect into account, Plotkin et al. (2000) proposed the relation

S = c A^{z} e^{-kA}, \qquad (1)

where c, k and z are constants. Alternatively, other studies suggest that different scalings exist for different area intervals (Pigolotti and Cencini, 2009; Durrett and Levin, 1996; Rosindell and Cornell, 2007; de Aguiar et al., 2009).
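As a short worked step added for illustration (it is not part of the original derivation), taking the logarithm of Eq. (1) makes the downward curvature explicit, since the local slope on a log-log plot is no longer constant:

```latex
\ln S = \ln c + z \ln A - kA,
\qquad
\frac{d \ln S}{d \ln A} = z - kA .
```

For k > 0 the local slope therefore falls below z as A grows, while the pure power law S ∼ A^z is recovered in the range kA ≪ z.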

As mentioned above, several models have been proposed to address the problem of speciation and the species-area relationship in a theoretical framework (Pigolotti and Cencini, 2009; de Aguiar et al., 2009; Durrett and Levin, 1996; Rosindell and Cornell, 2007; O'Dwyer and Green, 2010). Most of the recent developments in the field make the restrictive assumption of neutral selection. Despite this assumption, these investigations succeed in reproducing some of the empirical outcomes observed in real ecological niches, the power-law behavior of the species-area relationship being one of the most remarkable. In the current work, we propose a model that explicitly assumes viability selection, mutations and local mating. Additionally, the model assumes that selection acts differently on the genotypes depending on the location of individuals, i.e. the model assumes environmental heterogeneity. To ascribe fitness values to genotypes, we model the fitness landscape with the well-known NK model, proposed by Kauffman and Levin (Kauffman and Levin, 1987; Kauffman, 1995, 2000). In the NK model, the contribution of each gene to the fitness depends on its state and also on the state of K other genes of the genotype, that is, it considers epistatic interactions among genes. The model is able to create a family of tunable rugged fitness landscapes by varying the value of K. The NK model has been used to study local properties of fitness landscapes (Weinberger, 1991), the molecular clock (Ohta, 1997a,b) and evolutionary dynamics (Campos et al., 2003; Sibani and Pedersen, 1999). NK and NK-like models have also been applied to ecological modelling, for instance to coevolutionary dynamics in individual-based models (Christensen et al., 2002; Hall et al., 2002) and to the evolution of ecological networks where interactions occur at the level of species (Fath and Grant, 2007; Swanack et al., 2008).

The model we introduce employs a slightly modified version of the original model by Kauffman and Levin, in which each locus contributes multiplicatively to the fitness of the organism (Welch and Waxman, 2005). By changing the level of epistasis in the organisms it is possible to survey how epistasis affects the pattern of diversity in a given population. This is important since epistasis together with sexual reproduction has been indicated as a possible mechanism for generating genetic divergence (Presgraves, 2007; Turelli et al., 2001).

In the next section, we describe our computational model and define the main quantities used in this approach. We then present and discuss our simulation results, focusing on the time evolution of fitness and of genetic divergence among genotypes, and investigate the patterns of diversity with system size and their dependence on the parameters of the model. Finally, in the last section we present our conclusions.

2. Material and methods

Our work is based on extensive computer simulations. For that purpose we have implemented two independent codes written in the C and C++ programming languages. As expected, the results of both codes agree perfectly. In the remainder of this section, we present in detail the description of the model and the measurement procedures.


2.1. The model

2.1.1. Spatial arrangement

The computer simulation model assumes a finite population of M haploid individuals arranged on a two-dimensional square lattice of linear size L, such that M = L × L (there are no vacant sites), with periodic boundary conditions. Each site in the network is occupied by a single individual, whose genome is represented by a binary sequence of finite length N. In this way, an individual's genome can be denoted by S = (s_1, s_2, ..., s_N), where each locus s_i takes one of two possible values (i.e. alleles), s_i = 0, 1. The simulation starts with all M individuals having the same genome, whose configuration is random. One generation corresponds to M independent reproduction events (described in Section 2.1.3).
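The spatial arrangement described above can be sketched as follows. This is a minimal Python illustration, not the C/C++ code used for the simulations; the function names and the nested-list representation of the lattice are assumptions made here for readability.

```python
import random

def init_population(L, N, rng):
    """Return an L x L grid in which every site holds a copy of one random founder genome."""
    founder = [rng.randint(0, 1) for _ in range(N)]
    return [[founder[:] for _ in range(L)] for _ in range(L)]

def von_neumann_neighbors(x, y, L):
    """Four nearest neighbors of site (x, y) on an L x L lattice with periodic boundaries."""
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]

rng = random.Random(0)          # fixed seed only to make the sketch reproducible
pop = init_population(L=4, N=8, rng=rng)
print(von_neumann_neighbors(0, 0, 4))
```

Because of the periodic boundaries, a corner site such as (0, 0) still has four neighbors, two of which lie on the opposite edges of the lattice.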

2.1.2. Fitness landscape

The capability of adaptation of each organism is determined by its genotype configuration. In order to relate fitness and genotype we use a modified version of the NK model recently introduced by Welch and Waxman (2005). The original NK model (Kauffman and Levin, 1987; Kauffman, 1993, 1995, 2000) considers epistatic interactions among genes by assuming that the contribution of each locus to the total fitness of the organism depends on the state of K other loci. Consequently, the level of epistasis can be adjusted by tuning the value of the epistatic parameter K. In their paper, Kauffman and Levin proposed that each individual is characterized by a finite binary genome with additive contributions of the genes to the overall fitness. In the simplest version of the NK model, the contribution of each gene is drawn from a uniform distribution. According to the original model, the fitness of a given individual is calculated as

W = \frac{1}{N} \sum_{i=1}^{N} h_i, \qquad (2)

where h_i is the uniformly distributed contribution of locus i to W. This contribution depends on the state of locus i as well as on K other loci, which are randomly chosen among the N loci. In this case, the fitness landscape is bounded, with maximum value equal to one.

Conversely, the modified model assumes multiplicative (instead of additive) contributions to the overall fitness. In this case, the genotype's fitness W is given by

W = e^{h_1} \times e^{h_2} \times \cdots \times e^{h_N}, \qquad (3)

where h_i is the contribution of locus i to W. As before, the contribution depends on the state of locus i as well as on K other loci. Further, the values of h_i are drawn from a Gaussian distribution of mean 0 and variance σ^2 (as opposed to the uniform distribution), i.e.,

f(h_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{h_i^2}{2\sigma^2}\right). \qquad (4)

The Gaussian distribution allows the fitness landscape to include rare, very high and very low fitness values, which are not available for uniformly distributed gene contributions. However, it introduces negative values for the contributions of the genes to the overall fitness, which are technically undesirable (Wolf et al., 2000). For this reason, the multiplicative landscape is used: each factor e^{h_i} is positive, so W remains positive even when some h_i are negative.

Since the contribution of each locus to the individual's fitness depends on the state of K other loci, we build up a lookup table with 2^{K+1} entries for each locus. The value of each entry in this table is an independent random number drawn from distribution (4). In Fig. 1 we show an instance of a lookup table for a genome of size N = 6 and epistatic parameter K = 1.
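A minimal sketch of how such lookup tables could be built and used to evaluate Eq. (3) is given below. It is an illustration under stated assumptions rather than the implementation used in this work: the function names are hypothetical, the K partner loci are assumed to exclude the focal locus, and the way the K + 1 bits are packed into a table index is an arbitrary choice.

```python
import math
import random

def build_landscape(N, K, sigma2, rng):
    """For each locus, pick K other loci at random and draw 2**(K+1) Gaussian entries."""
    sigma = math.sqrt(sigma2)
    partners, tables = [], []
    for i in range(N):
        others = [j for j in range(N) if j != i]   # assumption: partners differ from locus i
        partners.append(rng.sample(others, K))
        tables.append([rng.gauss(0.0, sigma) for _ in range(2 ** (K + 1))])
    return partners, tables

def fitness(genome, partners, tables):
    """Multiplicative fitness of Eq. (3), computed as exp(sum of h_i)."""
    log_w = 0.0
    for i, table in enumerate(tables):
        index = genome[i]                          # state of the focal locus
        for j in partners[i]:                      # append the states of its K partners
            index = (index << 1) | genome[j]
        log_w += table[index]
    return math.exp(log_w)

rng = random.Random(1)
partners, tables = build_landscape(N=6, K=1, sigma2=0.0005, rng=rng)
w = fitness([0, 1, 1, 0, 1, 0], partners, tables)
```

Summing the h_i and exponentiating once is equivalent to the product in Eq. (3) and avoids N separate exponentials.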

Fig. 1. Snapshot of the lookup table for the NK model. In this example, the genome size is N = 6 and the epistatic parameter is K = 1. The figure displays the lookup tables for loci 1 and 5. The contribution h_1 of locus 1 to the genome's fitness depends on the state of locus 3, while locus 5 has its contribution dependent on locus 2.

Fig. 2. Illustration of the reproduction process. An individual (red square) and one of its neighbors are randomly chosen to mate. Their genomes are recombined and give rise to an offspring. After mutation (not shown in this figure), the offspring replaces one of the occupants of the von Neumann neighborhood of the chosen individual.

2.1.3. Reproductive scheme

Reproduction, in turn, is considered to be sexual, and individuals are also subjected to mutations and viability selection. In a first step, a randomly chosen individual on the lattice (denoted by I_i and represented by the red square in Fig. 2) mates with one of the individuals in its neighborhood (denoted by I_j, and also randomly chosen), which comprises the four nearest neighbors, as illustrated in Fig. 2 by the four light blue squares (sites). This pair of individuals gives rise to an offspring which results from the recombination of individuals I_i and I_j. For simplicity, one assumes that the crossover point always occurs in the middle of the genome. At this point, the offspring's genome is composed of N/2 genes from I_i's genome, while the remaining loci are inherited from individual I_j, as also illustrated in Fig. 2. After reproduction, the offspring acquires a single mutation with probability U at a randomly chosen locus, that is, the state of that locus is flipped (0 → 1 or 1 → 0).
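The recombination and mutation steps of the reproductive scheme can be sketched as below. This is only an illustration: the function names are hypothetical, and the choice that the first half of the genome comes from I_i (rather than the second half) is an assumption.

```python
import random

def recombine(parent_i, parent_j):
    """Single crossover in the middle: first half from parent_i, the rest from parent_j."""
    cut = len(parent_i) // 2
    return parent_i[:cut] + parent_j[cut:]

def mutate(genome, U, rng):
    """With probability U, flip the state of one randomly chosen locus; return a new genome."""
    child = genome[:]
    if rng.random() < U:
        locus = rng.randrange(len(child))
        child[locus] ^= 1
    return child

rng = random.Random(2)
offspring = mutate(recombine([0] * 6, [1] * 6), U=0.001, rng=rng)
```

Both functions return new lists, so the parents' genomes stored on the lattice are left untouched.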

The final step corresponds to the replacement of an individual in the von Neumann neighborhood of individual I_i (the four nearest neighbors plus I_i itself) by the resulting offspring. Natural selection enters at this point: poorly adapted individuals are more prone to removal than well adapted ones. This is implemented by assuming that the removal likelihood is proportional to 1/W_k, where W_k denotes the fitness of individual k. In this way, selection is local, i.e., the model assumes soft selection (Goodnight et al., 1992). For instance, suppose that the fitness values of individual I_i and its neighbors are W_1, W_2, W_3, W_4 and W_5, respectively. The probability that individual I_k is the one to be replaced is

p_k = \frac{1/W_k}{\sum_{j=1}^{5} 1/W_j}. \qquad (5)

In Fig. 2, for example, the left neighbor (k = 4) of individual I_i was the one chosen to be replaced.
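Eq. (5) translates directly into a weighted random choice. The sketch below uses hypothetical function names and made-up fitness values purely to illustrate the arithmetic.

```python
import random

def replacement_probabilities(fitnesses):
    """Probabilities of Eq. (5): p_k proportional to 1 / W_k."""
    inverse = [1.0 / w for w in fitnesses]
    total = sum(inverse)
    return [x / total for x in inverse]

def choose_replaced(fitnesses, rng):
    """Draw the index k of the individual to be replaced by the offspring."""
    weights = replacement_probabilities(fitnesses)
    return rng.choices(range(len(fitnesses)), weights=weights, k=1)[0]

# Illustrative fitness values for I_i and its four neighbors (not simulation data).
probs = replacement_probabilities([1.0, 2.0, 2.0, 4.0, 4.0])
k = choose_replaced([1.0, 2.0, 2.0, 4.0, 4.0], random.Random(4))
```

With these values the inverse fitnesses sum to 2.5, so the least fit individual (W = 1) is removed with probability 0.4 and each of the fittest ones (W = 4) with probability 0.1.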

2.1.4. Spatial heterogeneity

Environmental heterogeneity is taken into consideration by dividing the lattice into distinct regions. The relevant parameter here is ℓ, which corresponds to the linear size of each region, so that each region has area A_ℓ = ℓ × ℓ. The number of distinct regions, i.e. the habitat diversity, is then N_r = M/A_ℓ. For each of these habitats we build an independent set of lookup tables to represent the adaptability to the different environmental characteristics of the habitats. Therefore, the individuals are subjected to different selective pressures in each habitat: the same individual may be very well adapted to one habitat but poorly adapted to the others. In this way, each locus owns N_r lookup tables, each one referring to a specific region of the network (habitat). Fig. 3 shows a square lattice of linear size L = 10, divided into four distinct habitats of linear size ℓ = 5. Each region (illustrated in the figure with a different shade) has its own set of lookup tables.
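The mapping from a lattice site to its habitat, and hence to the set of lookup tables that applies there, can be sketched as follows. The function names, the row-major numbering of regions and the requirement that the region size divides L are assumptions of this illustration.

```python
def habitat_count(L, ell):
    """Number of regions N_r = M / A_ell = (L / ell)**2, assuming ell divides L."""
    if L % ell != 0:
        raise ValueError("region size ell must divide the lattice size L")
    return (L // ell) ** 2

def habitat_index(x, y, L, ell):
    """Index (row-major) of the ell x ell region that contains site (x, y)."""
    if L % ell != 0:
        raise ValueError("region size ell must divide the lattice size L")
    regions_per_side = L // ell
    return (y // ell) * regions_per_side + (x // ell)

n_regions = habitat_count(10, 5)        # the example of Fig. 3
corner = habitat_index(9, 9, 10, 5)
```

In a simulation, the index returned for the site of an individual would select which of the N_r independent landscapes is used to evaluate its fitness.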

At this point, we would like to summarize the roles of genotype, environment and phenotype in the model. The genotype is characterized by the sequence S = (s_1, s_2, ..., s_N) composed of N zeros or ones, while the environment is included in the model via the distinct fitness lookup tables. Finally, selection takes place when the newborn replaces, most likely, one of the less fit individuals in the neighborhood of its parents.

Fig. 3. Illustration of the habitat diversity. In this example, a square lattice of 10 × 10 sites is divided into four habitats (represented by the different shades) of 5 × 5 sites. Each habitat distinguishes itself from the others by its environmental characteristics, which are represented in the model by different sets of lookup tables.

2.2. Measurement procedure for the intra-region and inter-regions average pairwise difference

With the advent of modern genetic sequencing, new tools for sequence alignment have been developed and extensively used (Altschul et al., 1997). One of the measurements available nowadays is the pairwise comparison between genetic sequences. For instance, in a recent paper, the count of the pairwise divergence was used to determine the genes possibly involved in the speciation of fruit flies (Sousa-Neves and Rosas, 2010).

Fig. 4. Illustration of pairwise difference measurements. Pairs of genomes are put one above the other and the number of mismatches (loci in red) is counted. This is done for all pairs of genomes within each habitat (intra-region) and between all pairs of genomes of different habitats (inter-region). For the pairs illustrated in this figure, the intra-region pairwise difference is 2 and the inter-region one is 7.

Here we use the average pairwise difference to quantify the genetic divergence among the pool of genotypes in the population. In our analysis, we distinguish two situations: the intra-region measurement, in which the average is taken over the diversity of genomes within a given habitat; and the inter-region measurement, in which the average is taken over genome diversity between pairs of habitats. Fig. 4 illustrates how these measurements are performed. First we identify the different genotypes in each habitat; illustratively, in Fig. 4, these genotypes have been labelled Genome 1, Genome 2 and Genome 3 in the first region, and Genome α and Genome β in the second region. Next, we do a pairwise comparison among the genomes in a given region, counting how many alleles differ between them, and average over all pairs in that habitat. This is what we call the intra-region average pairwise difference. Finally, we do a pairwise comparison among genotypes of different regions in order to obtain the inter-region average pairwise difference.
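The two measurements can be sketched in a few lines. The function names are hypothetical, and the sketch assumes that each distinct genotype in a habitat is counted once, regardless of how many individuals carry it; the text above is compatible with this reading but does not state it explicitly.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of loci at which two genomes of equal length differ."""
    return sum(x != y for x, y in zip(a, b))

def intra_region(genomes):
    """Average pairwise difference over all pairs of distinct genotypes in one habitat."""
    distinct = sorted(set(map(tuple, genomes)))
    pairs = list(combinations(distinct, 2))
    if not pairs:
        return 0.0                       # a single genotype has no pairs to compare
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

def inter_region(genomes_a, genomes_b):
    """Average pairwise difference between the distinct genotypes of two habitats."""
    pairs = list(product(set(map(tuple, genomes_a)), set(map(tuple, genomes_b))))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

# Toy genomes, not simulation data.
region_1 = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
region_2 = [[1, 1, 1, 1]]
d_intra = intra_region(region_1)
d_inter = inter_region(region_1, region_2)
```

In this toy example the first region contains two distinct genotypes that differ at 2 loci, and they differ from the genotype of the second region at 4 and 2 loci, giving an inter-region average of 3.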

3. Results and discussions

In this section, we present our simulation results. We first focus our attention on the mean population fitness, the genome diversity and the average pairwise differences among genomes. Our results are averaged over different runs (sample average), each sample consisting of four habitats, except when explicitly stated otherwise.

Fig. 5 shows a typical evolutionary trajectory of a population evolving on a system of linear size L = 100 which holds four distinct regions of the same size (single run). The figure displays snapshots of the population at four distinct times: t = 2000, t = 4000, t = 6000 and t = 8000 generations. The different colors denote distinct genome configurations. In all four snapshots, it is possible to distinguish the habitats and their border lines, since different genotypes occupy the different habitats. Additionally, the individuals which are dominant in a given habitat appear unlikely to invade other places. This seems to indicate the emergence of reproductive isolation.

Fig. 6 presents typical (single run) trajectories for the evolu-tion of the mean population fitness (left panel), which is defined asW = (1/M)

∑Mi=1Wi, and the genome diversity (right panel). Typi-

cally, the mean population fitness grows rapidly in the earlier stages

bove the other and the number of mismatches (loci in red) is counted. This is donemes of different habitats (inter-region). For the pairs illustrated in this figure, the

2550 P.R.A. Campos et al. / Ecological Modelling 221 (2010) 2546–2554

Fig. 5. Snapshots of the time evolution of the population. The different colors represent distinct genotype configurations. The parameter values are L = 100, N = 64, K = 3 andU = 0.001. The number of distinct habitats is Nr = 4. The snapshots were taken at generations t = 2000 (a), t = 4000 (b), t = 6000 (c) and t = 8000 (d).

Fig. 6. Typical curves for mean population fitness (left panel) and diversity (right panel) as a function of time. The parameter values are L = 100, � = 50, N = 64, K = 3 andmutation rate U = 0.001 and �2 = 0.0005. The line is a eye guide for the 1/t rate of growth of the fitness.

P.R.A. Campos et al. / Ecological Modelling 221 (2010) 2546–2554 2551

Fig. 7. Average pairwise difference as a function of time. The solid symbols denotethe intra-region measurements and open symbols represent the inter-regions mea-surements. For the intra-region case, the different symbols denote different habitats(in this simulation, there are four habitats). For the inter-region case, each sym-bT�

owctBpctagltrteetasswvsafc

btEtagagF

Fus(

of speciation (Rosindell and Cornell, 2007; Chave et al., 2002).Opposed to the previous scenario, now the exponent z is largerin the regime of large areas than their corresponding values in theregime of intermediate habitat sizes. Therefore, the model concil-

ol denote a different pair of habitats (in this case, 6 pairs of distinct habitats).he parameter values are L = 100, � = 50, N = 64, K = 3, mutation rate U = 0.001 and2 = 0.005. In the simulations, averages over 50 independent runs were taken.

f the adaptive process, but for longer times its rate of growth dropsith 1/t, as indicated by the guide eye line in the figure. This is a

ommon feature of complex fitness landscapes, and this is preciselyhe situation in NK fitness landscapes (Sibani and Pedersen, 1999).ecause the initial population starts from a single genome as theopulation founder, the genome diversity increases abruptly as aonsequence of the sources of variation, mutation and recombina-ion. After reaching a maximum, it drops before fluctuating aroundgiven value (saturation). The figure shows the results for times

oing up to 10000, but we did verify that the same trends hold foronger times. These fluctuations occur steadily since during selec-ive sweeps (appearance of a better adapted individual) diversity iseduced, and then increases after the fixation of the mutant. Thoughhis fixation is local, restricted to each habitat. The static view ofcosystems where diversity and other features do not change afterquilibrium is incorrect, as pointed out by Hubbell in his neutralheory of biodiversity and biogeography (Hubbell, 2001). Actu-lly, natural communities exhibit strong fluctuations owing to thetochastic character of the ecological processes. The present modelhares this same perspective of evolution as a dynamical process,here diversity and rate of growth of fitness (not the fitness itself)

aries around a mean value in time. Obviously, for infinite time theystem will eventually reach a state where the fitness also fluctu-tes around a mean value. Nevertheless, the time scale requiredor this regime is not of interest for natural populations and is notomputationally achievable.

We now turn our attention to the average pairwise differencesetween the different genomes. In Fig. 7 we observe a clear dis-inction between the intra-region and inter-regions measurements.ach set of points denotes the measurement between pairs of habi-ats. The intra-region measurements indicate that the divergencemong individuals within any given habitat is about 3% of theenome size (N = 64), while the inter-regions measurements showdivergence of about 40% of the genome size. This level of diver-

ence, together with the reproductive isolation (characterized inig. 5), is a signature that speciation has occurred.

The next subject we tackle is the species-area relationship. In

Fig. 8 we plot the species diversity versus area for different values of the epistatic parameter K and mutation rate U = 0.001. Here, species diversity and genomic diversity are treated as the same (point mutation assumption; Hubbell, 2001; Rosindell and Cornell, 2007). Eq. (1) fails to fit the simulation data adequately, but the results are compatible with a crossover behavior: two distinct power laws are observed, one in the range of small and intermediate system sizes and a second one for large areas. The results also reveal the dependence of the diversity-area relation on the epistatic parameter K: larger values of K imply smaller values of the exponent z and lower diversity. This is seen in both regimes of the power-law relation. Additionally, the exponent z in the regime of large areas is always smaller than the corresponding exponent in the regime of small and intermediate areas. Furthermore, the values of the exponents lie within the range measured in real ecosystems (Begon et al., 2006). On the other hand, when a smaller mutation rate, U = 0.0001, is considered (Fig. 9), a more complex scenario emerges. Small systems now follow a different scaling, and the system presents three phases. A similar triphasic pattern has also been observed empirically (Rosenzweig, 1995), and mainly in theoretical neutral models [...] conciliates the results obtained from neutral evolution models with the presence of natural selection.

Fig. 8. Species diversity, S, versus area, A. The parameter values are N = 64, Nr = 4, mutation rate U = 0.001 and σ² = 0.0005. The different symbols denote distinct values of the epistatic parameter K: K = 1 (solid circles), K = 3 (empty circles), K = 5 (triangles up) and K = 8 (stars). The numbers indicate the slopes of the dashed lines.

Fig. 9. Species diversity, S, versus area, A. The parameter values are N = 64, Nr = 4, mutation rate U = 0.0001 and σ² = 0.0005. The different symbols denote distinct values of the epistatic parameter K: K = 1 (solid circles), K = 3 (empty circles), K = 5 (triangles up) and K = 8 (stars). The numbers indicate the slopes of the dashed lines.

Fig. 10. Left panel: average pairwise difference as a function of the number of distinct regions Nr. The parameter values are L = 300, N = 64, U = 0.001 and σ² = 0.0005. The different symbols denote distinct values of the epistatic parameter K: K = 1 (solid circles), K = 3 (empty circles) and K = 8 (triangles up). Right panel: the mean population fitness after the population has evolved for 10,000 generations as a function of Nr. The parameter values are the same as in the left panel.
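As an illustration of how slopes such as those quoted in Figs. 8 and 9 can be estimated, the following minimal Python sketch fits a power law S = cA^z separately below and above a crossover area, using least squares on log-log axes. The function names, the crossover area and the synthetic data are assumptions introduced here for illustration only; they are neither the simulation data nor necessarily the fitting procedure used in this work.

```python
import numpy as np


def fit_power_law(area, diversity):
    """Least-squares fit of log S = log c + z log A; returns (z, c)."""
    log_a = np.log(np.asarray(area, dtype=float))
    log_s = np.log(np.asarray(diversity, dtype=float))
    z, log_c = np.polyfit(log_a, log_s, 1)  # slope first, then intercept
    return z, np.exp(log_c)


def fit_two_regimes(area, diversity, crossover_area):
    """Fit one power law below and one above a chosen crossover area."""
    area = np.asarray(area, dtype=float)
    diversity = np.asarray(diversity, dtype=float)
    small = area <= crossover_area
    large = area >= crossover_area
    return (fit_power_law(area[small], diversity[small]),
            fit_power_law(area[large], diversity[large]))


# Hypothetical, noise-free data with a crossover at A = 1024.
# The exponents 0.6 and 0.3 are invented for the example.
areas = np.array([16, 64, 256, 1024, 4096, 16384, 65536], dtype=float)
a_cross = 1024.0
synthetic_s = np.where(
    areas <= a_cross,
    2.0 * areas ** 0.6,
    2.0 * a_cross ** (0.6 - 0.3) * areas ** 0.3,  # continuous at the crossover
)

(z_small, c_small), (z_large, c_large) = fit_two_regimes(
    areas, synthetic_s, a_cross
)
print(f"small/intermediate areas: z = {z_small:.3f}")
print(f"large areas:              z = {z_large:.3f}")
```

In practice the crossover area would itself have to be chosen, for instance by inspecting the log-log plot, and the quality of each fit should be checked before the exponents are compared across values of K.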

The average pairwise distance among those species that have reached a population of at least 50 individuals is shown, as a function of the habitat diversity Nr = M/Aℓ, in the left panel of Fig. 10. The quantity is a peaked function of Nr, with its maximum around Nr = 40 for a lattice of side 300. For small habitat diversity, the average pairwise distance is small because there are not many different habitats in which speciation can develop. The drop of the average pairwise distance at large habitat diversity is due to the finiteness of the genome length and also to a relatively high production of hybrid individuals at the interfaces between habitats. At very large habitat diversity it is not possible to produce a great genome diversity, exploring more regions of the genotype space, while at the same time expecting that no correlation exists between the genomes. The position of the peak depends on the lattice size. For instance, for a lattice of linear size L = 100 the maximum occurs around Nr = 16. An interesting feature is the dependence of the average pairwise distance on the epistatic parameter K: it is smaller for greater values of K. This indicates that fewer mutations are needed to enable adaptation to a novel environment when epistasis among genes is stronger. The right panel of Fig. 10 shows how the mean population fitness varies with the habitat diversity. In this example, measurements were taken after the population had evolved for 10,000 generations. The mean population fitness, W, is a monotonically decreasing function of Nr, and is larger for greater values of K. The reason for this monotonic decrease is that for larger habitat diversity the number of individuals in each habitat is smaller, since the total population (the size of the lattice) is the same. Consequently, the probing of the fitness space becomes less efficient. This picture is reinforced by Fig. 11 (see discussion below). It is also noteworthy that epistasis plays an important role here, since it can enable the population to reach higher peaks of the fitness landscape more rapidly.
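The quantity discussed above can be sketched in a few lines of Python. The example assumes binary genomes and uses the Hamming distance as the pairwise distance; both choices, as well as the function names, are assumptions made here for illustration and may differ from the actual implementation. Only genotypes carried by at least 50 individuals are retained, following the threshold stated in the text. A small helper also illustrates the relation Nr = M/Aℓ between habitat diversity and habitat area.

```python
from collections import Counter

import numpy as np


def average_pairwise_distance(genomes, min_count=50):
    """Mean Hamming distance between distinct genotypes that are
    carried by at least `min_count` individuals."""
    counts = Counter(map(tuple, genomes))
    species = np.array(
        [g for g, c in counts.items() if c >= min_count], dtype=int
    )
    n = len(species)
    if n < 2:
        return 0.0
    # dist[i, j] = number of loci at which genotypes i and j differ
    dist = (species[:, None, :] != species[None, :, :]).sum(axis=2)
    upper = np.triu_indices(n, k=1)  # each unordered pair counted once
    return float(dist[upper].mean())


def habitat_area(total_population, n_regions):
    """Area of one habitat from Nr = M / A_l, i.e. A_l = M / Nr."""
    return total_population / n_regions


# Hypothetical toy population: three abundant genotypes and one rare one.
population = (
    [[0, 0, 0, 0]] * 50
    + [[1, 1, 0, 0]] * 50
    + [[1, 1, 1, 1]] * 50
    + [[0, 1, 0, 1]] * 10  # below the threshold, ignored
)
mean_distance = average_pairwise_distance(population)
area_per_habitat = habitat_area(300 * 300, 40)
```

For the toy population the three retained genotypes differ at 2, 4 and 2 loci, so the mean distance is 8/3; for a lattice of side 300 with Nr = 40, each habitat covers 2250 sites.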

For the sake of completeness, Fig. 11 shows again the dependence of the mean population fitness on habitat diversity for a population of constant size M = 300 × 300, but now we compare the results with the mean fitness of a single population of size Aℓ = ℓ × ℓ placed in a homogeneous environment. The comparison shows that the adaptation values are very similar, which means that the mean fitness reached by an entire population placed in a heterogeneous environment is nearly equal to the level of adaptation reached by homogeneous populations of size comparable to that of a single habitat. In this way, the subpopulations in the network evolve almost independently of each other, reinforcing our assertion that speciation has in fact occurred. Furthermore, each subpopulation in a given habitat is basically exploring a different subset of the genotype space.

Fig. 11. Mean population fitness as a function of the number of distinct regions Nr (solid circles). The parameter values are L = 300, N = 64, K = 3, U = 0.001 and σ² = 0.0005. The empty circles represent the mean fitness of a single homogeneous population of size Aℓ, which corresponds to the size of each habitat.

4. Conclusions

We have proposed an agent-based model to study ecological speciation events in structured and heterogeneous environments where physical barriers do not exist. The model comprises important factors of natural ecological niches, such as natural selection, sexual reproduction and mutations, as opposed to standard approaches to speciation modelling, where neutrality is generally assumed (Pigolotti and Cencini, 2009; de Aguiar et al., 2009; Durrett and Levin, 1996; Rosindell and Cornell, 2007). Besides the selective pressure, another important feature has been considered in the model: epistasis. Indeed, it has recently been claimed that epistasis is a key factor for the occurrence of genetic divergence, hybrid sterility and inviability (Presgraves, 2007; Turelli et al., 2001).
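To make the role of the epistasis parameter K concrete, the sketch below implements the standard Kauffman NK landscape, in which the fitness contribution of each locus depends on its own state and on the states of K other loci. This is only a generic illustration: the present model uses a variation of the NK model that is not reproduced here, and the choice of adjacent neighbours with periodic boundaries, the uniform random contributions and all names in the code are assumptions.

```python
import numpy as np


class NKLandscape:
    """Standard NK landscape with adjacent, periodic neighbourhoods."""

    def __init__(self, n, k, seed=0):
        if not 0 <= k < n:
            raise ValueError("k must satisfy 0 <= k < n")
        rng = np.random.default_rng(seed)
        self.n, self.k = n, k
        # table[i, s]: contribution of locus i when the (k + 1) loci it
        # depends on are in joint state s (encoded as an integer)
        self.table = rng.random((n, 2 ** (k + 1)))
        self.powers = 2 ** np.arange(k + 1)
        # locus i depends on itself and on the next k loci (periodic)
        self.neighbours = (
            np.arange(n)[:, None] + np.arange(k + 1)[None, :]
        ) % n

    def contributions(self, genome):
        g = np.asarray(genome, dtype=int)
        states = g[self.neighbours] @ self.powers
        return self.table[np.arange(self.n), states]

    def fitness(self, genome):
        """Fitness is the average of the n locus contributions."""
        return float(self.contributions(genome).mean())


# Hypothetical usage: a single mutation changes K + 1 contributions.
landscape = NKLandscape(n=8, k=2, seed=1)
wild_type = np.zeros(8, dtype=int)
mutant = wild_type.copy()
mutant[3] = 1
affected = np.flatnonzero(
    landscape.contributions(wild_type) != landscape.contributions(mutant)
)
```

With K = 0 every locus contributes independently and the landscape is smooth; as K grows, a single mutation alters the contributions of more loci and the landscape becomes more rugged.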

The model has been successful in generating genetic divergence by considering heterogeneity among the different habitats, which is modelled by assuming variability in selection pressure. More importantly, it is also clear that this genetic divergence comes together with sexual isolation. The hybrids, those offspring generated from parents inhabiting different habitats, are usually not able to survive viability selection. This effect is enhanced for stronger epistasis, where a smaller genetic divergence is needed in order to ensure sexual isolation. Diversity within each habitat also exists and is robust over time. Hence the results corroborate the scenario in which speciation events become more likely under very strong epistasis. This fact reinforces even further speciation as a possible outcome of the evolutionary process, since sexual reproduction selects for negative epistasis (Azevedo et al., 2006).

The species-area relationship has also been addressed and its dependence on the parameters of the model has been investigated. As expected, larger mutation rates are able to sustain larger levels of diversity. On the other hand, epistasis acts in the opposite direction and tends to reduce the genotype diversity. In the cases studied, a power-law dependence for the species-area relationship has been found in the range of intermediate and large areas. Several distinct pictures of the species-area relationship have been presented in the literature. As observed by other authors (Rosenzweig, 1995; Pigolotti and Cencini, 2009; Rosindell and Cornell, 2007; Chave et al., 2002), in the regime of intermediate areas a power-law-like relation between species diversity and area is found. Our simulations show that the exponent describing the power law depends on the epistasis parameter K, being smaller for larger K. The values of z are within the range observed in natural ecosystems. The mutation rate can also qualitatively influence the species-area relationship, as shown in Figs. 8 and 9. For intermediate and large values of the mutation rate, two distinct regimes appear, whereas for low mutation rates the species-area relationship is better described by a triphasic picture. We also notice that in the former case the values of the exponent z are smaller in the range of large areas than in the range of intermediate sizes, while the opposite occurs in the triphasic scenario. Consequently, this work takes a step towards reconciling the most relevant and established evolutionary forces in a theoretical framework for studying the problem of speciation, namely, the triphasic picture of the neutral model and the biphasic behavior that appears when selection plays its role.

In future work we wish to investigate the role of selective correlation among the habitats, so as to produce a smooth transition from a completely heterogeneous model, as considered here, to one where the environment is homogeneous. All previous attempts have failed to produce biodiversity in a homogeneous environment subject to natural selection. On the other hand, the model we introduced works at the other extreme, where the habitats are completely uncorrelated. Certainly, in real ecosystems a more gradual transition between environments is more commonly found. Our belief is that this intermediate assumption on the environmental conditions can also harbor enough diversity in a given range of correlation.

Acknowledgements

The authors are supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). PRAC and VMO acknowledge financial support from the program PRONEX/MCT-CNPq-FACEPE and Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco (FACEPE). AR is thankful to NANOBIOTEC/CAPES for financial support.

References

Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., Lipman, D., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25 (17), 3389–3402.

Arrhenius, O., 1921. Species and area. J. Ecol. 9, 95–99.

Azevedo, R.B.R., Lohaus, R., Srinivasan, S., Dang, K.K., Burch, C.L., 2006. Sexual reproduction selects for robustness and negative epistasis in artificial gene networks. Nature 440, 87–90.

Begon, M., Townsend, C.R., Harper, J.L., 2006. Ecology: from Individuals to Ecosystems. Blackwell Publishing, USA.

Bell, G., Reboud, X., 1997. Experimental evolution in Chlamydomonas. II. Genetic variation in strongly contrasted environment. Heredity 78, 498–506.

Campos, P., Adami, C., Wilke, C., 2003. Optimal adaptive performance and delocalization in NK fitness landscapes (vol. 304, p. 495, 2002). Physica A: Stat. Mech. Appl. 318 (3–4), 637.

Chave, J., Muller-Landau, H.C., Levin, S.A., 2002. Comparing classical community models: theoretical consequences for patterns of diversity. Am. Nat. 159, 1–23.

Chesson, P., 2000. Mechanisms of maintenance of species diversity. Ann. Rev. Ecol. Syst. 31, 343–366.

Christensen, K., di Collobiano, S.A., Hall, M., 2002. Tangled nature: a model of evolutionary ecology. J. Theor. Biol. 216, 73–84.

Connor, E.F., McCoy, E.D., 1979. The statistics and biology of the species-area relationship. Am. Nat. 113, 791–833.

de Aguiar, M.A.M., Baranger, M., Baptestini, E.M., Kauffman, L., Bar-Yam, Y., 2009. Global patterns of speciation and diversity. Nature 460, 384–388.

Drakare, S., Lennon, J.J., Hillebrand, H., 2006. The imprint of the geographical, evolutionary and ecological context on species-area relationships. Ecol. Lett. 9, 215–227.

Durrett, R., Levin, S.A., 1996. Spatial models for species-area curves. J. Theor. Biol. 179, 119–127.

Fath, B.D., Grant, W.E., 2007. Ecosystems as evolutionary complex systems: network analysis of fitness models. Environ. Modell. Softw. 22, 693–700.

Goodnight, C.J., Schwartz, J.M., Stevens, L., 1992. Contextual analysis of models of group selection, soft selection, hard selection, and the evolution of altruism. Am. Nat. 140, 743–761.

Hall, M., Christensen, K., di Collobiano, S.A., Jensen, H.J., 2002. Time-dependent extinction rate and species abundance in a tangled-nature model of biological evolution. Phys. Rev. E 66, 011904.

Holland, J.J., dela Torre, J.C., Clarke, D.K., Duarte, E.A., 1991. Quantitation of relative fitness and great adaptability of clonal populations of RNA viruses. J. Virol. 65, 2960–2967.

Hubbell, S.P., 2001. The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press.

Kassen, R., 2002. The experimental evolution of specialists, generalists and the maintenance of diversity. J. Evol. Biol. 15, 173–190.

Kauffman, S., 1993. Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, Oxford.

Kauffman, S., 1995. At Home in the Universe. Oxford University Press, Oxford.

Kauffman, S., 2000. Investigations. Oxford University Press, Oxford.

Kauffman, S.A., Levin, S., 1987. Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. 128, 11–45.

Kohn, D.D., Walsh, D.M., 1994. Plant species richness – the effect of island size and habitat diversity. J. Ecol. 82, 367–377.

Lennon, J.J., Kunin, W.E., Hartley, S., 2002. Fractal species distributions do not produce power-law species-area relationships. Oikos 97, 378–386.

Lomolino, M.V., 2000. Ecology’s most general, yet protean pattern: the species-area relationship. J. Biogeogr. 27, 17–26.

Molofsky, J., Durrett, R., Dushoff, J., Griffeath, D., Levin, S., 1999. Local frequency dependence and global coexistence. Theor. Popul. Biol. 55, 270–282.

Nielsen, E.E., Hemmer-Hansen, J., Larsen, P.F., Bekkevold, D., 2009. Population genomics of marine fishes: identifying adaptive variation in space and time. Mol. Ecol. 18, 3128–3150.

O’Dwyer, J.P., Green, J.L., 2010. Field theory for biogeography: a spatially explicit model for predicting patterns of biodiversity. Ecol. Lett. 13 (1), 87–95.

Ohta, T., 1997a. The meaning of near-neutrality at coding and non-coding regions. Gene 205 (1–2), 261–267.

Ohta, T., 1997b. Role of random genetic drift in the evolution of interactive systems. J. Mol. Evol. 44, S9–S14.

Parsek, M.R., Singh, P.K., 2003. Bacterial biofilms: an emerging link to disease pathogenesis. Annu. Rev. Microbiol. 57, 677–701.

Pigolotti, S., Cencini, M., 2009. Speciation-rate dependence in species-area relationship. J. Theor. Biol. 260, 83–89.

Plotkin, J.B., Potts, M.D., Yu, D.W., Bunyavejchewin, S., Condit, R., Foster, R., Hubbell, S., LaFrankie, J., Manokaran, N., Lee, H., Sukumar, R., Novak, M.A., Ashton, P.S., 2000. Predicting species diversity in tropical forests. Proc. Natl. Acad. Sci. U.S.A. 97, 10850–10854.

Presgraves, D.C., 2007. Speciation genetics: epistasis, conflict and the origin of species. Curr. Biol. 17, R125–R127.

Rosenzweig, M.L., 1995. Species Diversity in Space and Time. Cambridge University Press, Cambridge.

Rosindell, J., Cornell, S.J., 2007. Species-area relationships from a spatially explicit neutral model in an infinite landscape. Ecol. Lett. 10, 586–595.

Sibani, P., Pedersen, A., 1999. Evolutionary dynamics in terraced NK fitness landscapes. Europhys. Lett. 48, 346–352.

Sousa-Neves, R., Rosas, A., 2010. An analysis of genetic changes during the divergence of Drosophila species. PLoS ONE 5 (5), 10485.

Swanack, T.M., Grant, W.E., Fath, B.D., 2008. On the use of multi-species NK models to explore ecosystem development. Ecol. Modell. 218, 367–374.

Turelli, M., Barton, N.H., Coyne, J.A., 2001. Theory and speciation. TREE 16, 330–343.

Weinberger, E., 1991. Local properties of Kauffman N-kappa model – a tunably rugged energy landscape. Phys. Rev. A 44 (10), 6399–6413.

Welch, J.J., Waxman, D., 2005. The NK model and population genetics. J. Theor. Biol. 234, 329–340.

Williams, C.B., 1964. Patterns in the Balance of Nature. Academic Press, London.

Wolf, J.B., Brodie, E.D., Wade, M.J., 2000. Epistasis and the Evolutionary Process. Oxford University Press, USA.