The Need for Bioinformatics in Evo-Devo



Articles

Two of the most exciting and difficult challenges in biology are to understand how morphology, or anatomical form, is created in development and to determine how such developmental morphology is transformed in evolution. These questions unite the emergent field of “evo-devo,” the interdisciplinary study of evolution and development, which has become one of the most rapidly moving fields in biology. A number of excellent reviews and books document the history, knowledge base, and syntheses in this field (Gilbert 2003, Minelli 2003, Holland 2004, Carroll 2005). As evo-devo scientists sharpen their focus on comparing developmental morphology, it becomes more obvious that the comparisons must be “read” within a phylogenetic framework (Baum et al. 2005). In fact, tracing the pattern of evolutionary transformation of the developing phenotype against the evolutionary tree of life defines a significant problem set for evo-devo (Wagner and Larsson 2003). In addition, the impact of various developmental mechanisms (Arthur 2004) can be assessed against the pattern defined by the tree of life. Explaining the pattern of evolutionary change in morphology in an integrative fashion requires a systems approach: synthesizing knowledge from various biological levels encompassed in development (e.g., DNA sequence, regulatory networks, cell–cell and tissue interactions), evolution (e.g., phylogenetic relationships, population level), and ecology.

The critical need to relate genotypic to phenotypic data, and the difficulty of that endeavor, is echoed in many recent papers (Hall 2003, Smith 2003, Irish and Benfey 2004, Kuratani 2004). Despite the battery of tools being used to address the question of how an organism is built (i.e., how the developing phenotype, the observable expression of the genotype, is created in development), the connection between genotype and phenotype remains poorly understood. Large-scale genome sequencing has resulted in a “parts catalog” of complete information about what is in the genome, and yet the “user’s manual,” the genetic and molecular basis and rules by which morphology is assembled, is poorly understood (Kitano 2002). It is a gargantuan leap to go from genotype to phenotype using tools from developmental genetics. Progress is being made by chipping away at genotype–phenotype relationships in the context of developmental genetics of model species, and another jump, aided by tools of phylogenetics, has been taken to connect development to the directionality and timing of evolutionary change. But it is the tools of computation, of bioinformatics and ontologies (i.e., structured

Paula M. Mabee (e-mail: [email protected]) is an evo-devo biologist at the University of South Dakota, Vermillion, SD 57069. She studies skeletal development and evolution in fishes. © 2006 American Institute of Biological Sciences.

Integrating Evolution and Development: The Need for Bioinformatics in Evo-Devo

PAULA M. MABEE

This article is an overview of concepts relating to the integration of the genotype and phenotype. One of the major goals of evolutionary developmental biology, or evo-devo, is to understand the transformation of morphology in evolution. This goal can be accomplished by synthesizing the data pertaining to gene regulatory networks and making use of the increasingly comprehensive knowledge of phylogenetic relationships and associated phenotypes. I give several examples of recent success in connecting these different biological levels. These examples help illuminate the “black box” between genotype and phenotype and illustrate a few of the technical and bioinformatic challenges ahead. The key concept of modularity unites genetic, developmental, and evolutionary approaches, because modules are the units of evolution. Primitive and derived network modules interact in development to create the phenotype. To obtain a systems-level understanding of evolutionary phenotypic change, bioinformatics approaches involving ontologies need to be applied, and new methods of visualization need to be developed.

Keywords: bioinformatics, genetics, modularity, evolution, development

www.biosciencemag.org April 2006 / Vol. 56 No. 4 • BioScience 301

Downloaded from https://academic.oup.com/bioscience/article/56/4/301/228984 by guest on 20 July 2022

vocabularies that make databases interoperable), that are now required to merge the biological levels of knowledge. With improved methods of data mining and analysis, evolutionary questions can be answered with the desired mechanistic depth. The purpose of this article is to explain a key concept, namely modularity, that spans the gap from genotype to phenotype; to describe a few of the recent successes in connecting genotype and phenotype; and to champion the need for bioinformatics tools to facilitate more rapid and higher-level progress.

Modularity

Modules are a piece of common conceptual ground for evolutionary and developmental biologists because they span and unite biological levels. Biologists working at different levels, however, have different conceptions of modules: Geneticists conceive of modularity in terms of gene organization and function, developmental biologists consider the modularity of gene regulatory networks, and many evolutionary biologists think in terms of phenotypic “characters.” Modularity at the morphological level is somewhat intuitive: An organism is composed of discrete parts, and these parts are semi-independent in development (figure 1) and evolution. Morphological modules as entities are assumed to have a developmental genetic integrity, and as such they are individuated (Wagner 1989, Geeta 2003).

The limb bud is an excellent example of modularity at both morphological and genetic levels. The paired limbs have a clear morphological identity and a limited degree of connectivity to the rest of the body at the morphological level. At a genetic level, the regulatory cascades that control limb development (Tabin et al. 1999) constitute a patterning module that is conserved between forelimbs and hindlimbs. Other examples of morphological modules include actinopterygian fish fins (Mabee et al. 2002), the mouse mandible (morphometrically partitioned into submodules; Klingenberg et al. 2003), and bird feathers (hierarchically nested into metamodules in development; Prum and Dyck 2003).

Morphological modules, inasmuch as they represent units of homology, may be considered equivalent to the morphological characters of phylogenetic systematics. Evolutionary characters are the homologous structures that vary across species, and multiple characters are analyzed together to generate phylogenies. Although systematists do not have “modularity” as a working concept and evo-devo biologists do not think in terms of “characters,” it is appropriate to consider these terms equivalent, because both refer to an individuated entity that is genetically determined, homologous, and maintained across taxa. In working terms, because so little is understood about the modules underlying the phenotype, systematists cannot use them as characters to infer phylogeny at this time. As details about how modules are integrated are better understood, these data can be added to phylogenetic analyses.

Underlying modularity at the morphological level, and responsible for it, are integrated developmental networks (see Schlosser and Wagner 2004). Modules have been defined, in fact, as “networks of interacting elements behaving as relatively independent units of development or function” (Schlosser and Thieffry 2000). At the level of gene interactions, modules may range from interactions among a few genes to large-scale genetic networks, such as the complex regulatory gene network for endomesoderm specification in the sea urchin (figure 2; Davidson et al. 2002). This complex regulatory network underlies the specification of a single cell type (endomesoderm). Clearly, the development of even a single aspect of morphology, which is controlled by many connected networks of regulatory genes, must be staggeringly complex. Interestingly, though, the complexity of body plan development is said to be matched by that of some human-engineered systems (Csete and Doyle 2002). A Boeing 777 generates a volume of information similar to that in the human genome every minute: 150,000 modules, organized through protocols into networks (Csete and Doyle 2002). Genetic modules, or “cassettes” of genes used to effect a common function (Rudel and Sommer 2003), interact with one another to ultimately specify phenotypes. Thus, modules that themselves interact, that vary semi-independently, and that together specify a modular morphology (figure 1) are an important link between genotype and phenotype. One might say that such molecular modules, the gene interactive networks and their connecting protocols or rules (Ravasz et al. 2002), are the building blocks of development and evolution.

The network properties of development are increasingly discussed in the literature of development and of bioinformatics (Ravasz et al. 2002, Qin et al. 2003, Anholt 2004, Cork and Purugganan 2004). Development of the body plan is controlled by large networks of regulatory genes, and it essentially flows from the correct temporal and spatial transcription of such genes, which require the correct use of the instructions encoded in cis-regulatory DNA (Stern 2000, Revilla-i-Domingo and Davidson 2003). Evolution of phenotype occurs through changes in these networks. Because gene regulatory networks underlie the processes of both development and evolution, unraveling their architecture in appropriately chosen species will be the key to understanding how genomes control development and how they evolve (Revilla-i-Domingo and Davidson 2003). The discovery of gene transcription modules from the rapidly accumulating gene expression data has been aided recently by new analytical methods in bioinformatics (Lee TI et al. 2002, Lee I et al. 2004, Kloster et al. 2005). Uncovering the modular architecture of the transcriptional networks that comprise developmental programs (Rast 2003) and understanding the developmental and evolutionary ramifications of their topology are critical new areas in evo-devo.

Figure 1. The black box between genotype and phenotype is filled here with the modules (represented by colored cylinders) that function in development to create morphology (more generally, the phenotype). Blue and red indicate the evolutionary status of the modules, blue for primitive and red for derived. Primitive and derived modules interact here to form the typical mosaic of primitive and derived features (characters or modules) that comprise an organism.

Underlying the phenotype and the genetic networks is another level of modularity at the genetic regulatory level. The modularity of these genetic regulators or “switches” (Carroll 2005) is perhaps underappreciated by those working at higher levels. The cis-regulatory DNA of many genes is organized into independent modules that direct or repress transcription in specific tissues at particular times in development. The modules themselves are constructed from multiple binding sites for individual transcription factors (Arnone and Davidson 1997). Modular regulatory switches are used for building modular animals (Carroll 2005), and switches are thus the critical connection to evolution: Such regulation permits evolutionary change to occur in one part of a structure, independent of other parts. Ultimately, modularity at the genetic level of regulation is the secret to modularity at the morphological level, and modularity is the key to building complexity (Carroll 2005), as evidenced by the overwhelming number of evolutionary modifications that consist of specializing, tinkering with, and modifying the number of existing modular parts.

Figure 2. Regulatory gene network for endomesoderm specification (reproduced with permission from Eric Davidson; http://sugp.caltech.edu/endomes/). Forty genes are currently known to be involved in specifying this single cell type (Davidson et al. 2002). Each short horizontal line from which a bent arrow extends to indicate transcription represents the cis-regulatory element that is responsible for expression of the gene named in the domain shown. The architecture of the network is based on perturbation and expression data, on data from cis-regulatory analyses for several genes, and on other experiments referenced at the Web site cited above.

How do modules evolve?

Modularity has been argued to be the most critical aspect of order in living organisms and their ontogenies, and the attribute that most strongly facilitates evolution (Raff 1996). Modularity brings together development and evolution in several respects (Schlosser and Wagner 2004). First, we know that in development, the modularity of gene regulation is translated into semi-independent genetic regulatory networks, and these specify the discrete features of the phenotype. Until the advent of molecular systematics, such features, or characters of the phenotype, were almost exclusively used by systematists to infer phylogenetic relationships. The inferential methods of phylogenetic systematics assume character independence, and the likelihood of inferring false relationships is increased by correlated features. Formulation of a character thus might be seen as equivalent to delineating a phenotypic module that has been maintained and modified by evolution. Essentially, developmental modularity underlies the evolutionary modularity of systematic characters. Because at some level the entire phenotype is correlated, the empirical problem becomes that of determining levels of correlation. With enough developmental data on the underlying networks and appropriate informatics, character dependencies can be evaluated, and these can be built into the methods of analysis. This is a nontrivial undertaking that relies on heavy use of phylogenetic and developmental model databases, interoperability, tools, and other programmatic infrastructure.

Phylogenetic information about how individual species are related is necessary for determining which modules are ancestral and which are derived (Maddison and Maddison 1992), for learning whether a new morphology is due to the gain or loss of a module, and for finding out whether a module has been derived once or many times. Development can be represented as a mosaic of primitive and derived developmental modules interacting to form primitive or derived morphological features. Phylogenetic hypotheses are thus critical in interpreting the patterns of evolution of modules at any hierarchical level. Without a phylogenetic framework, even the most basic questions cannot be framed in a meaningful way. Applying phylogenetic methods to gene networks and other modules will yield answers to vital questions in the field of evo-devo (box 1). Fortunately, a great deal of progress has been made toward understanding the phylogenetic relationships of major animal and plant groups, as well as the relationships within many smaller clades.
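How a tree lets one ask whether a module was gained or lost, and how many times, can be illustrated with a minimal sketch of Fitch parsimony for a single presence/absence character. This is an illustration under stated assumptions rather than a method used in this article: the tree shape, the taxon names A through E, and the scores are all hypothetical, and a real analysis would rely on dedicated phylogenetic software and many characters.

```python
def fitch(tree, states):
    """Return (state set at this node, minimum number of changes below it).

    tree   -- a leaf name (str) or a 2-tuple of subtrees (binary tree)
    states -- dict mapping each leaf name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical five-taxon tree; 1 = module present, 0 = module absent.
tree = ((("A", "B"), "C"), ("D", "E"))
module_present = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 1}

root_states, changes = fitch(tree, module_present)
print(root_states, changes)
```

With these invented scores the root set is {1} and two changes are required, which would be read as a module that was present ancestrally and lost independently in B and in D. The same bookkeeping, applied to real taxa and real module data, is what distinguishes a single origin from repeated gain or loss.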

Ontologies: Connecting evolutionary and developmental anatomy

The phenotype figures prominently in the work of developmental biologists, who place major emphasis on connecting gene expression and function to phenotypic effects. Mutagenesis (i.e., isolating single gene mutants that have discrete effects on body pattern) has been a primary approach in understanding normal development and in identifying the body-patterning and “toolkit” genes of evolution (Greenspan 2001, Carroll 2005). The control gene eyeless, for example, was identified from the mutants without eyes in Drosophila. Other approaches include the removal or misexpression of a gene product during development, followed by examination of the effects on the resultant phenotype (Stern 2000). The data resulting from these approaches are voluminous, and discovering the complex gene networks that connect sequence to phenotype will require long-term bioinformatics efforts and new tools.

Ontologies, or controlled vocabularies, are one approach that is used to connect databases. Ontologies formally represent hierarchical relationships between defined biological concepts, such that the vocabularies can be used by both humans and computers to exchange and explore information (Holloway 2002, Blake 2004). Controlled vocabularies in the form of words or phrases are used in ontologies to bring different meanings or synonyms together under a single clear term; free text is avoided, and the data can be computed. Bio-ontologies clarify scientific discussions by providing a shared vocabulary to communicate results and to explore data. They also make data exploration, inference, and data mining computationally possible.

The development of ontologies was instigated by molecular biologists to promote collaborations between the medical informatics and bioinformatics ontology communities (Holloway 2002). The Gene Ontology (GO) project (Harris et al. 2004) is perhaps the most widely known in the molecular world, consisting of structured, controlled vocabularies (more than 17,500 terms) and classifications that cover several domains of molecular and cellular biology (www.geneontology.org). It consists of three ontologies at different hierarchical levels (molecular function, biological process, and cellular component attributes of gene products). These ontologies are orthologous (relatable) to one another and thus interoperable (i.e., they can work together). Use of the GO is widespread, and many biological resources, such as UniProt (Apweiler et al. 2004) and Protein Data Bank, annotate their data in GO terms (Wolstencroft et al. 2005). The result is that by searching for all proteins associated with a particular GO term, data from many different sources can be retrieved automatically and efficiently. A single GO term has associated with it all known synonyms, and thus all related terms are searched simultaneously.
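The retrieval behavior described above (synonyms collapsing onto one term, and one query reaching annotations from several sources) can be sketched with a toy controlled vocabulary. Everything named here is an assumption made for illustration: the `Ontology` class, the term labels, the two databases, and the gene products are hypothetical and do not reproduce the GO data model or any real annotation.

```python
from collections import defaultdict


class Ontology:
    """A toy controlled vocabulary with synonyms and is_a relationships."""

    def __init__(self):
        self.children = defaultdict(set)   # term -> direct is_a children
        self.label_to_term = {}            # lower-cased name or synonym -> term

    def add_term(self, term, parents=(), synonyms=()):
        for label in (term, *synonyms):
            self.label_to_term[label.lower()] = term
        for parent in parents:
            self.children[parent].add(term)

    def resolve(self, label):
        """Map a preferred name or any synonym to the single preferred term."""
        return self.label_to_term[label.lower()]

    def descendants(self, term):
        """Return the term plus everything below it in the is_a hierarchy."""
        found, stack = {term}, [term]
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


def search(ontology, annotations, label):
    """Retrieve annotated items from every source for a term and its subtypes."""
    wanted = ontology.descendants(ontology.resolve(label))
    return sorted((source, item) for source, item, term in annotations
                  if term in wanted)


# Hypothetical terms, synonyms, databases, and gene products.
onto = Ontology()
onto.add_term("appendage development", synonyms=["appendage formation"])
onto.add_term("fin development", parents=["appendage development"])
onto.add_term("limb development", parents=["appendage development"],
              synonyms=["limb formation"])

annotations = [
    ("database_one", "product_a", "fin development"),
    ("database_two", "product_b", "limb development"),
    ("database_two", "product_c", "eye development"),
]

results = search(onto, annotations, "Appendage Formation")
print(results)
```

Querying by the synonym returns the fin and limb annotations from both hypothetical databases and omits the unrelated eye annotation. The point of the design is that free text never enters the comparison; every lookup goes through a single preferred term.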


More recently, researchers have developed phenotypic ontologies (Bard 2005) within model organism communities (zebrafish, mouse, fly), motivated by the desire to answer the developmental questions of how phenotype is generated from genes. Phenotypic effects of different treatments for a model species (e.g., mouse) can be compared and analyzed together with genetic sequence data to piece together the underlying genetic architecture. There are now about 15 anatomical ontologies, many of which are linked to organism databases (Bard 2005) (e.g., Zebrafish Anatomical Ontology, http://zfin.org; Drosophila database [Flybase], http://flybase.bio.indiana.edu; and Edinburgh Mouse Atlas Project, http://genex.hgu.mrc.ac.uk). These bio-ontologies of the phenotype (Gkoutos et al. 2004), or “anatomics” (Bard 2005), have


The pattern of phylogenetic history is critical for tests, predictions, and investigations in evolutionary developmental biology, or evo-devo. Patterns of character evolution must be synthesized across the full spectrum of biological levels to answer the major questions below and those subsumed within them. Because of the complexity and volume of these data, new methods of visualization and analysis will be required to fully grasp the patterns of such evolutionary changes.

Developmental network-level questions:

• What is the pattern of evolution of gene regulatory network modules?

• What are the specific changes that have occurred in a particular gene network as it is transformed in evolution, and exactly where have these occurred?

• Where exactly does the remodeling of developmental pathways occur? (cis-acting elements? Protein function?)

• What is the frequency and nature of parallel co-option of genetic networks? (Co-option appears to occur frequently in evolution, as evidenced by the parallel independent co-option of Pax-6, Dll, and tinman to pattern eyes, limbs, and hearts, respectively, in insects and vertebrates; Jockusch and Ober 2004.)

• Developmental pathways may be redeployed in other tissues (heterotopy) or at other developmental times (heterochrony), or both. How often and under what circumstances does this happen?

• What are the specific bases for constraint in gene networks?

• What are the network properties that promote resilience or enhance evolvability? (The interactivity among genes indicates that there might be considerable flexibility in the capacity of the genome to respond to diverse conditions; Greenspan 2001.)

• How is constraint at the morphological level related to that at the network level?

• Which sites in a gene network are most conserved or constrained, and which are most labile?

• What types of changes are most common? (Cork and Purugganan [2004] predict that genes functioning early in a genetic pathway are subject to stronger stabilizing selection than downstream loci, since mutations in these genes are likely to have greater pleiotropic effects and affect all downstream phenotypes.)

Phenotype-level questions:

• What is the developmental basis for the phenotypic characters (modules) of evolution?

• Is there a “signature” modular composition to the morphological characters (modules) of systematics?

• What are the probabilities of different types of changes within modules?

• Are new linkages between submodules made to produce modular characters?

• Are sets of modules (at different or similar biological levels) correlated evolutionarily? (By mapping multiple modules simultaneously, the patterns of character association can be tested for correlation. Levels of correlation can be quantified.)

Systems-level questions (requiring information from ecology and environment as well):

• What accounts for major novelties?

• Generally, what accounts for homology?

• Can we generalize about homoplasy? Why are some characters susceptible to parallel evolution?

• Why are there trends or “metapatterns” in evolution?

Box 1. Evolution and development: Developmental network-, phenotype-, and systems-level questions.


sprung up without significant input from comparative evolutionary morphologists or systematists.

To address many evolutionary questions (box 1), species’ phenotypes must be compared. Unlike the phenotypes contained in developmental databases, the phenotypes of evolutionary biology are not the result of short-term laboratory perturbation or mutation, but rather the result of millions of years of natural mutation. Phenotypic variation across species is much broader than that observed within a mutated model species. To understand the evolutionary changes in the developmental genetic and epigenetic inputs that are involved in building and individuating species, phenotypic ontologies for individual species must be connected to each other, as well as to ontologies at lower biological levels (e.g., GO). Evolutionary comparisons operate at a high tier in systems biology, namely, at the level of continuity (and modification) of the phenotype across the tree of life. Similarity among phenotypes is due to the continuity of inherited information (i.e., homology; Van Valen 1982, Roth 1984). Homology is an evolutionary concept; homologous features are similar because they are inherited from a common ancestor; that is, they are similar because of their common genealogy. Homology, a common ground for developmental and evolutionary biologists, is central to all comparative biology (Bock and Cardew 1999). It is central in phylogeny reconstruction, because characters have to be homologues to achieve a correct hypothesis of evolution using phylogenetic methods. The invisible evolutionary threads connecting organisms are those of continuity, of homology, and thus ultimately ontologies must be connected through homologous features or modules.
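One way to picture ontologies being connected through homologous features is as a table of homology statements that lets a query on a structure in one species reach records for the corresponding structure in another. The sketch below is only that, a picture: the group names, the term labels, the gene labels, and the phenotype records are hypothetical placeholders, and no existing database schema is implied.

```python
# Hypothetical homology statements linking terms from two species-specific
# anatomy vocabularies. Group names and term labels are illustrative only.
HOMOLOGY_GROUPS = {
    "anterior paired appendage": [("zebrafish", "pectoral fin"),
                                  ("mouse", "forelimb")],
    "posterior paired appendage": [("zebrafish", "pelvic fin"),
                                   ("mouse", "hindlimb")],
}

# Hypothetical phenotype records: (species, structure, gene label, observation).
PHENOTYPES = [
    ("zebrafish", "pelvic fin", "gene_x", "reduced"),
    ("mouse", "hindlimb", "gene_x", "shortened"),
    ("mouse", "forelimb", "gene_y", "normal"),
]


def build_index(groups):
    """Map each (species, structure) term to the name of its homology group."""
    index = {}
    for group, members in groups.items():
        for member in members:
            index[member] = group
    return index


def homologues(groups, species, structure):
    """Return every term asserted to be homologous to the given one."""
    group = build_index(groups).get((species, structure))
    return set(groups[group]) if group is not None else set()


def comparable_phenotypes(groups, records, species, structure):
    """Collect phenotype records, from any species, for homologous structures."""
    wanted = homologues(groups, species, structure)
    return [record for record in records if (record[0], record[1]) in wanted]


hits = comparable_phenotypes(HOMOLOGY_GROUPS, PHENOTYPES, "zebrafish", "pelvic fin")
for species, structure, gene, observation in hits:
    print(species, structure, gene, observation)
```

A query that starts from the zebrafish term returns the zebrafish and mouse records for the same homology group and leaves out the forelimb record. The species vocabularies stay separate; only the explicit homology assertions make their contents comparable.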

The developmental genetic basis of convergent morphological change

There is some evidence, and it is becoming almost a central tenet of evo-devo, that the evolution of regulatory elements is the primary source of phenotypic change during the course of evolution. Regulatory elements, including cis elements, enhancers, promoters, and trans regulatory factors, in combination with a host of other proteins that are part of the transcriptional machinery (Brivanlou and Darnell 2002), activate or repress gene transcription. Recent molecular advances in several model systems have demonstrated the correlation of morphological variation with variation in regulatory regions. These regulatory elements appear to be highly flexible, and evolutionary changes in their roles and expression domains seem fairly common. Regulatory mutations in key developmental control genes appear to provide a general mechanism for selectively altering expression in specific structures while preserving expression at other sites required for viability (Carroll 2000, Stern 2000, Tautz 2000). Several recent studies of evolutionary changes in the regulatory elements underlying phenotypic evolution (Gompel and Carroll 2003, Sucena et al. 2003, Shapiro et al. 2004) provide examples of making the much-needed connection from genotype to phenotype. Interestingly, these studies come from examining the development of parallel or convergent similarity within small clades, as opposed to the focus on novel morphologies (e.g., the pentameral echinoderm body plan from a bilateral ancestor; Arenas-Mena et al. 2000, Peterson et al. 2000, Popodi and Raff 2001) that has characterized much of the research in evo-devo.

Convergence describes cases in which similar derived morphologies are produced from different ancestral morphologies, though the terms convergence and parallelism are used interchangeably in several subdisciplines of biology (Wiens et al. 2003), including evo-devo. Regulatory changes are involved in the convergent loss of features (Sucena et al. 2003, Shapiro et al. 2004). The evolutionary modification of “just a few developmental hotspots” (Richardson and Brakefield 2003) may underlie parallel evolutionary changes at the phenotypic level (Richardson and Brakefield 2003, Sucena et al. 2003). The examples below demonstrate such modifications, and they provide an opportunity to describe how advances in bioinformatics can significantly enhance understanding of these processes.

Morphological convergent reduction in pelvic fins of threespine sticklebacks. Marine threespine sticklebacks, Gasterosteus aculeatus, have given rise to freshwater stickleback populations that have evolved complete or partial loss of the pelvic skeleton in less than 10,000 generations. Shapiro and colleagues (2004) identified a major chromosome region controlling loss of pelvic structures in a natural population of sticklebacks. They found an altered pattern of expression for the gene Pitx1 in some tissues but not in others, consistent with a regulatory mutation that disrupts expression in both the prospective pelvic region and the caudal fin. Their data suggest that cis-acting regulatory mutations in Pitx1 are a major cause of pelvic reduction in this rapidly evolving system; the coding regions of Pitx1 are unchanged between populations with and without pelvic fins. In this and many other cases in which evolutionary changes have been traced to regulatory alterations in major developmental control genes, the actual DNA sequences responsible for tissue-specific expression differences are unknown. Regulatory mutations are much more difficult to identify at the molecular level, because the regulatory regions can be located far from the gene of interest. The upshot is that it appears that similar regulatory mutations in key developmental control genes are responsible for the parallel morphological changes in fins during stickleback evolution.

With the appropriate bioinformatics resources in place, researchers could quickly find all of the examples of evolutionary loss of the pelvic fins in fish evolution. They could quickly access and examine images of pelvic fins in the most recent ancestors of the fishes that lost them, and they could see stages (when they exist) in the gradual loss of such structures. The genetic underpinnings of pelvic fin loss in sticklebacks, which are known, could be compared with those in examples of natural mutation, and in fact the stickleback genes and regulators could be used as candidates to examine in other fishes. By connecting the breadth of parallel evolutionary phenotypes with the developmental genetic depth from a well-understood system, important, overarching evo-devo questions (box 1) can be answered.

Morphological convergent loss of larval hairs in Drosophila. The pattern of trichomes (larval “hairs”) on the dorsal and lateral cuticle varies among the larvae of various Drosophila species; “thin” trichomes have been lost to varying degrees in four species in the D. virilis clade. When the presence or absence of thin trichomes is mapped (Sucena et al. 2003) on a phylogeny for the group (Spicer and Bell 2002), at least three evolutionarily independent losses of trichomes are required. It had been previously determined that the transcription factor Svb (shavenbaby/ovo, svb/ovo) acts to switch cells between naked cuticle and the production of trichomes (Payre et al. 1999). Using in situ hybridization, Sucena and colleagues (2003) demonstrated that the expression of svb is strictly correlated with the pattern of trichomes on the cuticle along the anterior–posterior axis. In other words, in naked species (those lacking thin trichomes), svb transcription is absent and the corresponding cells differentiate naked cuticle. Combining this finding with data from intraspecific crosses, Sucena and colleagues (2003) concluded that the transcriptional enhancer that promotes svb expression in the domain that produces thin trichomes is turned off in the lineages that lack the trichomes. Regulatory changes in svb expression are involved in all cases of parallel loss in the D. virilis clade. Although determining exactly how the cis-regulatory region of svb has been altered will require identification and characterization of this genomic area in these species (which, as in the case of the stickleback, is technically difficult), it is clear that parallel changes in the regulation of the svb/ovo gene underlie all independent cases of morphological convergence.

Like the stickleback fish example, this demonstrates that apparently identical developmental genetic mechanisms may underlie parallel evolutionary changes. Although evolutionary biologists have long recognized that identical developmental genetic mechanisms may not underlie homology (Roth 1988), the finding that identical developmental regulatory changes may, in fact, underlie parallelism underscores the need for phylogenetic information in determining homology (Sucena et al. 2003). If only the developmental genetic information, and not the phylogenetic information, were available for the D. virilis species group, the absence of trichomes at the first larval stage would be considered homologous, because changes in regulation of the same gene (svb) produce this pattern. Phylogeny, however, indicates that this loss must have happened independently, at least three times. Sucena and colleagues (2003) point out that the modular nature of svb regulatory regions permits alterations in part of the cuticular pattern without pleiotropic consequences (Stern 2000, Sucena and Stern 2000). Caution is advisable even when phylogenetic history and morphology indicate a common (homologous) developmental basis. There are now several examples of morphology, such as vulva position in nematodes (Sommer 2000, Rudel and Sommer 2003) and wing polyphenism in ants (Abouheif and Wray 2002), in which differently modified gene networks underlie apparent homology.
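The reasoning that a phylogeny can force several independent losses can be illustrated with a minimal sketch. Everything in it is a hypothetical toy: the tree topology, the tip names, and the character states are invented for illustration and are not the Spicer and Bell (2002) phylogeny or the Sucena et al. (2003) data. The sketch also assumes that the trait was present in the common ancestor and is never regained once lost, so the minimum number of losses equals the number of maximal clades in which every tip lacks the trait.

```python
# Toy example: count the minimum number of independent losses of a trait
# on a rooted tree, ASSUMING the trait is ancestral and never regained.
# The tree is a nested tuple; tips are strings. All names are hypothetical.

TREE = (("sp_A", "sp_B"), (("sp_C", "sp_D"), ("sp_E", ("sp_F", "sp_G"))))

# True = trait present (e.g., thin trichomes), False = trait absent.
HAS_TRAIT = {
    "sp_A": True, "sp_B": False,
    "sp_C": True, "sp_D": False,
    "sp_E": True, "sp_F": False, "sp_G": False,
}


def count_losses(tree, has_trait):
    """Return (minimum number of losses, whether every tip lacks the trait)."""
    if isinstance(tree, str):  # a tip
        absent = not has_trait[tree]
        return (1 if absent else 0), absent
    results = [count_losses(child, has_trait) for child in tree]
    if all(absent for _, absent in results):
        # The whole clade lacks the trait: one loss on its stem suffices.
        return 1, True
    # Otherwise the losses in the child clades are independent events.
    return sum(n for n, _ in results), False


def min_independent_losses(tree, has_trait):
    return count_losses(tree, has_trait)[0]


print(min_independent_losses(TREE, HAS_TRAIT))
```

In this toy tree, four tips lack the trait, but the sister tips sp_F and sp_G can share a single loss, so the count is three. The developmental data alone (the same gene is affected in every case) could not distinguish this from a single shared loss; the tree is what separates the events.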

As in the first example, with the appropriate bioinformatics resources in place, researchers could quickly query for data across insect taxa to find other examples of such phenotypic changes in evolution. They could promptly examine images of morphological features, and again, the well-known set of genes and regulators could be searched for parallel use at a broader level.

Pigment and trichome patterns in flies. On a phylogeny for 13 Drosophila species, Gompel and Carroll (2003) mapped the pattern of melanic pigment and trichomes on the pupal abdominal segments against aspects of bab2 gene regulation. They inferred from phylogeny and character distribution that, ancestrally, Bab2 repressed pigment formation and also regulated trichome development in the abdominal epidermis. In the evolution of the group, males of several species evolved repression of bab in the posterior abdomen (allowing pigment to form). Convergence in pigment pattern was achieved through similar, independent regulatory changes in bab. Additionally, from the phylogeny they could see a spectrum of species-specific modulations of Bab2 expression. This showed them that bab controls pigment and trichomes independently, and that their evolution can be uncoupled. They suggested that changes in cis to bab2, in regulatory elements responding directly or indirectly to one or more body plan regulators, were responsible for the diversification of Bab2 expression.

Like the first two examples, this detailed study of fruit flies adds potential developmental genetic depth to the multitude of parallel changes across insects. Ontologies would provide a link across these databases and enable comparative morphologists and developmental geneticists to exchange information.
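How a shared ontology could link such databases can be sketched in a few lines. This is only an illustration under stated assumptions: the term identifiers (ANAT:0001, ANAT:0002), the record layout, and the placeholder taxa are hypothetical and do not correspond to any existing ontology or database. The gene-to-structure associations are the ones described above for Pitx1 (Shapiro et al. 2004) and svb (Sucena et al. 2003).

```python
# Sketch: two independent "databases" annotated with the same ontology
# terms can be joined on those terms. Term IDs and record fields are
# hypothetical placeholders, not identifiers from a real ontology.

PELVIC_FIN = "ANAT:0001"     # hypothetical term for "pelvic fin"
THIN_TRICHOME = "ANAT:0002"  # hypothetical term for "thin trichome"

# Comparative morphology records: taxon, anatomical term, observed state.
phenotype_db = [
    {"taxon": "Gasterosteus aculeatus (freshwater population)",
     "term": PELVIC_FIN, "state": "reduced"},
    {"taxon": "hypothetical fish taxon X",
     "term": PELVIC_FIN, "state": "absent"},
    {"taxon": "hypothetical fly species Y",
     "term": THIN_TRICHOME, "state": "absent"},
    {"taxon": "hypothetical fly species Z",
     "term": THIN_TRICHOME, "state": "present"},
]

# Developmental genetics records: gene, anatomical term it is tied to.
gene_db = [
    {"gene": "Pitx1", "term": PELVIC_FIN, "source": "Shapiro et al. 2004"},
    {"gene": "svb", "term": THIN_TRICHOME, "source": "Sucena et al. 2003"},
]


def candidate_genes(phenotypes, genes, states=("absent", "reduced")):
    """For each taxon showing loss or reduction of a structure, list the
    genes annotated to the same ontology term in the gene database."""
    genes_by_term = {}
    for record in genes:
        genes_by_term.setdefault(record["term"], []).append(record["gene"])
    hits = []
    for record in phenotypes:
        if record["state"] in states:
            term = record["term"]
            hits.append((record["taxon"], term, genes_by_term.get(term, [])))
    return hits


for taxon, term, gene_list in candidate_genes(phenotype_db, gene_db):
    print(taxon, term, gene_list)
```

The join works only because both sets of records refer to the same term for the same structure, which is the practical role that shared ontologies would play between comparative morphology and developmental genetics.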

Many of the examples of convergent evolution in evo-devo have involved the loss or reduction of a particular morphological feature within a small clade, and they show that parallel changes in genetic regulatory networks underlie the loss of morphological features such as eyes in cavefish, wings in ants, and pelvic fins in stickleback fishes (Carroll et al. 2001, Abouheif and Wray 2002, Jenner 2004, Yamamoto et al. 2004, Tian and Price 2005). Convergence in regulatory gene expression domains, which is probably more common than is generally acknowledged, can arise for several different reasons (Wray 2002). The ease with which a morphological feature may be lost is consistent with the evolutionary idea that the loss of a morphological feature is more probable than gain (reviewed in Jenner 2004), which in systematics has been formalized as Dollo’s law (Farris 1977). If these empirical developmental data are representative, they support the use of this assumption in phylogenetic inference.

Conclusions

Understanding the interplay between developmental and phylogenetic constraints in the evolution of the phenotype is an overarching goal of the field of evo-devo. The examples of convergent evolution related above have shown that parallel changes in genetic regulatory networks underlie the loss of morphological features. Determining whether this holds across disparate phylogenetic levels will require additional empirical research, at the genomic and phenotypic levels of multiple species, and informatics to facilitate data mining and analysis. Practically, shared ontologies for homologous parts, characters, or modules of the phenotype must be referenced, so that growing databases in genomics and evolution can connect and help address evolutionary questions (box 1). Currently, developmental biologists and evolutionary morphologists individuate or parse the phenotype differently (Geeta 2003), but bioinformatics efforts are under way to facilitate communication across disciplines. Better analytical tools and more powerful methods of analysis will make it possible to understand the staggering level of complexity underlying phenotypes and the constraints of the developmental genetic architecture that underlie phenotypic modules. Bioinformatics methods that facilitate visualization of multidimensional genotype and phenotype data simultaneously (Tao et al. 2005) are crucial to biological research in this postgenomic era.

Acknowledgments

I thank Matthew Greenstone for his encouragement in developing this overview, and Patricia Crotwell and anonymous reviewers for their suggestions.

References cited

Abouheif E, Wray GA. 2002. Evolution of the gene network underlying wing polyphenism in ants. Science 297: 249–252.

Anholt RR. 2004. Genetic modules and networks for behavior: Lessons from Drosophila. Bioessays 26: 1299–1306.

Apweiler R, et al. 2004. UniProt: The Universal Protein knowledgebase. Nucleic Acids Research 32: D115–D119.

Arenas-Mena C, Cameron AR, Davidson EH. 2000. Spatial expression of Hox cluster genes in the ontogeny of a sea urchin. Development 127: 4631–4643.

Arnone MI, Davidson EH. 1997. The hardwiring of development: Organization and function of genomic regulatory systems. Development 124: 1851–1864.

Arthur W. 2004. Biased Embryos and Evolution. Cambridge (United Kingdom): Cambridge University Press.

Bard JBL. 2005. Anatomics: The intersection of anatomy and bioinformatics. Journal of Anatomy 206: 1–16.

Baum DA, Smith SD, Donovan SSS. 2005. The Tree-Thinking Challenge. Science 310: 979–980.

Blake J. 2004. Bio-ontologies—fast and furious. Nature Biotechnology 22: 773–774.

Bock GR, Cardew G, eds. 1999. Homology. New York: Wiley.

Brivanlou AH, Darnell JE. 2002. Signal transduction and the control of gene expression. Science 295: 813–818.

Carroll SB. 2000. Endless forms: The evolution of gene regulation and morphological diversity. Cell 101: 577–580.

———. 2005. Endless Forms Most Beautiful: The New Science of Evo Devo and the Making of the Animal Kingdom. New York: Norton.

Carroll SB, Grenier JK, Weatherbee SD. 2001. From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. London: Blackwell Science.

Cork JM, Purugganan MD. 2004. The evolution of molecular genetic pathways and networks. Bioessays 26: 479–484.

Csete ME, Doyle JC. 2002. Reverse engineering of biological complexity. Science 295: 1664–1669.

Davidson EH, et al. 2002. A genomic regulatory network for development. Science 295: 1669–1678.

Farris JS. 1977. Phylogenetic analysis under Dollo’s Law. Systematic Zoology 26: 77–88.

Geeta R. 2003. Structure trees and species trees: What they say about morphological development and evolution. Evolution and Development 5: 609–621.

Gilbert SF. 2003. The morphogenesis of evolutionary developmental biology. International Journal of Developmental Biology 47: 467–477.

Gkoutos GV, Green ECJ, Mallon A-M, Blake A, Greenaway S, Hancock JM, Davidson D. 2004. Ontologies for the description of mouse phenotypes. Comparative and Functional Genomics 5: 545–551.

Gompel N, Carroll SB. 2003. Genetic mechanisms and constraints governing the evolution of correlated traits in drosophilid flies. Nature 424: 931–935.

Greenspan RJ. 2001. The flexible genome. Nature Reviews Genetics 2: 383–387.

Hall BK. 2003. Evo-Devo: Evolutionary developmental mechanisms. International Journal of Developmental Biology 47: 491–495.

Harris MA, et al. 2004. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Research 32: D258–D261.

Holland PWH. 2004. The fall and rise of evolutionary developmental biology. Pages 261–275 in Williams DM, Forey PL, eds. Milestones in Systematics. Boca Raton (FL): CRC Press.

Holloway E. 2002. Meeting review. From genotype to phenotype: Linking bioinformatics and medical informatics ontologies. Comparative and Functional Genomics 3: 447–450.

Irish VF, Benfey PN. 2004. Beyond Arabidopsis. Translational biology meets evolutionary developmental biology. Plant Physiology 135: 611–614.

Jenner RA. 2004. When molecules and morphology clash: Reconciling conflicting phylogenies of the Metazoa by considering secondary character loss. Evolution and Development 6: 372–378.

Jockusch EL, Ober KA. 2004. Hypothesis testing in evolutionary developmental biology: A case study from insect wings. Journal of Heredity 95: 382–396.

Kitano H. 2002. Systems biology: A brief overview. Science 295: 1662–1664.

Klingenberg CP, Mebus K, Auffray JC. 2003. Developmental integration in a complex morphological structure: How distinct are the modules in the mouse mandible? Evolution and Development 5: 522–531.

Kloster M, Tang C, Wingreen NS. 2005. Finding regulatory modules through large-scale gene-expression data analysis. Bioinformatics 21: 1172–1179.

Kuratani S. 2004. Evolution of the vertebrate jaw: Comparative embryology and molecular developmental biology reveal the factors behind evolutionary novelty. Journal of Anatomy 205: 335–347.

Lee I, Date SV, Adai AT, Marcotte EM. 2004. A probabilistic functional network of yeast genes. Science 306: 1555–1558.

Lee TI, et al. 2002. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298: 799–804.

Mabee PM, Crotwell PL, Burke AC, Bird NC. 2002. Evolution of median fin modules in the axial skeleton of fishes. Journal of Experimental Zoology: Molecular and Developmental Evolution 294: 77–90.

Maddison WP, Maddison DR. 1992. MacClade: Analysis of Phylogeny and Character Evolution, version 3.0. Sunderland (MA): Sinauer.

Minelli A. 2003. The Development of Animal Form: Ontogeny, Morphology, and Evolution. London: Cambridge University Press.

Payre F, Vincent A, Carreno S. 1999. ovo/svb integrates Wingless and DER pathways to control epidermis differentiation. Nature 400: 271–275.

Peterson KJ, Arenas-Mena C, Davidson EH. 2000. The A/P axis in echinoderm ontogeny and evolution: Evidence from fossils and molecules. Evolution and Development 2: 93–101.

Popodi E, Raff RA. 2001. Hox genes in a pentameral animal. Bioessays 23: 211–214.

Prum RO, Dyck J. 2003. A hierarchical model of plumage: Morphology, development, and evolution. Journal of Experimental Zoology: Molecular and Developmental Evolution 298: 73–90.


Qin H, Lu HHS, Wu WB, Li W-H. 2003. Evolution of the yeast protein interaction network. Proceedings of the National Academy of Sciences 100: 12820–12824.

Raff RA. 1996. The Shape of Life. Chicago: University of Chicago Press.

Rast JP. 2003. Development gene networks and evolution. Journal of Structural and Functional Genomics 3: 225–234.

Ravasz E, Somera AL, Mongru D, Oltvai ZN, Barabasi A-L. 2002. Hierarchical organization of modularity in metabolic networks. Science 297: 1551–1555.

Revilla-i-Domingo R, Davidson EH. 2003. Developmental gene network analysis. International Journal of Developmental Biology 47: 695–703.

Richardson MK, Brakefield PM. 2003. Developmental biology: Hotspots for evolution. Nature 424: 894–895.

Roth VL. 1984. On homology. Biological Journal of the Linnean Society 22: 13–29.

———. 1988. The biological basis of homology. Pages 1–26 in Humphries CJ, ed. Ontogeny and Systematics. New York: Columbia University Press.

Rudel D, Sommer RJ. 2003. The evolution of developmental mechanisms. Developmental Biology 264: 15–37.

Schlosser G, Thieffry D. 2000. Modularity in development and evolution. Bioessays 22: 1043–1045.

Schlosser G, Wagner GP. 2004. Modularity in Development and Evolution. Chicago: University of Chicago Press.

Shapiro MD, Marks ME, Peichel CL, Blackman BK, Nereng KS, Jonsson B, Schluter D, Kingsley DM. 2004. Genetic and developmental basis of evolutionary pelvic reduction in threespine sticklebacks. Nature 428: 717–723.

Smith KK. 2003. Time’s arrow: Heterochrony and the evolution of development. International Journal of Developmental Biology 47: 613–621.

Sommer RJ. 2000. Comparative genetics: A third model nematode species. Current Biology 10: R879–R881.

Spicer GS, Bell CD. 2002. Molecular phylogeny of the Drosophila virilis species group (Diptera: Drosophilidae) inferred from mitochondrial 12S and 16S ribosomal RNA genes. Annals of the Entomological Society of America 95: 156–161.

Stern DL. 2000. Evolutionary developmental biology and the problem of variation. Evolution: International Journal of Organic Evolution 54: 1079–1091.

Sucena E, Stern DL. 2000. Divergence of larval morphology between Drosophila sechellia and its sibling species caused by cis-regulatory evolution of ovo/shaven-baby. Proceedings of the National Academy of Sciences 97: 4530–4534.

Sucena E, Delon I, Jones I, Payre F, Stern DL. 2003. Regulatory evolution of shavenbaby/ovo underlies multiple cases of morphological parallelism. Nature 424: 935–938.

Tabin CJ, Carroll SB, Panganiban G. 1999. Out on a limb: Parallels in vertebrate and invertebrate limb patterning and the origin of appendages. American Zoologist 39: 650–663.

Tao Y, Friedman C, Lussier YA. 2005. Visualizing information across multidimensional post-genomic structured and textual databases. Bioinformatics 21: 1659–1667.

Tautz D. 2000. Evolution of transcriptional regulation. Current Opinion in Genetics and Development 10: 575–579.

Tian NM, Price DJ. 2005. Why cavefish are blind. Bioessays 27: 235–238.

Van Valen LM. 1982. Homology and causes. Journal of Morphology 173: 305–312.

Wagner GP. 1989. The biological homology concept. Annual Review of Ecology and Systematics 20: 51–69.

Wagner GP, Larsson HC. 2003. What is the promise of developmental evolution? III: The crucible of developmental evolution. Journal of Experimental Zoology: Molecular and Developmental Evolution 300: 1–4.

Wiens JJ, Chippindale PT, Hillis DM. 2003. When are phylogenetic analyses misled by convergence? A case study in Texas cave salamanders. Systematic Biology 52: 501–514.

Wolstencroft K, McEntire R, Stevens R, Tabernero L, Brass A. 2005. Constructing ontology-driven protein family databases. Bioinformatics 21: 1685–1692.

Wray GA. 2002. Do convergent developmental mechanisms underlie convergent phenotypes? Brain, Behavior and Evolution 59: 327–336.

Yamamoto Y, Stock DW, Jeffery WR. 2004. Hedgehog signaling controls eye degeneration in blind cavefish. Nature 431: 844–847.
