Testing the accuracy of TreeMap and Brooks parsimony analyses of coevolutionary patterns using artificial associations

Ashley P.G. Dowling

Museum of Zoology, Insect Division, University of Michigan, 1109 Geddes Avenue, Ann Arbor, MI 48109, USA

Accepted 8 March 2002

Abstract

Brooks parsimony analysis and TreeMap are the two most commonly used methods and are tested using artificially evolved host–parasite associations with varying amounts of coevolutionary events occurring between the two “phylogenies.” The purpose is to test the precision with which each method recovers the true coevolutionary history. The reconstructions recovered by each method are compared against the original test case to determine how closely the reconstruction resembles the artificially created coevolutionary history. Brooks parsimony analysis is found to be consistently less prone to gross overestimation of coevolutionary events and misleading results. Brooks parsimony analysis performs better overall because it is more adept at dealing with host-switching events, both between and within lineages leading to widespread parasite taxa, which provides enough evidence for implementing Brooks parsimony analysis instead of TreeMap in coevolutionary studies. © 2002 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

Introduction

The concept of coevolution is again becoming the focus of research in such areas as herbivore–plant interactions, mutualisms and symbiosis between multiple organisms, and host–parasite systems. All are appropriate systems for studying coevolution, and the context of this paper is based upon systems involving host–parasite interactions, but is applicable to all other such systems.

Associations between two organisms, such as a host and a parasite, can be explained in only one of two ways. The first and most important reason in terms of coevolutionary studies is through association by descent (Brooks, 1979). For example, the mite Chiroptonyssus robustipes (Ewing) is a parasite on the molossid bat Tadarida brasiliensis. If it is found that the ancestor of C. robustipes was a parasite on the ancestor of T. brasiliensis, then one could say they are associated by descent; in other words, they have a history together. On the other hand, if it were found that the ancestor was a parasite on the deer mouse Peromyscus leucopus and not on the ancestor of T. brasiliensis, then we would postulate association by colonization (Brooks, 1979), which is the second reason for association. It is clear in the latter example that C. robustipes has no history with T. brasiliensis, and therefore their current association must be the product of a colonization or host-switching event. It is possible, though, that after the colonization, the host and parasite may develop a history together, as their association evolves. Historically, every parasitic association had to begin with a colonization event.

The goal of this study is to compare two methods used for deciphering coevolutionary histories (Brooks parsimony analysis and TreeMap) in order to determine which method is more precise when attempting to recover the “true” coevolutionary history between associated organisms. Often these two methods give similar results, but occasionally strongly disagree, which can be attributed to the difference in the way each derives the resolved coevolutionary tree. The test is accomplished by analyzing associations between artificially created host and parasite “phylogenies” and comparing the number of coevolutionary events determined by each method. The two methods are also used to analyze phylogenetic hypotheses derived from real case studies. These comparisons are expected to reveal whether patterns found during the simulations are decipherable from the results obtained from the analysis of real phylogenetic hypotheses, in which the true history is not known.

Cladistics 18 (2002) 416–435

www.academicpress.com

Cladistics

E-mail address: [email protected].

0748-3007/02/$ - see front matter © 2002 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

PII: S0748-3007(02)00007-5

Coevolutionary patterns

It is widely agreed that there are four kinds of coevolutionary patterns leading to either congruence or incongruence between associated phylogenetic hypotheses: cospeciation, host switching, duplications, and sorting events (Fig. 1).

Incongruence between host and parasite phylogenetic hypotheses is easily explained by colonization events (host switches) by the parasite species (Brooks and McLennan, 1991) or through sorting events (extinctions of parasite lineages). On the other hand, congruence is not so easily deciphered. The predominant and most likely reason for congruence involves cospeciation between host and parasite lineages, but can also be due to host switching that happened to mirror the phylogenetic relationships of the host (Brooks and McLennan, 1991). This coincidental pattern is described as resource tracking, which requires the host to speciate first, after which the parasite colonizes the new host species (presumably because the new host has a specific resource required by the associate) and undergoes subsequent speciation. Finally, duplications, or independent speciations, followed by subsequent cospeciation actually add to the number of congruent lineages while duplications followed by extinctions will appear as additional incongruence (homoplasy).

Cospeciation refers to a pattern of association by descent and is displayed as congruence between associated host and parasite phylogenetic hypotheses. Cospeciation exists when the parasite undergoes cladogenesis in response to, or concomitant with, host cladogenesis (Brooks, 1979).

This cocladogenesis by the host and parasite is clearly a result of allopatric speciation, in which a vicariant event separates a host population and in turn separates the parasite populations as well. Allopatric cospeciation may be considered the null model of coevolutionary studies, predicting that phylogenetic congruence between hosts and parasites is entirely due to simultaneous speciation in each lineage (Brooks and McLennan, 1993).

Host switching occurs when a parasite from one host transfers to another host. Though it seems rather simple, host switching can occur in many different forms. The parasite may transfer to any host it finds suitable, sometimes to a species closely related to its current host, but other times to distantly related species. Distantly related species often do not show up in the context of our coevolutionary studies because it is likely that the host would not have been sampled if it were clearly distantly related to the group of hosts involved in the study. When a parasite transfers to a different host species, many scenarios may occur (Fig. 2). First, the parasite may transfer to another host, resulting in a more widespread parasite, in which two or more hosts harbor the same parasite species (Fig. 2a). Likewise, the parasite may transfer to another host and then speciate due to its separation from the main population of parasites (peripheral isolates speciation) (Fig. 2b). After a host switch has taken place, the parasite may begin phylogenetically tracking its new host (i.e., cospeciation) (Fig. 3a) or it may not (Fig. 3b).

During a duplication event, a parasite speciates independently of its host, resulting in multiple related parasites occupying the same host species. Parasite duplications typically represent cases of sympatric speciation, which are easily recognizable when sister parasite lineages are found associated with the same host lineages. Duplications appear to cause incongruence between associated phylogenetic hypotheses only when coupled with extinction of lineages. It is also feasible that the host speciates more rapidly than the parasite, resulting in many related host species carrying the same parasite species. Duplications and host switching can lead to two scenarios of association, widespread taxa and redundant distributions (Fig. 4). The widespread taxa scenario occurs when a parasite is associated with several host species (Fig. 4a), due to either host switching without speciation or host duplications. A redundant distribution occurs when more than one parasite is associated with a single host species (Fig. 4b), which can occur after host switching with subsequent speciation events or due to parasite duplications. Both widespread taxa and redundant distributions have been known to lead to ambiguous results in the form of additional ad hoc hypotheses in the study of coevolution.

Fig. 1. The phylogeny of four hosts (black; A–D) and the evolutionary history of five associated parasites (gray; I–V). The four coevolutionary events that influence the congruence and incongruence between the host and parasite phylogenies are illustrated and labeled.

Fig. 2. (a) A host-switching event of parasite I to host C without a speciation and (b) the same host-switching event except with a speciation to form parasite II.

Fig. 3. (a) A host-switching event followed by host tracking (cospeciation) between the host and the parasite and (b) a host switch not followed by host tracking.

Sorting events are used to explain the absence of a parasite on a host that is expected to house a parasite because its ancestor did. Currently there are many acknowledged kinds of sorting events that could lead to the absence of a parasite, a few of which are discussed below. The first, and most obvious, is the extinction of a parasite from its host following a cospeciation event. Another reason could be incomplete sampling. Parasites tend to have patchy distributions and if the sample size of hosts is too low, it is possible that the parasites may have simply been missed. A third reason is the “missing the boat” hypothesis recently put forward by Paterson and Gray (1997). Missing the boat deals with the patchy distributions of a parasite within a host population. If the founder population at a host speciation event is small, it is likely that a parasite normally associated with the founder species may not be present, therefore causing the parasite to be absent from that lineage. Likewise, a small founding host population may have moved into an environment not conducive to the survival of the present parasites (e.g., a switch to an aquatic lifestyle).

The study of coevolution between associated phylogenetic hypotheses is dependent upon the explanation of the incongruent nodes in the trees. Since it is the number of these events represented by incongruence that determines which reconstruction is deemed optimal, the explanation of incongruence is the most important aspect of any coevolutionary analysis. As mentioned previously, cospeciation typically explains congruence, and host switching and sorting events (sometimes in conjunction with duplications) are used to explain incongruence. It is the way in which each method for studying coevolution chooses to explain incongruence that sets the methods apart from one another.

Methodology

Much like systematics, coevolution can be studied with a variety of methodologies. A sufficient method for studying host–parasite associations requires three major components: (1) severely tested phylogenetic hypotheses for both the host and parasite, (2) a quantitative method for determining the number of coevolutionary events needed to explain the incongruence between associated phylogenetic hypotheses, and (3) a statistical method for determining whether the congruence between phylogenetic hypotheses is possible by chance alone. This final aspect of a sufficient method is not covered in this study, but has been addressed by Page (1990, 1994b) and more recently by Siddall (1996).

Early parasitologists may have pioneered coevolutionary thinking in the 1890s, but at the time, they had no method for obtaining severely tested phylogenetic hypotheses. During their time, and for almost 70 years afterward, phylogenetic hypotheses were produced based on taxonomy, and coevolutionary studies were based on patterns of association and host specificity of parasites. After Hennig conceived of his parasitological method in the mid 1950s, rigorous methodologies were developed to test phylogenetic hypotheses about the histories of organisms.

Since the late 1970s, a variety of methods have been developed to make inferences about coevolutionary history, but currently only two methods are frequently used to quantitatively determine the number of coevolutionary events needed to explain the incongruence between the associated phylogenetic hypotheses. The earliest methods simply compared host and parasite phylogenetic hypotheses by eye, which has been referred to as the narrative approach (Brooks, 1981). In 1981 two major classes of methodologies appeared: the parsimony approach devised by Brooks (1981) and the component approach created by Nelson and Platnick (1981). From these beginnings, many methods have been developed such as Brooks parsimony analysis (BPA), component, reconciliation analyses (TreeMap; Page, 1994a,b), cost matrix analyses (two-dimensional and three-dimensional; Ronquist, 1995, 1998), and most recently another cost analysis called Jungles (Charleston, 1998). Though there are many methods, it has not been agreed upon which method is the best to use. This study focuses on the comparison of Brooks parsimony analysis and TreeMap, due to each method’s ability to detect and account for cospeciations, extinctions (sorting events), host switches, and duplications. The other available methods do not detect one or more of these common coevolutionary events (Paterson and Banks, 2001).

Fig. 4. (a) Representation of a widespread parasite distribution in which a single parasite taxon is distributed across four host taxa and (b) a redundant parasite distribution in which four parasite taxa are infesting a single host taxon.

Brooks parsimony analysis

BPA is a method for identifying the histories of associated hosts and parasites (coevolution) and the histories of members with unique distributions (biogeography). BPA allows the comparison of associated phylogenetic hypotheses (two or more) through the use of additive binary coding and the Wagner algorithm (Kluge and Farris, 1969). The fact that the method is capable of comparing more than two associated phylogenetic hypotheses is helpful because hosts often have more than one parasite group and areas typically contain more than one group of inhabitants. Other methods do not have this flexibility. For the purposes of this study, one-step primary BPA is implemented, but it is important to note that secondary BPA can be added, making it a two-step process. Primary BPA determines whether there is any falsification of the null hypothesis in the form of homoplasy that cannot be explained by extinction of lineages. Secondary BPA explicitly depicts those exceptions to the null hypothesis and thereby helps gain an understanding of how current diversities were assembled (Brooks and McLennan, 2001; Brooks et al., 2001). Secondary BPA is more commonly used in biogeographic studies involving landmasses with reticulate histories. The following is a simple host–parasite example starting with two associated phylogenetic hypotheses and ending with the final BPA tree hypothesis:

1. Obtain phylogenetic hypotheses for both the host and the parasite (Fig. 5).

2. Determine the associations between the host and the parasite cladograms (Fig. 6).

3. Treat the phylogenetic relationships of the parasite taxa as a character state tree and number each terminal taxon and internal branch of the tree (Fig. 7). The numbering sequence is arbitrary, but typically the terminal taxa are numbered first and then the branches are numbered. Each parasite species now has a code that identifies it and its ancestors. For example, parasite III is represented as (3, 7, 8, 9) and parasite V as (5, 6, 8, 9).

4. Represent these codes in the form of a taxon by code data matrix. For each parasite taxon, the presence of a number in the parasite code is listed as 1 and an absence of a number in the parasite code is listed as 0 (Table 1). This is an application of additive binary coding, which allows the phylogenetic relationships of the parasites to be represented by 0’s and 1’s. As a check, perform a phylogenetic analysis using the binary codes from Table 1. If the conversion was done correctly, this analysis should reproduce the original parasite tree (Fig. 5b).

5. Replace the parasite names in Table 1 with the host name associated with each parasite (Table 2). In the event that more than one parasite species infests the same host species, the coding for the host in the data matrix is made from a combination of the different parasite codes by a treatment called Inclusive OR-ing. For example, imagine that host A is infested by both parasites I and V. Inclusive OR-ing is performed in Table 3 to show how the combination of parasite species codes occurs. The codes for parasite I (1, 9) and parasite V (5, 6, 8, 9) are combined to form the new inclusive OR-ed code (1, 5, 6, 8, 9). This code would then be used as the code for host A.

Fig. 5. (a) Five-taxa host tree (A–E) and (b) a five-taxa parasite tree (I–V) for use in the BPA methodology example.

Fig. 6. A representation of the associations between the host and parasite phylogenies from Fig. 5. Notice the incongruence between the trees where parasites II and III are sister taxa, while their hosts are paraphyletic. The remainder of the parasite tree is congruent with the host tree.

Fig. 7. Coding of the terminal nodes (1–5) and the internal branches (6–9) in preparation for Brooks parsimony analysis.

Table 1
Data matrix showing the results of additive binary coding for a parasite phylogeny (Fig. 5b)

Parasite   Codes
           1  2  3  4  5  6  7  8  9
I          1  0  0  0  0  0  0  0  1
II         0  1  0  0  0  0  1  1  1
III        0  0  1  0  0  0  1  1  1
IV         0  0  0  1  0  1  0  1  1
V          0  0  0  0  1  1  0  1  1

Note. Table resulting from steps 3 and 4 of BPA procedures.

6. Finally, using Farris optimization, map the characters in the new data matrix (Table 2) onto the original host tree (Fig. 5a) to create the BPA tree (Fig. 8). The characters mapped onto the cladogram represent the historical association between the hosts and their parasites. The optimal reconstruction minimizes the number of ad hoc hypotheses, such as homoplasious characters and reversals, required to explain the observed association. Using BPA, homoplasious characters are interpreted as host-switching events and reversals are interpreted as extinctions of parasites. It is possible to observe duplication events on the tree, which simply appear as independent speciations.
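Steps 3 to 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parent links in PARASITE_PARENT are read off the codes in Table 1 (for example, parasite III has code (3, 7, 8, 9), so branch 7 leads to 8 and 8 to 9), and the names code_of, binary_row, and inclusive_or are hypothetical helpers, not part of any BPA software.

```python
# Parasite tree as child -> parent links, using the numbering of Fig. 7.
# The links are inferred from the codes in Table 1 (an assumption of this sketch).
PARASITE_PARENT = {1: 9, 2: 7, 3: 7, 4: 6, 5: 6, 6: 8, 7: 8, 8: 9, 9: None}
NAMES = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}


def code_of(terminal):
    """Collect the terminal's own number and the numbers of all its ancestors."""
    codes = set()
    node = terminal
    while node is not None:
        codes.add(node)
        node = PARASITE_PARENT[node]
    return codes


def binary_row(codes, n_chars=9):
    """Additive binary coding: 1 if the number is in the code, otherwise 0."""
    return [1 if i in codes else 0 for i in range(1, n_chars + 1)]


def inclusive_or(*rows):
    """Combine the rows of several parasites that infest the same host."""
    return [int(any(bits)) for bits in zip(*rows)]


if __name__ == "__main__":
    for name, number in NAMES.items():
        print(name, binary_row(code_of(number)))
    # Host A carrying both parasite I and parasite V, as in Table 3.
    row_i = binary_row(code_of(NAMES["I"]))
    row_v = binary_row(code_of(NAMES["V"]))
    print("A", inclusive_or(row_i, row_v))
```

Run as a script, the loop prints the five rows of Table 1, and the last line prints the combined row for host A shown in Table 3.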

Since its creation, BPA (Brooks, 1981) has undergone many changes and improvements (Brooks, 1985, 1990; Wiley, 1988a,b), but is still commonly criticized (Page, 1990, 1994a, 1994b; Paterson and Gray, 1997; Ronquist, 1995; Ronquist and Nylin, 1990). One common criticism is that BPA tends to overestimate host-switching events due to the way additive binary coding treats each character as independent. A host switch in a BPA tree is represented as homoplasy, but often the extinct ancestors of the dispersing taxa will also appear to have switched, resulting in a misleading and unrealistic conclusion. After gaining experience looking at BPA trees, it becomes obvious when this error, which Brooks refers to as “ghost characters,” occurs. These ghost characters show up only when a host switch has occurred and do not provide any support for groupings that were not already supported on the tree (Brooks and McLennan, 2002), therefore making this a minor problem which has no more of an adverse effect on the study than acting as an eyesore on the BPA tree (Fig. 9).
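The reading of homoplasy as host switching can be illustrated by counting, for each parasite character in Table 2, the number of state changes required on the host tree. The sketch below is not the Farris optimization itself: it uses a Fitch-style bottom-up pass, which for two-state characters requires the same number of steps. The host topology (A,(B,(C,(D,E)))) is an assumption inferred from Fig. 6 and Table 4, and the function names are hypothetical.

```python
# Host tree as nested pairs; the topology is inferred, not given explicitly in the text.
HOST_TREE = ("A", ("B", ("C", ("D", "E"))))

# Host-by-character matrix from Table 2 (characters 1-9, left to right).
MATRIX = {
    "A": [1, 0, 0, 0, 0, 0, 0, 0, 1],
    "B": [0, 1, 0, 0, 0, 0, 1, 1, 1],
    "C": [0, 0, 1, 0, 0, 0, 1, 1, 1],
    "D": [0, 0, 0, 1, 0, 1, 0, 1, 1],
    "E": [0, 0, 0, 0, 1, 1, 0, 1, 1],
}


def fitch(tree, char):
    """Return (possible states, steps) for one character (0-based index)."""
    if isinstance(tree, str):
        return {MATRIX[tree][char]}, 0
    left_states, left_steps = fitch(tree[0], char)
    right_states, right_steps = fitch(tree[1], char)
    common = left_states & right_states
    if common:
        return common, left_steps + right_steps
    # No shared state: one change is needed on one of the two branches.
    return left_states | right_states, left_steps + right_steps + 1


def homoplasious_characters():
    """Characters needing more steps than the minimum (observed states minus one)."""
    flagged = []
    for char in range(9):
        observed = {row[char] for row in MATRIX.values()}
        steps = fitch(HOST_TREE, char)[1]
        if steps > len(observed) - 1:
            flagged.append(char + 1)  # report with the 1-based numbering of Fig. 7
    return flagged


if __name__ == "__main__":
    print(homoplasious_characters())
```

Under these assumptions only character 7 is flagged, which agrees with the caption of Fig. 8. The count does not say whether the extra step is a second origin (a host switch) or a reversal (an extinction); that interpretation comes from the optimization chosen.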

A second criticism deals with missing taxa. BPA often overestimates the number of extinctions that have occurred when a parasite is not found on a host, presumably because the parasite has gone extinct from that host species. This flaw is also due to additive binary coding and causes the ancestors of the extinct species to also go extinct on that host species (Fig. 10), even though realistically they went extinct long ago, when they speciated to form their descendants. This problem is also readily recognized and does not appear to affect the overall structure of the BPA tree either. Wiley (1988a,b) suggested that absent taxa should be treated as missing data to eliminate the overestimation of extinctions. Wiley’s quick fix for the missing taxa problem is not implemented in this study, so that the reader is confronted with the initial results of the different methodologies.

Table 2
Data matrix from Table 1 with hosts matched alongside their corresponding parasites

Host   Parasite   Codes
                  1  2  3  4  5  6  7  8  9
A      I          1  0  0  0  0  0  0  0  1
B      II         0  1  0  0  0  0  1  1  1
C      III        0  0  1  0  0  0  1  1  1
D      IV         0  0  0  1  0  1  0  1  1
E      V          0  0  0  0  1  1  0  1  1

Note. Table resulting from step 5 of BPA procedures.

Table 3
An example of inclusive OR-ing for multiple parasites

Host              Codes
                  1  2  3  4  5  6  7  8  9
A (parasite I)    1  0  0  0  0  0  0  0  1
A (parasite V)    0  0  0  0  1  1  0  1  1
A (combined)      1  0  0  0  1  1  0  1  1

Note. Two parasite taxa (I and V) are infesting a single host taxon (A). The two binary coding schemes are combined to give the final binary code for host A.

Fig. 8. The parasite characters (1–9) are mapped onto the host phylogeny for the final construction of the BPA tree. Notice that parasite character 7 is homoplasious, indicating a host switch possibly from host B to host C followed by a speciation on host C to form parasite 3.

Fig. 9. A BPA reconstruction demonstrating the appearance of ghost taxa (gray 7 and 8). There are three homoplasious parasite characters on this tree (3, 7, and 8), but only one parasite (3) actually switched hosts. Parasite 3 switched from host C to host A and parasite characters 7 and 8 are extinct ancestors of 3, which falsely appear to have also switched hosts because of the ghost taxa problem.

TreeMap

TreeMap (Page, 1994b) is a reconciliation-based method that reconstructs the parasite phylogenetic hypothesis using duplications and sorting events to explain incongruence in order to maximize the amount of cospeciation between the host and the parasite phylogenetic hypotheses. The reconciled tree is created from a map between a host tree and a parasite tree, which associates each node in the parasite tree with a node in the host tree. The optimal reconciliation is the reconstruction that maximizes cospeciation and minimizes the number of duplications and sorting events involved. As a result, reconciliation methods prohibit host switching. As discussed earlier, host switching is a very probable and realistic coevolutionary event and to not use it in reconstructions can produce misleading results. TreeMap does allow host switching to be incorporated into the reconstruction in an a posteriori manner. Any host-switching event that increases the amount of cospeciation between the two phylogenetic hypotheses is deemed a better hypothesis than the original reconstruction. Though this is not the best way to deal with host switching, it does allow the researcher the opportunity to visualize the occurrence of host switching within the reconstruction. The procedure for reconciling host and parasite phylogenetic hypotheses is stated below.

1. Obtain phylogenetic hypotheses for both the host and the parasite involved in the study (Fig. 11).

2. Determine the associations between the two hypotheses (Fig. 12).

3. Label each terminal and internal node of the two hypotheses and create a map between the two trees. This is accomplished by representing each terminal node of the parasite tree by the host set it infests and each internal node by the set of hosts infested by its descendants (Page, 1994b). Table 4 shows how the map between the two phylogenetic hypotheses is represented in terms of host sets infested by the parasites. Fig. 13 is a more visual way to represent how the nodes between trees match. For example, node 6 in the parasite tree, which has the host set {D, E}, corresponds to node F in the host tree.

4. Map each parasite node onto the node in the host tree that contains all members of the parasite host set and the fewest hosts not included in that set. This will allow you to obtain the reconciled tree from Table 4. For example, parasite node 7 has the host set {B, C}, which would be mapped onto host node H, which is the smallest set containing {B, C}. The only other host node which contains the set {B, C} is node I, which contains a larger set than node H. Fig. 14 represents this mapping with the parasite tree superimposed upon the host tree so that each parasite node is adjacent to its corresponding host node. When multiple parasite nodes correspond to the same host node, as is the case with parasite nodes 7 and 8 with host node H, they are mapped next to each other.

Fig. 10. A BPA reconstruction demonstrating the missing taxa problem. A parasite that was a descendant of parasite 6 went extinct from host C. Due to additive binary coding, the BPA reconstruction also shows ancestors 6, 7, and 8 to have gone extinct on host C, when realistically they went extinct well before this event, when they speciated to give rise to their respective descendants.

Fig. 11. (a) Five-taxa host tree (A–E) and (b) a five-taxa parasite tree (I–V) for use in the TreeMap methodology example.

Fig. 12. Associations between the host and parasite phylogenies from Fig. 11.

Table 4
Data table showing the node associations between a host and a parasite phylogeny

Node in parasite tree   Host set          Corresponding node in host tree
1                       A                 A
2                       B                 B
3                       C                 C
4                       D                 D
5                       E                 E
6                       D, E              F
7                       B, C              H
8                       B, C, D, E        H
9                       A, B, C, D, E     I

Note. Table resulting from step 3 of TreeMap reconciliation procedure.

5. Draw a path between all nodes in the parasite tree so that it traces the path between the corresponding host nodes (Fig. 15). For example, starting with node 9 draw a path to node 1 and to node 7. Since 7 and 8 both correspond to host node H, draw a path between nodes 7 and 8. From 7 also trace a path to nodes 2 and 3, and from node 8 draw a path to node 6, which then connects to nodes 4 and 5.

6. Determine the number of cospeciations, duplications, and sorting events. Fig. 16 shows three cospeciations (solid black circles), one duplication (solid black square), and three sorting events (represented by X).
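The mapping and event counts of steps 3 to 6 can be reproduced with a short sketch. It is not TreeMap’s own code. It assumes the host topology (A,(B,(C,(D,E)))) with internal nodes F, G, H, and I, which is inferred from Table 4, and it uses the standard reconciliation rule that a parasite node mapped to the same host node as one of its children is a duplication. The dictionary and function names are hypothetical.

```python
# Host tree as child -> parent links; internal labels F, G, H, I follow Table 4.
# The topology itself is inferred from the host sets (an assumption of this sketch).
HOST_PARENT = {"I": None, "A": "I", "H": "I", "B": "H", "G": "H",
               "C": "G", "F": "G", "D": "F", "E": "F"}

# Parasite tree: internal node -> its two children; terminals 1-5 infest hosts A-E.
PARASITE_CHILDREN = {6: (4, 5), 7: (2, 3), 8: (7, 6), 9: (1, 8)}
TERMINAL_HOST = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}


def path_to_root(node):
    """List the node and all of its ancestors up to the root of the host tree."""
    path = []
    while node is not None:
        path.append(node)
        node = HOST_PARENT[node]
    return path


def depth(node):
    return len(path_to_root(node)) - 1


def lca(a, b):
    """Smallest host clade containing both a and b (step 4)."""
    ancestors = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile():
    """Map every parasite node to a host node and count the three event types."""
    mapping = dict(TERMINAL_HOST)
    events = {"cospeciation": 0, "duplication": 0, "sorting": 0}
    # With this numbering, ascending order visits children before parents.
    for node in sorted(PARASITE_CHILDREN):
        left, right = (mapping[child] for child in PARASITE_CHILDREN[node])
        here = lca(left, right)
        mapping[node] = here
        is_duplication = here in (left, right)
        events["duplication" if is_duplication else "cospeciation"] += 1
        # Each host node passed over between parent and child implies one loss.
        offset = 0 if is_duplication else 1
        events["sorting"] += sum(depth(c) - depth(here) - offset for c in (left, right))
    return mapping, events


if __name__ == "__main__":
    mapping, events = reconcile()
    print(mapping)
    print(events)
```

With these inputs the sketch maps nodes 6, 7, 8, and 9 to F, H, H, and I, as in Table 4, and counts three cospeciations, one duplication, and three sorting events, as in Fig. 16.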

When using TreeMap, the researcher may take this procedure one step further with the inclusion of host-switching events. Page (1994a) identified possible dispersing species by deleting each associate in turn and computing a new reconciled tree for the remaining associates. The associate(s) whose deletion caused the greatest increase in the overall congruence is considered to represent a host switch that can then be manually applied to the original reconciliation. This procedure is accomplished by using the exact search function in TreeMap.
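The deletion procedure can be illustrated as follows. This is a rough sketch and not TreeMap’s exact search: it assumes the same inferred host topology, (A,(B,(C,(D,E)))), and it measures incongruence simply as the number of duplications plus sorting events, which is an assumption made here for illustration. All names are hypothetical.

```python
# Inferred host tree (child -> parent) and the one-to-one associations of the example.
HOST_PARENT = {"I": None, "A": "I", "H": "I", "B": "H", "G": "H",
               "C": "G", "F": "G", "D": "F", "E": "F"}
ASSOC = {"I": "A", "II": "B", "III": "C", "IV": "D", "V": "E"}
PARASITE_TREE = ("I", (("II", "III"), ("IV", "V")))


def path_to_root(node):
    path = []
    while node is not None:
        path.append(node)
        node = HOST_PARENT[node]
    return path


def depth(node):
    return len(path_to_root(node)) - 1


def lca(a, b):
    ancestors = set(path_to_root(a))
    return next(n for n in path_to_root(b) if n in ancestors)


def prune(tree, taxon):
    """Remove one terminal and collapse the node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [t for t in (prune(sub, taxon) for sub in tree) if t is not None]
    return kept[0] if len(kept) == 1 else tuple(kept)


def cost(tree):
    """Return (host node, duplications + sorting events) for a parasite subtree."""
    if isinstance(tree, str):
        return ASSOC[tree], 0
    (left, left_cost), (right, right_cost) = cost(tree[0]), cost(tree[1])
    here = lca(left, right)
    offset = 0 if here in (left, right) else 1  # 0 marks a duplication
    losses = sum(depth(c) - depth(here) - offset for c in (left, right))
    return here, left_cost + right_cost + (1 - offset) + losses


def deletion_scan():
    """Incongruence remaining after deleting each associate in turn."""
    return {taxon: cost(prune(PARASITE_TREE, taxon))[1] for taxon in ASSOC}


if __name__ == "__main__":
    print("full tree:", cost(PARASITE_TREE)[1])
    print(deletion_scan())
```

Under these assumptions the full reconciliation has an incongruence of 4 (one duplication and three sorting events). Deleting parasite II or parasite III lowers it to 1, while deleting I, IV, or V does not lower it at all, so II and III are the candidate host switchers. That agrees with the BPA reading of character 7 in Fig. 8.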

Like BPA, TreeMap has undergone its own share of criticism. One major criticism is the implementation of the one-host-per-parasite assumption within the methodology (Ronquist, 1995). This assumption requires exclusion of a parasite from its original host before it colonizes a new host, thus eliminating the possibility of widespread taxa due to host switching. Page and others (Ronquist, 1995; Ronquist and Nylin, 1990) use this assumption to simplify the procedural methodology by restricting the number of different types of events that need to be considered by the algorithm producing the reconstructions. The use of this assumption eliminates the possibility of host switching without speciation. As discussed earlier, a parasite does not have to speciate after it makes a host switch, nor is a parasite constrained to using only one host species. Thus the one-host-per-parasite assumption appears to be a rather unrealistic assumption that should be avoided in a coevolutionary study.

Fig. 13. Gray bars indicating the associated nodes of the host and parasite phylogenies. This is a visual way to represent Table 4. Nodes are matched up based upon the host set for each parasite node.

Fig. 14. Parasite nodes (gray numbers) are mapped to their corresponding host nodes (black letters) on the host phylogeny. Notice that parasite nodes 7 and 8 are mapped onto the same host node (H).

Fig. 15. The parasite nodes are connected in the appropriate order to produce the final reconciled tree for associated host and parasite phylogenies.

Fig. 16. Reconciled tree from Fig. 15 showing three cospeciations (black circle), one duplication (black square), and three sorting events (black X).

A second criticism of TreeMap is its tendency to underestimate host switching (obviously, since it does not incorporate host switching at all) and its often gross overestimation of duplications and sorting events (Ronquist, 1995). The overestimation of duplications and sorting events is directly related to the prohibition of host switching.

In the previous example, BPA and TreeMap provided different reconstructions to explain the relationship between the same two host and parasite phylogenetic hypotheses. This difference was expected because BPA uses host-switching events to explain incongruence while TreeMap and other reconciliation methods prefer to use duplications in combination with sorting events instead. The exact search function mentioned earlier does allow a posteriori manipulation of the reconciled tree to determine if any host-switching events may give a better reconstruction (i.e., one with a larger number of cospeciation events). For this particular example, TreeMap does allow a host switch that does not increase the total number of cospeciations, but does decrease the number of duplications and sorting events, making it the more attractive reconstruction.

Though the methodologies and philosophies behind BPA and TreeMap differ, they do agree on a couple of conceptual points (Hoberg et al., 1997). First, as was believed by von Ihering, Kellogg, and Fahrenholz, the relationships between hosts and parasites can be explained by examining the associations between host and parasite phylogenetic hypotheses. Second, both methods agree that complete cospeciation, or congruence, is the null hypothesis. They also agree that congruence between phylogenetic hypotheses equals cospeciation and that incongruence between phylogenetic hypotheses requires ad hoc explanations. Both methods attempt in different ways to minimize ad hoc assumptions, but reconciliation methods, at the cost of providing less parsimonious reconstructions, will postulate extra ad hoc assumptions in order to maximize cospeciation. Often, the two methods will produce similar results (Hoberg et al., 1997; Paterson and Gray, 1997), but the interpretations of the incongruence between phylogenetic hypotheses will differ.

The protocol

Proponents of both methods claim that there is a big difference between BPA and reconciliation methods (Hoberg et al., 1997; Paterson and Gray, 1997; Van Veller and Brooks, 2001), but on the surface they appear similar, which has proven confusing to many users. Both methods share the same null hypothesis of vicariant speciation, and both make the assumption that phylogenetic and distribution data of taxa are informative for the reconstruction of the historical relationships (Brooks and McLennan, 1991). They differ in the way each method derives the resolved coevolutionary tree (Enghoff, 1996). Both methods are parsimony based, but BPA is an a posteriori parsimony method, while TreeMap is considered a priori parsimony (Van Veller et al., 2000). The a posteriori methods use parsimony at the end of the analysis to indicate how many exceptions to the null hypothesis of cospeciation are present between the associated phylogenetic hypotheses and which events are responsible for the differences (Van Veller and Brooks, 2001). TreeMap and other a priori parsimony methods reconcile the associated phylogenetic hypotheses as much as possible to achieve the initial null hypothesis of cospeciation, using duplications with an appropriate number of extinctions to decrease the amount of incongruence (Van Veller and Brooks, 2001). Thus, reconciliation methods sacrifice parsimony in order to maximize cospeciation and better fit the null hypothesis.

Hoberg et al. (1997) have argued that BPA is the better method and Paterson and Gray (1997) have argued in favor of reconciliation methods. The question is, which is the better method and how is one method deemed better than the other? This study sets out to test the two methods head to head using artificially created phylogenies and associations based upon the occurrence of specific coevolutionary events. These artificial associations allow comparison of the two methods’ results because the “true” history is known and therefore there is a right answer to each analysis. Furthermore, the results from these analyses can be extrapolated to real, empirical data sets because of the way each method works. The methods are based entirely upon the structure of the phylogenetic hypotheses and the associations between them. No account is made for the statistical strength of the tree or the data that were required to produce it. Therefore, a host switch without subsequent speciation will graphically appear the same in real associations as it does in the artificial situations involved in this study and will produce the same results.

In this study, the “better” method is the one that produces reconstructions with the fewest ad hoc assumptions, in other words, the reconstruction that deviates the least from the known test case.

For each test, the same nine-taxa host “phylogeny” (Fig. 17) is used, while the associated parasite phylogeny and/or the associations are different. The host phylogeny was created with nine taxa to allow for enough possible combinations of coevolutionary events between the two trees to produce a large sample of trials. Also, the design of the tree was intentional. Rather than using a simple pectinate tree, the host phylogeny was designed to represent two distinct lineages (Taxa A–D and E–J) and in an asymmetrical manner to eliminate potential bias that may be present in a symmetrical tree topology. Also, this design allowed for the testing of host-switching events both between and within lineages to determine if the amount of relatedness between host species has an effect on the amount of error generated by the methods during host-switching events.

Fig. 17. The nine-taxa host phylogeny used for all test trials. Notice the two distinct clades [(A–D) and (E–J)] as well as the asymmetry built into the host tree to allow greater test variability.

Fig. 18. Trial 1 parasite phylogeny superimposed upon associated host phylogeny. The dashed arrow indicates a host-switching event without speciation.

Fig. 19. Host–parasite associations of pocket gophers and lice modified from Hafner et al. (1994, Fig. 2b). The pocket gopher phylogeny is on the left and the louse phylogeny on the right with associations mapped by dotted lines.

Each parasite phylogeny used in the testing was created under the assumption that coevolution is occurring with the host, but not in a perfectly congruent manner. The number of taxa in each parasite phylogeny is dependent upon the coevolutionary events that occur after the initial colonization of a host species. These events lead to either congruence between phylogenetic hypotheses through cospeciations or incongruence through host switches and/or sorting events. A little more than half of the trials deal with at least a single host-switching event since this seems to be the event that causes the largest problems with these methods. Appendix A contains a figure for each test performed in the study with the parasite phylogeny overlaying the host phylogeny and, where appropriate, host switches are indicated by an arrow leading from the original host to the new host. Host switches to an extinct ancestor were prohibited in this study to maintain realism. For Brooks parsimony analysis, the additive binary coding is done by hand and MacClade v3.0 (Maddison and Maddison, 1992) with DELTRAN optimization is used to map parasite characters onto the host phylogeny. For the reconciliation analysis, TreeMap v1.0 (Page, 1994b) is used with the exact search function to allow for host switching in order to further maximize cospeciation, while in turn reducing the amount of hypothesized sorting events and duplications. For an example of how the test trials were run and analyzed, I will work through test trial 1, starting with Fig. 18, which shows a parasite phylogeny superimposed upon its corresponding host phylogeny. The arrow on the parasite phylogeny indicates a host switch from host B to D without a speciation event so that host B still has parasite II and host D now has parasites IV and II. Fig. 18 represents the “true” history of the hosts and parasites with four coevolutionary events and one host-switching event.
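The additive binary coding step used for BPA was done by hand in this study, but the idea can be sketched in code. The sketch below assumes the usual form of the coding: every node of the parasite phylogeny (terminal or ancestral) becomes one binary character, and a host is scored 1 for each parasite it harbors and for every ancestor of that parasite. The parasite tree, node numbers, and associations are hypothetical and are not those of trial 1.

```python
# Hypothetical parasite tree as node -> parent: terminals 1-3, ancestors 4 and 5.
PARASITE_PARENT = {1: 4, 2: 4, 3: 5, 4: 5, 5: None}
# Hypothetical associations: parasite 2 is widespread (hosts B and D),
# as it would be after a host switch without speciation.
ASSOCIATIONS = {"A": [1], "B": [2], "C": [3], "D": [2]}


def additive_binary_coding(parent, associations):
    """Return the character list and a host -> 0/1 row matrix."""
    characters = sorted(parent)  # one character per parasite node
    matrix = {}
    for host, parasites in associations.items():
        present = set()
        for node in parasites:
            # Score the parasite itself and each of its ancestors as present.
            while node is not None:
                present.add(node)
                node = parent[node]
        matrix[host] = [int(c in present) for c in characters]
    return characters, matrix


characters, matrix = additive_binary_coding(PARASITE_PARENT, ASSOCIATIONS)
for host, row in matrix.items():
    print(host, row)
```

Hosts B and D receive identical rows here because they share the widespread parasite 2. The resulting matrix is what would then be optimized on the host phylogeny.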

Finally, three published host–parasite phylogenetic hypotheses based on empirical data are analyzed with BPA and TreeMap and the results are compared. The first involves pocket gophers and their lice (Hafner et al., 1994), the second involves seabirds and lice (Paterson et al., 2000), and the third involves primates and pinworms (Hugot, 1999). The three associations tested are displayed in Figs. 19–21.

Results

When BPA is performed for these two associated phylogenies (Fig. 18), the resulting cladogram (Fig. 22) shows four cospeciation events and two host-switching events, therefore only requiring one ad hoc hypothesis to explain the relationship. In fact, the extra host-switching event depicts one of the well-known flaws of the BPA method, which will be discussed later. The test of coevolution performed by TreeMap gives a very different reconstruction (Fig. 23), which includes only one cospeciation event, three duplication events, and eight sorting events. This explanation requires 11 extra ad hoc assumptions, does not suggest a host switch, and falls three cospeciations short of the “true history.” Obviously, for this example, one would have to conclude that Brooks parsimony analysis is the better method. This is only one example; associations can be created to show major flaws in both methodologies.

The overall results represented in Table 5 have a few noticeable trends. The first and most surprising was that BPA actually predicted more cospeciation events than TreeMap. This was unexpected since TreeMap often sacrifices the parsimony criterion in order to further maximize the number of cospeciation events, but in fact, there was not a single trial in which TreeMap predicted more cospeciation events than BPA. Apparently there is a big difference between minimizing ad hoc assumptions and maximizing congruence after all. TreeMap clearly prefers to use sorting events and duplications to explain the historical associations between the host and the parasite. This was expected since TreeMap postulates these two events rather than host switches, but the incredibly large number of sorting events postulated (248 total) was unexpected. Overall these results inform us that BPA is the more consistently accurate method, but they do not tell us in which circumstances either method performs better than the other. To uncover this, we will need to break down the table into sections based upon the types of coevolutionary events involved in each trial. Tables 6a–6e group the trials into five categories: (1) host switching without speciation (Table 6a); (2) host switching with speciation (Table 6b); (3) duplications (Table 6c); (4) sorting events (Table 6d); and (5) combinations of coevolutionary events (Table 6e). Upon breaking the data down into these subcategories, the disparity between the two methods becomes evident. For the events represented in Tables 6a, 6b, and 6e, TreeMap greatly overestimates the number of sorting events in the various trials.

Fig. 20. Host–parasite associations of seabirds and lice modified from Paterson et al. (2000, Fig. 3). The seabird phylogeny is on the left and the louse phylogeny on the right with associations mapped by dotted lines.

Fig. 21. Host–parasite associations of primates and pinworms modified from Hugot (1999, Fig. 3). The primate phylogeny is on the left and the pinworm phylogeny on the right with associations mapped by dotted lines.

Fig. 22. BPA reconstruction for trial 1 illustrating a host switch of parasite 2 from host B to host D. Notice that parasite character 10 is also homoplasious, but is obviously a ghost taxon because it is an ancestor of parasite 2. Also notice that parasite character 11, which infested an ancestor of hosts A, B, and C, is incorrectly placed on the same branch as parasite character 12. This is also a ghost taxon, because it was an ancestor of parasite character 10 and would be represented as a homoplasy on host D, but is more parsimoniously placed on the same branch with parasite character 12.

Fig. 23. TreeMap reconciliation for trial 1 illustrating five cospeciations (black circles), three duplications (black squares), and a total of eight sorting events.

Table 5
Results from BPA and TreeMap for 62 test trials

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
1      |   8   0   1   0 |   8   0   1   0 |   5   3   0   8
2      |   8   0   1   0 |   8   0   1   0 |   5   3   0  10
3      |   8   0   1   0 |   8   0   4   0 |   4   4   0  16
4      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
5      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
6      |   8   0   1   0 |   8   0   2   0 |   6   2   0   6
7      |   8   0   1   0 |   8   0   4   0 |   4   4   0  15
8      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
9      |   8   0   1   0 |   8   0   2   0 |   6   2   0   6
10     |   8   0   1   0 |   8   0   2   0 |   6   2   0   7
11     |   8   0   1   0 |   8   0   3   0 |   5   3   0  10
12     |   8   0   1   0 |   8   0   3   0 |   5   3   0   9
13     |   4   0   1   0 |   4   0   3   0 |   2   2   0   9
14     |   8   0   2   0 |   8   0   6   0 |   3   5   0  17
15     |   8   0   2   0 |   8   0   3   0 |   5   3   0   7
16     |   8   0   2   0 |   8   0   3   0 |   5   3   0   9
17     |   8   0   2   0 |   8   0   3   0 |   5   3   0   8
18     |   8   0   2   0 |   8   0   4   0 |   4   4   0  12
19     |   8   0   1   0 |   8   0   1   0 |   8   0   1   0
20     |   8   0   1   0 |   8   0   2   0 |   8   0   1   0
21     |   8   0   1   0 |   8   0   2   0 |   8   0   1   0
22     |   4   0   1   0 |   4   0   2   0 |   4   0   1   0
23     |   4   0   1   0 |   4   0   2   0 |   4   0   1   0
24     |   8   0   1   0 |   8   0   1   0 |   8   0   1   0
25     |  10   0   1   0 |  10   0   1   0 |  10   0   1   0
26     |   9   0   1   0 |   9   0   1   0 |   9   0   1   0
27     |   6   0   1   0 |   6   0   1   0 |   6   0   1   0
28     |   9   1   0   0 |   9   1   0   0 |   9   1   0   0
29     |   8   1   0   0 |   8   1   0   0 |   8   1   0   0
30     |   8   1   0   0 |   8   1   0   0 |   8   1   0   0
31     |   9   1   0   0 |   9   1   0   0 |   9   1   0   0
32     |  10   1   0   0 |  10   1   0   0 |  10   1   0   0
33     |  10   1   0   0 |  10   1   0   0 |  10   1   0   0
34     |  11   1   0   0 |  11   1   0   0 |  11   1   0   0
35     |   6   0   0   1 |   6   0   0   1 |   6   0   0   1
36     |   7   0   0   1 |   7   0   0   2 |   7   0   0   1
37     |   6   0   0   2 |   6   0   0   2 |   6   0   0   2
38     |   7   0   0   1 |   7   0   0   2 |   7   0   0   1
39     |   6   0   0   2 |   6   0   0   3 |   6   0   0   2
40     |   6   0   0   2 |   6   0   0   3 |   6   0   0   2
41     |   5   0   0   3 |   5   0   0   3 |   5   0   0   3
42     |   5   0   0   1 |   5   0   0   1 |   5   0   0   1
43     |   5   0   0   2 |   5   0   0   4 |   5   0   0   2
44     |   5   0   0   3 |   5   0   0   7 |   5   0   0   3
45     |   7   1   0   3 |   8   0   1   0 |   7   0   1   1
46     |  10   2   0   1 |  10   2   0   1 |  10   2   0   1
47     |  10   1   0   1 |  10   1   0   1 |   8   0   3   1
48     |   9   1   0   1 |   9   1   0   1 |   9   1   0   1
49     |  10   2   0   2 |  10   2   0   1 |  10   1   1   1
50     |   7   1   0   2 |   7   1   0   3 |   7   1   0   2
51     |   9   1   0   1 |   9   1   0   1 |   9   1   0   1
52     |  10   1   0   2 |  10   1   0   2 |  10   1   0   2
53     |   9   1   0   2 |   9   1   0   3 |   9   1   0   2
54     |   7   0   1   1 |   6   0   2   1 |   6   0   1   0
55     |   9   0   1   1 |   9   1   0   0 |   9   0   1   1
56     |   7   0   1   4 |   7   0   2   2 |   7   0   1   3
57     |   8   0   1   1 |   7   0   2   0 |   7   0   1   1
58     |   8   0   1   1 |   7   0   1   0 |   7   0   1   1
59     |   7   0   1   1 |   7   0   4   3 |   3   4   0  17
60     |   7   0   1   1 |   7   0   4   2 |   3   4   0  16
61     |   7   0   1   1 |   7   0   3   0 |   5   2   0   7
62     |   7   0   1   1 |   7   0   3   2 |   4   3   0  11
Totals | 476  18  41  45 | 474  18  85  51 | 406  80  19 248

Note. Columns under the “Actual history” heading apply to the number of coevolutionary events built into each test trial (C, cospeciation; D, duplications; H, host switches; S, sorting events).

Table 6a
Results from 18 test trials involving host switching without speciation

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
1      |   8   0   1   0 |   8   0   1   0 |   5   3   0   8
2      |   8   0   1   0 |   8   0   1   0 |   5   3   0  10
3      |   8   0   1   0 |   8   0   4   0 |   4   4   0  16
4      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
5      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
6      |   8   0   1   0 |   8   0   2   0 |   6   2   0   6
7      |   8   0   1   0 |   8   0   4   0 |   4   4   0  15
8      |   8   0   1   0 |   8   0   2   0 |   6   2   0   4
9      |   8   0   1   0 |   8   0   2   0 |   6   2   0   6
10     |   8   0   1   0 |   8   0   2   0 |   6   2   0   7
11     |   8   0   1   0 |   8   0   3   0 |   5   3   0  10
12     |   8   0   1   0 |   8   0   3   0 |   5   3   0   9
13     |   4   0   1   0 |   4   0   3   0 |   2   2   0   9
14     |   8   0   2   0 |   8   0   6   0 |   3   5   0  17
15     |   8   0   2   0 |   8   0   3   0 |   5   3   0   7
16     |   8   0   2   0 |   8   0   3   0 |   5   3   0   9
17     |   8   0   2   0 |   8   0   3   0 |   5   3   0   8
18     |   8   0   2   0 |   8   0   4   0 |   4   4   0  12
Totals | 140   0  23   0 | 140   0  50   0 |  88  52   0 161

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

Table 6b
Results from nine test trials involving host switching with speciation

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
19     |   8   0   1   0 |   8   0   1   0 |   8   0   1   0
20     |   8   0   1   0 |   8   0   2   0 |   8   0   1   0
21     |   8   0   1   0 |   8   0   2   0 |   8   0   1   0
22     |   4   0   1   0 |   4   0   2   0 |   4   0   1   0
23     |   4   0   1   0 |   4   0   2   0 |   4   0   1   0
24     |   8   0   1   0 |   8   0   1   0 |   8   0   1   0
25     |  10   0   1   0 |  10   0   1   0 |  10   0   1   0
26     |   9   0   1   0 |   9   0   1   0 |   9   0   1   0
27     |   6   0   1   0 |   6   0   1   1 |   6   0   1   0
Totals |  65   0   9   0 |  65   0  13   0 |  65   0   9   0

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

Results from the analyses of the three published associations are displayed in Table 7.

Overall, BPA produces reconstructions with fewer duplications and sorting events and slightly more host switches, typically due to ghost taxa. One major difference lies in the number of reconstructions provided to the researcher by each method. TreeMap produced more than 480 reconstructions for the gopher/lice analysis and more than 1880 reconstructions for the primate/pinworm analysis. BPA typically produces one or very few most-parsimonious reconstructions.

Discussion

Table 6c
Results from seven test trials involving duplication events

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
28     |   9   1   0   0 |   9   1   0   0 |   9   1   0   0
29     |   8   1   0   0 |   8   1   0   0 |   8   1   0   0
30     |   8   1   0   0 |   8   1   0   0 |   8   1   0   0
31     |   9   1   0   0 |   9   1   0   0 |   9   1   0   0
32     |  10   1   0   0 |  10   1   0   0 |  10   1   0   0
33     |  10   1   0   0 |  10   1   0   0 |  10   1   0   0
34     |  11   1   0   0 |  11   1   0   0 |  11   1   0   0
Totals |  65   7   0   0 |  65   7   0   0 |  65   7   0   0

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

Table 6d
Results from 10 test trials involving sorting events

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
35     |   6   0   0   1 |   6   0   0   1 |   6   0   0   1
36     |   7   0   0   1 |   7   0   0   2 |   7   0   0   1
37     |   6   0   0   2 |   6   0   0   2 |   6   0   0   2
38     |   7   0   0   1 |   7   0   0   2 |   7   0   0   1
39     |   6   0   0   2 |   6   0   0   3 |   6   0   0   2
40     |   6   0   0   2 |   6   0   0   3 |   6   0   0   2
41     |   5   0   0   3 |   5   0   0   3 |   5   0   0   3
42     |   5   0   0   1 |   5   0   0   1 |   5   0   0   1
43     |   5   0   0   2 |   5   0   0   4 |   5   0   0   2
44     |   5   0   0   3 |   5   0   0   7 |   5   0   0   3
Total  |  52   0   0  17 |  52   0   0  27 |  52   0   0  17

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

The subgrouping of the results allows us to more easily see how the methods perform under specific conditions. The first group (trials 1–18) involves variations of host switching without speciation (Table 6a), causing the widespread taxa and/or redundant distribution problems discussed earlier in the paper. Immediately we can see that TreeMap has problems with these host-switching events while BPA seems to perform very well. TreeMap grossly overestimates the number of sorting events and duplications, which are both zero for this group of tests. TreeMap predicts 161 sorting events and 52 duplications over the 18 host-switching trials (Table 6a). That is an average of 12 postulated sorting events and duplications per trial, which means an additional 10 or 11 ad hoc assumptions required by TreeMap to explain the associations between the host and parasite phylogenies. Notice that TreeMap does not hypothesize a single host switch among these trials. TreeMap also underestimates the number of cospeciation events by 52 over the 18 trials, which is almost 3 fewer cospeciations per trial. BPA, on the other hand, postulates only a few additional (27 extra) host-switching events, which was expected because of the problem of ghost taxa discussed earlier. Remember, ghost taxa in no way affect the tree topology; they act only as more of an eyesore (see Fig. 22).
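The figures quoted for trials 1–18 can be rechecked directly from Table 6a. The short script below transcribes the table and recomputes the totals and per-trial averages. One definition is an assumption on my part: "additional ad hoc assumptions" is taken to mean the non-cospeciation events (D + H + S) postulated by a method minus those in the actual history, which reproduces the "10 or 11" per trial quoted in the text.

```python
# Table 6a rows as (C, D, H, S) for: actual history, BPA, TreeMap.
TABLE_6A = {
    1: ((8, 0, 1, 0), (8, 0, 1, 0), (5, 3, 0, 8)),
    2: ((8, 0, 1, 0), (8, 0, 1, 0), (5, 3, 0, 10)),
    3: ((8, 0, 1, 0), (8, 0, 4, 0), (4, 4, 0, 16)),
    4: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 4)),
    5: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 4)),
    6: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 6)),
    7: ((8, 0, 1, 0), (8, 0, 4, 0), (4, 4, 0, 15)),
    8: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 4)),
    9: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 6)),
    10: ((8, 0, 1, 0), (8, 0, 2, 0), (6, 2, 0, 7)),
    11: ((8, 0, 1, 0), (8, 0, 3, 0), (5, 3, 0, 10)),
    12: ((8, 0, 1, 0), (8, 0, 3, 0), (5, 3, 0, 9)),
    13: ((4, 0, 1, 0), (4, 0, 3, 0), (2, 2, 0, 9)),
    14: ((8, 0, 2, 0), (8, 0, 6, 0), (3, 5, 0, 17)),
    15: ((8, 0, 2, 0), (8, 0, 3, 0), (5, 3, 0, 7)),
    16: ((8, 0, 2, 0), (8, 0, 3, 0), (5, 3, 0, 9)),
    17: ((8, 0, 2, 0), (8, 0, 3, 0), (5, 3, 0, 8)),
    18: ((8, 0, 2, 0), (8, 0, 4, 0), (4, 4, 0, 12)),
}


def column_totals(table, method):
    """Sum C, D, H and S over all trials for one method (0, 1 or 2)."""
    return tuple(sum(row[method][i] for row in table.values()) for i in range(4))


actual = column_totals(TABLE_6A, 0)
bpa = column_totals(TABLE_6A, 1)
treemap = column_totals(TABLE_6A, 2)
trials = len(TABLE_6A)

dup_plus_sort = treemap[1] + treemap[3]            # TreeMap duplications + sorting events
extra_events = sum(treemap[1:]) - sum(actual[1:])  # assumed definition, see lead-in
missed_cospeciations = actual[0] - treemap[0]
extra_bpa_switches = bpa[2] - actual[2]

print(f"TreeMap D + S: {dup_plus_sort} ({dup_plus_sort / trials:.1f} per trial)")
print(f"TreeMap extra events: {extra_events} ({extra_events / trials:.1f} per trial)")
print(f"Cospeciations missed by TreeMap: {missed_cospeciations} "
      f"({missed_cospeciations / trials:.1f} per trial)")
print(f"Extra BPA host switches: {extra_bpa_switches}")
```

The recomputed values (213 duplications plus sorting events, about 11.8 per trial; about 10.6 extra events per trial; 52 missed cospeciations, about 2.9 per trial; 27 extra BPA host switches) agree with the rounded figures in the text.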

To break the results down further, an interesting pattern arises from trials 1–18, which involve single or multiple host switches between and/or within the two host lineages. A switch within one of the lineages did not cause as much of a problem for TreeMap as did a switch between the two lineages. The range of ad hoc assumptions for within-lineage host switching was between 5 and 9 events, but host switches between lineages had between 8 and 19 events. Thus, a host switch between lineages makes TreeMap much less reliable than its already unreliable performance involving within-lineage host switches. When two host-switching events occur, the number of events postulated by TreeMap roughly doubles. BPA still postulates only 0–4 extra ghost taxa host switches. This evidence favors implementing BPA instead of TreeMap, because with any host-switching event without speciation, TreeMap has the potential to seriously lead one astray, while BPA will, if at all, mislead with only a few, easily detectable ghost taxa.

The next subgroup of trials (19–27) involved similar tests of host switching as above, but this time with speciation occurring as a result of the host switch. Unlike the previous host-switching events, these do not cause serious problems for TreeMap (Table 6b). The exact search function consistently recovered the correct coevolutionary history. For these trials, both methods performed well; TreeMap reconstructions were exact matches and BPA reconstructions were also exact or slightly off due to the ghost taxa error. Notice that with these host-switching with speciation events, widespread taxa are no longer a problem, but redundant parasite distributions are still present. Therefore, it must be either the widespread parasites or a combination of widespread parasites and redundant distributions that cause TreeMap to fail.

Table 6e
Results from 18 test trials involving various combinations of duplications, host switches, and/or sorting events

       | Actual history  | Brooks parsimony| TreeMap, best
       |                 | analysis        | reconstruction
Trial  |   C   D   H   S |   C   D   H   S |   C   D   H   S
45     |   7   1   0   3 |   8   0   1   0 |   7   0   1   1
46     |  10   2   0   1 |  10   2   0   1 |  10   2   0   1
47     |  10   1   0   1 |  10   1   0   1 |   8   0   3   1
48     |   9   1   0   1 |   9   1   0   1 |   9   1   0   1
49     |  10   2   0   2 |  10   2   0   1 |  10   1   1   1
50     |   7   1   0   2 |   7   1   0   3 |   7   1   0   2
51     |   9   1   0   1 |   9   1   0   1 |   9   1   0   1
52     |  10   1   0   2 |  10   1   0   2 |  10   1   0   2
53     |   9   1   0   2 |   9   1   0   3 |   9   1   0   2
54     |   7   0   1   1 |   6   0   2   1 |   6   0   1   0
55     |   9   0   1   1 |   9   1   0   0 |   9   0   1   1
56     |   7   0   1   4 |   7   0   2   2 |   7   0   1   3
57     |   8   0   1   1 |   7   0   2   0 |   7   0   1   1
58     |   8   0   1   1 |   7   0   1   0 |   7   0   1   1
59     |   7   0   1   1 |   7   0   4   3 |   3   4   0  17
60     |   7   0   1   1 |   7   0   4   2 |   3   4   0  16
61     |   7   0   1   1 |   7   0   3   0 |   5   2   0   7
62     |   7   0   1   1 |   7   0   3   2 |   4   3   0  11
Total  | 148  11   9  27 | 146  11  22  23 | 130  21  10  69

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

Table 7
Results from BPA and TreeMap analysis of three sets of published data

                                          | Brooks parsimony| TreeMap, best   | No. TreeMap
                                          | analysis        | reconstruction  | reconstructions
Trial                                     |   C   D   H   S |   C   D   H   S |
Pocket gopher/louse (Hafner et al., 1994) |   9   0  10   3 |   9   1   6   7 | > 480
Seabird/louse (Paterson et al., 2000)     |  10   0   4   4 |  10   0   3   6 | 2
Primates/pinworms (Hugot, 1999)           |   7   2   6   0 |   7   7   2  13 | > 1880

Note. C, cospeciation; D, duplications; H, host switches; S, sorting events.

With this in mind, a series of trials involving only widespread taxa was performed. Considering that in the four trials testing only widespread taxa (trials 59–62), the average number of extra coevolutionary events postulated was 14, with a range of 7–19 events, the evidence strongly suggests that widespread parasites are the source of TreeMap’s unreliability. It appears that Page’s choice to use the one-host-per-parasite assumption is the direct cause of this problem since the algorithm does not take into consideration the possibility of a host switch without speciation. When dealing with widespread parasites in a coevolutionary study, the best method to use is BPA. Of all the unknowns involved when dealing with histories and historical associations of organisms, widespread parasites are one observable character of an association that we can “know.” Since a widespread parasite is simply a parasite species that infests multiple host species, it is easy to recognize, which makes it a valid decision to implement BPA instead of TreeMap.
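The average of 14 and the range of 7–19 extra events quoted for trials 59–62 can be reproduced from the TreeMap columns of Table 6e. The paper does not spell out the formula, so the definition used below is an inference: extra events are the duplications, host switches, and sorting events postulated by TreeMap minus those built into the actual history. With that definition the reported figures are recovered exactly.

```python
# (C, D, H, S) built into each of trials 59-62, from Table 6e.
ACTUAL = (7, 0, 1, 1)
# TreeMap best reconstructions for the same trials, from Table 6e.
TREEMAP = {
    59: (3, 4, 0, 17),
    60: (3, 4, 0, 16),
    61: (5, 2, 0, 7),
    62: (4, 3, 0, 11),
}


def extra_events(reconstruction, actual):
    """Non-cospeciation events postulated beyond those in the true history."""
    return sum(reconstruction[1:]) - sum(actual[1:])


extras = {trial: extra_events(rec, ACTUAL) for trial, rec in TREEMAP.items()}
mean_extra = sum(extras.values()) / len(extras)

print(extras)
print(mean_extra, min(extras.values()), max(extras.values()))
```

The per-trial values come out as 19, 18, 7, and 12 extra events, giving a mean of 14 and a range of 7–19.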

The next two groups of tests involve separately testing the effects of duplications and extinctions. In the trials involving duplications (trials 28–34), both methods accurately predicted the correct reconstruction. Neither method had any specific problem when dealing with duplications (Table 6c). On the other hand, the methods performed differently during the sorting event (extinction) trials (35–44). Once again, TreeMap handled these trials flawlessly, but BPA had some slight (10 additional extinctions), but expected, problems (Table 6d). During these trials, the problem of ancestral extinctions discussed earlier was encountered. When a parasite is missing from a host, BPA shows the previous ancestors also as reversals when mapping the parasite characters onto the host phylogeny. Like the ghost characters problem, this extinction problem is also due to the independent treatment of characters through additive binary coding. If we institute Wiley’s (1988a,b) suggestion of treating absent parasites as missing data, the problem disappears. Just like the ghost taxa, it is obvious on the tree when this problem occurs, and therefore it can be overlooked as a flaw in the methodology that has no effect on the actual structure of the BPA tree.

The remaining group of trials (45–62) includes tests using combinations of the different coevolutionary events. These trials were performed to ensure that normally benign events did not become problematical when combined together. No unexpected results were obtained (Table 6e). Trials that included host switches leading to widespread taxa were no more problematical than expected when combined with other events.

The final analyses involved the three empirical data sets obtained from published studies (pocket gophers and lice; seabirds and lice; primates and pinworms). For all three examples, both methods recovered the same estimates of cospeciation events, but disagreed in the number of additional coevolutionary events. Results were as expected based upon the previous trials, in which BPA typically found a slightly more parsimonious reconstruction with typically more host-switching events, but far fewer duplications and sorting events. For example, in the primate/pinworm example, BPA estimated 5 host switches versus 2 for the TreeMap reconstruction, but only estimated 2 duplications and 1 sorting event as opposed to 7 and 13, respectively, for TreeMap. If not for the problems of ghost taxa discussed earlier, the estimates of host switching by BPA would have been slightly lower. These results followed the patterns uncovered in the various test trials.

Another major difference between the results was the variation between reconstructions provided by each method. Brooks parsimony analysis resulted in one most-parsimonious reconstruction, while TreeMap provided more than 480 reconstructions for the gopher/lice data sets and more than 1880 reconstructions for the primate/pinworm associations. The researcher is left to sort through each reconstruction to determine whether it is the optimal reconstruction to that point or not. Often the optimal reconstruction (i.e., the reconstruction with maximal cospeciation events and minimal sorting events, duplications, and host switches) had a variation of two to six steps in the various categories of coevolutionary events, making it difficult to choose one topology as the optimal choice. The seabird/lice data set did not have this problem.

Conclusion

These tests demonstrate that neither methodology is always accurate; however, Brooks parsimony analysis was the more reliable method for two main reasons. First, TreeMap grossly overestimates duplications and sorting events when widespread taxa due to host switching are present in the associations between the host and parasite phylogenies. Second, the ghost taxa that BPA mistakenly produces do not cause any topological changes in the tree, are readily recognizable, and are easy to interpret. The patterns of method performance found in the test trials also appear when real data sets are analyzed, which allows extrapolation from the artificial to the real world. Since neither method is completely error-free, nor is the true history of the associations knowable, it therefore seems logical to choose the method that provides the smallest range of error.

Acknowledgments

I thank Dan Brooks, Arnold Kluge, and other anonymous reviewers for their thorough review and helpful comments, which greatly improved the manuscript. Thanks are also due to Jennifer Ast and Heather Heying for their helpful reviews long ago during the very rough drafts of this paper.

Appendix

The host phylogeny is represented in black and the parasite phylogeny is in gray. Host switches are designated as a solid arrow for a host switch with speciation and a dashed arrow for a host switch without speciation. Loss or extinction of a parasite is indicated by a stoppage of the parasite phylogeny along the host phylogeny. Duplications, or independent speciations of the parasite, are represented by a split, or doubling, of the parasite phylogeny.

Figure panels (images not reproduced): Trials 1–6; Trials 7–12; Trials 13–18; Trials 19–24; Trials 25–30; Trials 31–36; Trials 37–42; Trials 43–48; Trials 49–54; Trials 55–60; Trials 61 and 62.

References

Brooks, D.R., 1979. Testing the context and extent of host–parasite

coevolution. Syst. Zool. 28, 299–307.

Brooks, D.R., 1981. Hennig parasitological method—a proposed

solution. Syst. Zool. 30, 229–249.

Brooks, D.R., 1985. Historical ecology: a new approach to studying

the evolution of ecological associations. Ann. Missouri Bot.

Garden 72, 660–680.

Brooks, D.R., 1990. Parsimony analysis in historical biogeography

and coevolution—methodological and theoretical update. Syst.

Zool. 39, 14–30.

Brooks, D.R., McLennan, D.A., 1991. Phylogeny, Ecology, and

Behavior: A Research Program in Comparative Biology. Univer-

sity of Chicago Press, Chicago.

Brooks, D.R., McLennan, D.A., 1993. Parascript: Parasites and the

Language of Evolution. Smithsonian Press, Washington, DC.

Brooks, D.R., McLennan, D.A., 2001. A comparison of a discovery-

based and an event-based method of historical biogeography. J.

Biogeogr. 28, 757–767.

Brooks, D.R., McLennan, D.A., 2002. The Nature of Discovery: An

Evolutionary Voyage of Discovery. University of Chicago Press,

Chicago.

Brooks, D.R., Van Veller, M.G.P., McLennan, D.A., 2001. How to do

BPA, really. J. Biogeogr. 28, 345–358.

Charleston, M.A., 1998. Jungles: a new solution to the host/parasite

phylogeny reconciliation problem. Math. Biosci. 149, 191–223.

Enghoff, H., 1996. Widespread taxa, sympatry, dispersal, and an

algorithm for resolved area cladograms. Cladistics 12, 349–364.

Hafner, M.S., Sudman, P.D., Villablanca, F.X., Sprading, T.A.,

Demastes, J.W., Nadler, S.A., 1994. Disparate rates of molecular

evolution in cospeciating host and parasites. Science 265, 1087–

1090.

Hoberg, E.P., Brooks, D.R., Siegel-Causey, D., 1997. Host–parasite

co-speciation: history, principles, and prospects. In: Clayton, D.H.,

Moore, J. (Eds.), Host–Parasite Evolution: General Principles and

Avian Models. Oxford University Press, Oxford, pp. 212–235.

Hugot, J.P., 1999. Primates and their pinworm parasites: the Cameron

hypothesis revisited. Syst. Biol. 48, 523–546.

Kluge, A.G., Farris, J.S., 1969. Quantitative phyletics and the

evolution of anurans. Syst. Zool. 18, 1–32.

Maddison, W., Maddison, D., 1992. MacClade: Analysis of Phylogeny

and Character Evolution. Sinauer, Sunderland, MA.

Nelson, G., Platnick, N.I., 1981. Systematics and Biogeography:

Cladistics and Vicariance. Columbia University Press, New York.

Page, R.D.M., 1990. Component analysis—a valiant failure. Cladistics

6, 119–136.

Page, R.D.M., 1994a. Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Syst. Biol. 43, 58–77.

Page, R.D.M., 1994b. Parallel ‘‘phylogenies’’—reconstructing the

history of host–parasite assemblages. Cladistics 10, 155–173.

Paterson, A.M., Banks, J., 2001. Analytical approaches to measuring

cospeciation of host and parasites: through a glass, darkly. Int. J.

Parasitol. 31, 1012–1022.

Paterson, A.M., Gray, R.D., 1997. Host–parasite co-speciation, host

switching, and missing the boat. In: Clayton, D.H., Moore, J.

(Eds.), Host–Parasite Evolution: General Principles and Avian

Models. Oxford University Press, Oxford, pp. 236–250.

Paterson, A.M., Wallis, G.P., Wallis, L.J., Gray, R.D., 2000. Seabird and louse coevolution: complex histories revealed by 12S rRNA sequences and reconciliation analyses. Syst. Biol. 49, 383–399.

Ronquist, F., 1995. Reconstructing the history of host–parasite

associations using generalized parsimony. Cladistics 11, 73–89.

Ronquist, F., 1998. Three-dimensional cost-matrix optimization and

maximum cospeciation. Cladistics 14, 167–172.


Ronquist, F., Nylin, S., 1990. Process and pattern in the evolution of

species associations. Syst. Zool. 39, 323–344.

Siddall, M.E., 1996. Phylogenetic covariance probability: confidence

and historical associations. Syst. Biol. 45, 48–66.

Van Veller, M.G.P., Brooks, D.R., 2001. When simplicity is not

parsimonious: a priori and a posteriori methods in historical

biogeography. J. Biogeogr. 28, 1–12.

Van Veller, M.G.P., Kornet, D.J., Zandee, M., 2000. Methods in

vicariance biogeography: assessment of the implementations of

assumptions 0, 1, and 2. Cladistics 16, 319–345.

Wiley, E.O., 1988a. Parsimony analysis and vicariance biogeography.

Syst. Zool. 37, 271–290.

Wiley, E.O., 1988b. Vicariance biogeography. Annu. Rev. Ecol. Syst.

19, 513–542.

A.P.G. Dowling / Cladistics 18 (2002) 416–435