Journal of General Virology (2001), 82, 1867–1876. Printed in Great Britain

Phylogenetic relationships of flaviviruses correlate with their epidemiology, disease association and biogeography

Michael W. Gaunt,1,5 Amadou A. Sall,2,5 Xavier de Lamballerie,3,5 Andrew K. I. Falconar,4 Tatyana I. Dzhivanian6 and Ernest A. Gould5

1 Pathogen Molecular Biology and Biochemistry Unit, Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
2 Institut Pasteur de Dakar, Dakar, Senegal
3 Unité de Virus Emergents, Faculté de Médecine, Boulevard Jean Moulin, 13005 Marseille, France
4 Wellcome Trust Centre for the Epidemiology of Infectious Disease, Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3FY, UK
5 Centre for Ecology and Hydrology (formerly Institute of Virology and Environmental Microbiology), Mansfield Road, Oxford OX1 3SR, UK
6 Institute of Poliomyelitis and Viral Encephalitis, Moscow, Russia

Phylogenetic analysis of the Flavivirus genus, using either partial sequences of the non-structural 5 gene or the structural envelope gene, revealed an extensive series of clades defined by their epidemiology and disease associations. These phylogenies identified mosquito-borne, tick-borne and no-known-vector (NKV) virus clades, which could be further subdivided into clades defined by their principal vertebrate host. The mosquito-borne flaviviruses revealed two distinct epidemiological groups: (i) the neurotropic viruses, often associated with encephalitic disease in humans or livestock, correlated with the Culex species vector and bird reservoirs and (ii) the non-neurotropic viruses, associated with haemorrhagic disease in humans, correlated with the Aedes species vector and primate hosts. Thus, the tree topology describing the virus–host association may reflect differences in the feeding behaviour between Aedes and Culex mosquitoes. The tick-borne viruses also formed two distinct groups: one group associated with seabirds and the other, the tick-borne encephalitis complex viruses, associated primarily with rodents. The NKV flaviviruses formed three distinct groups: one group, which was closely related to the mosquito-borne viruses, associated with bats; a second group, which was more genetically distant, also associated with bats; and a third group associated with rodents. Each epidemiological group within the phylogenies revealed distinct geographical clusters in either the Old World or the New World, which for mosquito-borne viruses may reflect an Old World origin. The correlation between epidemiology, disease correlation and biogeography begins to define the complex evolutionary relationships between the virus, vector, vertebrate host and ecological niche.

Author for correspondence: Michael Gaunt (at the London School of Hygiene and Tropical Medicine). Fax +44 20 7636 5739. e-mail michael.gaunt@lshtm.ac.uk

0001-7708 © 2001 SGM

Introduction

The Flavivirus genus contains many viruses associated with emerging and re-emerging human diseases, including dengue haemorrhagic fever, Kyasanur Forest haemorrhagic disease, Japanese encephalitic disease, Rocio virus encephalitis (Monath & Heinz, 1990) and West Nile fever (Lanciotti et al., 1999). Elucidating the evolution of viruses is particularly valuable for understanding the origin and spread of emerging and re-emerging diseases (Holmes, 1998).

Flaviviruses are a useful model for studying the evolution of vector-borne virus diseases, since they comprise mosquito-borne, tick-borne and no-known-vector (NKV) viruses (Porterfield, 1980). The genus contains about 70 recognized flaviviruses that are antigenically related and have a widespread geographical dispersion. They are positive-stranded RNA viruses with a genome of approximately 10.5 kb. Virions contain three structural proteins, capsid (C), membrane (M) and envelope (E), and infected cells have been shown to contain seven non-structural (NS) proteins, NS1, NS2A, NS2B, NS3, NS4A, NS4B and NS5 (Rice et al., 1985; Rice, 1996).

Table 1. Flaviviruses analysed in this study

All of the flaviviruses that were analysed in this study are classified by their virus group (Heinz et al., 2001).

| Virus group | Flavivirus analysed, excluding flavivirus E gene sequences (presented separately) | Abbreviation | Flavivirus partial E genes sequenced in this study | Abbreviation | Accession no. |
|---|---|---|---|---|---|
| Louping ill | Louping ill | LI | | | |
| | Spanish sheep encephalitis | SSE | | | |
| | Turkish sheep encephalitis | TSE | | | |
| | Greek goat encephalitis | GGE | | | |
| Tick-borne encephalitis | Far-east Asian subtype | FETBE | | | |
| | West European subtype | WTBE | | | |
| Mammalian tick-borne | Kyasanur Forest disease | KFD | Gadgets Gully | GGY | AF372408 |
| | Langat | LGT | Kadam | KAD | AF372420 |
| | Omsk haemorrhagic fever | OHF | | | |
| | Powassan | POW | | | |
| | Royal Farm | RF | | | |
| | Karshi | KSI | | | |
| Seabird tick-borne | Tyuleniy | TYU | Meaban | MEA | AF372423 |
| | Saumarez Reef | SRE | | | |
| Aroa | Iguape | IGU | Aroa | AROA | AF372413 |
| | | | Bussuquara | BSQ | AF372410 |
| | | | Naranjal | NJL | AF372411 |
| Dengue | Dengue types 1–3, 4 | DEN1–3, 4 | | | |
| | Kedougou | KED | | | |
| Japanese encephalitis | Japanese encephalitis | JE | Alfuy | ALF | AF372406 |
| | Murray Valley encephalitis | MVE | Cacipacore | CPC | AF372417 |
| | Saint Louis encephalitis | SLE | West Nile (tick) | WN (tick) | AF372405 |
| | Usutu | USU | | | |
| | Kunjin | KUN | | | |
| | Yaounde | YAO | | | |
| Ntaya | Tembusu | TMU | Bagaza | BAG | AF372407 |
| | THCr* | THCr* | Ilheus | ILH | AF372414 |
| | | | Rocio | ROC | AF372409 |
| | | | Israel turkey meningoencephalomyelitis | IT | AF372415 |
| | | | Ntaya | NTA | AF372416 |
| Spondweni | | | Spondweni | SPO | AF372412 |
| | | | Zika | ZIK | AF372422 |
| Yellow Fever | Banzi | BAN | Edge Hill | EH | AF372419 |
| | Bouboui | BOU | Jugra | JUG | AF372418 |
| | Potiskum | POT | Saboya | SAB | AF372421 |
| | Sepik | SEP | | | |
| | Uganda S | UGS | | | |
| | Yellow Fever | YF | | | |
| Kokobera | Kokobera | KOK | | | |
| | Stratford | STR | | | |
| Entebbe bat | Entebbe bat | ENT | | | |
| | Sokoluk | SOK | | | |
| | Yokose | YOK | | | |
| Modoc | Apoi | APOI | | | |
| | Cowbone Ridge | CR | | | |
| | Jutiapa | JUT | | | |
| | Modoc | MOD | | | |
| | Sal Vieja | SV | | | |
| | San Perlita | SP | | | |
| Rio Bravo | Rio Bravo | RB | | | |
| | Bukalasa | BUK | | | |
| | Carey Island | CI | | | |
| | Dakar Bat | DK | | | |
| | Montana myotis leukoencephalitis | MML | | | |
| | Phnom Penh | PPB | | | |
| | Batu Cave | BC | | | |

* Additional flavivirus analysed by Kuno et al. (1998).

The evolution, dispersal patterns and epidemiological characteristics of flaviviruses are believed to have been determined through a combination of constraints imposed by the arthropod vector, the vertebrate hosts, the associated ecology and the influence of human commercial activity. For example, the clinal evolution of the tick-borne encephalitis (TBE) complex viruses across the Euro-Asian land mass reflects the life-cycle and feeding habits of the ixodid tick (Zanotto et al., 1995) combined with the appropriate rodent host species and climatic conditions (Randolph et al., 2000). Similarly, the introduction of goats and sheep onto the hillsides of Turkey, Greece, Spain, Ireland, Norway and the British Isles was followed by the appearance of louping ill (LI) virus (Reid, 1984; Gao et al., 1993; Gaunt et al., 1997; McGuire et al., 1998). The emergence and expansion of dengue haemorrhagic fever in the tropics has followed an increase in human and mosquito population densities brought about by urbanization and industrialization (Zanotto et al., 1996). Finally, the trans-Atlantic dispersal of yellow fever (YF) virus, and possibly many other flaviviruses, was thought to have coincided with the transportation of people and mosquitoes from Africa to the Americas on slave ships during the past few centuries (Strode, 1951; Gould et al., 1997).

Using detailed molecular phylogenetic analyses, we have attempted to bring all these factors together in order to understand the nature of flavivirus evolution, epidemiology and dispersal. In this study, maximum likelihood (ML) phylogenetic analyses of virtually all of the recognized flaviviruses were performed using partial NS5 gene sequences and the tree was compared with one based on the E gene sequences. The epidemiological and aetiological characteristics of each flavivirus have been mapped onto the phylogeny to reveal a striking pattern of coincidence between the topological arrangement of the viruses and their associated epidemiological characteristics.

Methods

The flavivirus cDNA sequences determined in this study and the flavivirus abbreviations used hereafter are described in Table 1. Viral RNA extraction and RT–PCR procedures have been described previously (Gritsun & Gould, 1995; Gaunt et al., 1997). A nested set of primers encompassing the species-specific amino acid motifs defined previously (Marin et al., 1995) was designed to amplify a partial E gene locus for the genus Flavivirus. The PCR primers amplified 85% of the flaviviruses described by Calisher et al. (1989) and will be described elsewhere. Amplified cDNA was cloned into pGEM-T vectors (Promega) and sequenced using the Thermo-Sequenase Cycle Sequencing kit (Amersham), according to the manufacturer's instructions. cDNA sequences were determined from two recombinant plasmids prepared from each virus and any differences were resolved using a third clone.

Phylogenetic analyses were performed using the partial NS5 sequences described previously by Kuno et al. (1998), with the addition of LI virus, strain 369 (Y07863). The flavivirus E gene sequences were obtained in this study and also included E gene sequences from previous studies (Marin et al., 1995; Billoir et al., 2000). Partial NS5 gene nucleotide sequences were aligned using CLUSTAL X and edited manually from their amino acid alignment (Thompson et al., 1994). The partial E gene sequences were exclusively aligned using their amino acid sequences and the final alignment comprised 308 amino acids (including gaps).
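Editing a nucleotide alignment from its amino acid alignment amounts to placing each codon under the residue it encodes and opening a three-base gap under every protein gap. The sketch below illustrates that idea only; the alignment in this study was edited manually, and the function names `thread_codons` and `drop_third_positions` and the toy sequences are hypothetical. The second helper shows how third codon positions can then be stripped from a codon-aligned sequence.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Place each codon of an unaligned coding sequence under its aligned residue.

    Every gap in the protein alignment becomes a three-base gap, so the
    reading frame is preserved in the resulting nucleotide alignment.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) != 3 * residues:
        raise ValueError("coding sequence must contain exactly one codon per residue")
    pieces = []
    pos = 0  # index of the next unused base in the coding sequence
    for aa in aligned_protein:
        if aa == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(cds[pos:pos + 3])
            pos += 3
    return "".join(pieces)


def drop_third_positions(codon_alignment: str) -> str:
    """Keep only first and second codon positions of an in-frame alignment."""
    return "".join(base for i, base in enumerate(codon_alignment) if i % 3 != 2)


# Toy example (hypothetical sequences, not taken from the study).
aligned = thread_codons("MK-L", "ATGAAACTG")
print(aligned)                        # ATGAAA---CTG
print(drop_third_positions(aligned))  # ATAA--CT
```

Because gaps are inserted in multiples of three, removing every third column afterwards leaves first and second codon positions correctly paired across taxa.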

The robustness of the data sets was examined for deviations in nucleotide or amino acid base composition between taxa, as nucleotide or amino acid base homogeneity is a prerequisite for the mutation models used in this study. The NS5 nucleotide alignment required the removal of the third codon position to obtain nucleotide base homogeneity (χ²-test, P < 0.05 for all three codon positions and P > 0.99 for codon positions one and two only) (PAUP*, version 4.0; Swofford, 1999). The E gene amino acid alignment required the removal of cell fusing agent virus, which had sometimes been used in previous analyses, to obtain amino acid homogeneity under the same test (P > 0.05 for all other species) (Puzzle; Strimmer & von Haeseler, 1996). Nucleotide variation was examined using a sliding window analysis (SWAN; Proutski & Holmes, 1998). The NS5 gene was further investigated for possible saturation using a 10 bp sliding window analysis, in which nucleotide variation (Var) was estimated as an entropy function. The sliding window analysis of the mosquito-borne flaviviruses, and of each mosquito-borne flavivirus clade described later, identified a region between nucleotides 342 and 392 of the 693 bp NS5 gene alignment showing the highest level of variation (Var = 0.72–1.21). This region coincided with a large alignment gap and proved difficult to align using amino acids; therefore, these nucleotides were subsequently removed.

The ML model for NS5 gene nucleotide substitution was determined by testing 40 models of nucleotide substitution (MODELTEST; Posada & Crandall, 1998), which identified an eight parameter model consisting of the general time reversible (GTR) model of nucleotide substitution (six parameters), an invariant rate parameter (PINVAR) and the alpha parameter of a four category discrete gamma distribution (Γ; one parameter). The ML model parameters were estimated by an automated reiterative ML heuristic search (PAUP*) and from Jukes–Cantor distances (MODELTEST). Parameter estimates were incorporated into a full heuristic ML search for ten replications. The level of phylogenetic support was determined by bootstrap re-sampling using ML distances incorporating each parameter estimate and a full heuristic search for 1000 replications (PAUP*). In addition, 100 Monte Carlo DNA sequence simulations were performed from the final NS5 gene tree topology and reconstructed using the ML parameter estimates, as described previously (Seq-Gen; Rambaut & Grassly, 1997).
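The 10 bp sliding-window estimate of nucleotide variation can be illustrated with a short sketch. The exact entropy function implemented in SWAN is not given in the text, so the version below is an assumption: it takes the Shannon entropy (in bits) of each alignment column and averages it over each window. The function names and the toy alignment are hypothetical, and the resulting values are not expected to be on the same scale as the Var values reported above.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps and ambiguity codes."""
    counts = Counter(base for base in column if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sliding_window_variation(seqs: list[str], window: int = 10) -> list[float]:
    """Mean per-site entropy for every window of `window` consecutive columns."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    per_site = [column_entropy("".join(s[i] for s in seqs)) for i in range(length)]
    return [sum(per_site[i:i + window]) / window for i in range(length - window + 1)]


# Toy alignment (hypothetical): only the last column varies between the two taxa.
profile = sliding_window_variation(["AAAA", "AAAC"], window=2)
print(profile)  # [0.0, 0.0, 0.5]
```

Under this assumed definition, windows with consistently high values mark candidate hyper-variable regions of the kind that were removed from the NS5 alignment.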

Evolutionary reconstructions of partial E gene amino acid alignments were performed by ML quartet puzzling using the JTT substitution matrix (Jones et al., 1992) for 10000 quartet puzzling steps (Puzzle; Strimmer & von Haeseler, 1996). Likelihoods were analysed for 0, 4, 8 and 16 discrete Γ categories and the presence or absence of a PINVAR was assessed using a likelihood ratio test.
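A likelihood ratio test of this kind compares twice the difference in log-likelihood between two nested models with a χ² distribution. The sketch below is a minimal illustration for a single extra parameter (for example, a model with PINVAR versus one without); the log-likelihood values are hypothetical and are not taken from this study, and the function name is an assumption. For one degree of freedom the χ² tail probability has the closed form erfc(√(x/2)), so only the standard library is needed.

```python
import math


def likelihood_ratio_test_1df(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by one free parameter.

    lnl_null: log-likelihood of the simpler model.
    lnl_alt:  log-likelihood of the model with one additional parameter.
    Returns the test statistic and its chi-square (1 d.f.) P value.
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic < 0:
        raise ValueError("the richer model cannot have a lower maximized likelihood")
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test_1df(lnl_null=-1000.0, lnl_alt=-998.0)
print(f"2*deltaLnL = {stat:.2f}, P = {p:.4f}")
```

When the extra parameter is constrained to be non-negative, as a proportion of invariant sites is, the plain χ² approximation is generally regarded as conservative, so the P value from this sketch would tend to overstate rather than understate the evidence needed.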

The phylogenies obtained from the partial NS5 and E gene sequences of each flavivirus were tested for their association with particular arthropod vectors, disease associations (haemorrhagic and encephalitic) and their geographical distribution using the International Catalogue of Arboviruses (Karabatsos, 1995).

Results

Phylogenetic reconstruction

All NS5 gene phylogenies produced a single likelihood regardless of the taxa order in each full heuristic search performed (n ≥ 10 in all cases; Table 2). Specific details for building the ML NS5 gene phylogeny and the model parameters used are described in Table 2.

The robustness (convergence) of the E gene ML phylogeny was observed after 10000 quartet puzzling steps from two key indicators. Firstly, only 3% of the quartets for all of these searches were unresolved. Secondly, although six models of rate substitution were tested (see Methods), the quartet puzzling support differed by a maximum of 1%. Thus, no difference in branching order between the E gene models was observed, regardless of the combination of rate parameters used. The phylogeny for the significantly highest likelihood, as determined by likelihood ratio tests, was an eight category Γ distribution with PINVAR.

Figs 1 and 2 present the phylogenies for the NS5 gene and the E gene, respectively. The topologies showed congruence at all levels where bootstrap or quartet puzzling support was greater than 60%. Equivocal incongruence was observed for SLE, AROA, BSQ and NJL virus monophyly and the DEN viruses between the two phylogenies presented in Figs 1 and 2.

Table 2. ML tree building method for NS5 amino acid-based nucleotide alignment

A six parameter GTR (a–f), PINVAR and a four category discrete Γ distribution were used. J–C denotes the Jukes–Cantor distance method, ML denotes an automated reiterative maximum likelihood parameter estimation. Re-estimation of the model of nucleotide substitution for the truncated alignment was identical to the original and the resulting parameter estimates were assessed for convergence at two decimal places (d.p.). The parameters of the GTR model showed no convergence between phylogenies at 2 d.p. and a second ML automated reiterative search was conducted to re-estimate the six parameters, incorporating the values obtained for Γ and PINVAR.

| Calculation | Nucleotide region removed | GTR a | GTR b | GTR c | GTR d | GTR e | GTR f | Γ, α | PINVAR | Replications per estimate | No. full searches including parameters* | Likelihood (-ln) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. J–C | None | 2.2536 | 1.8041 | 1.0767 | 1.3668 | 4.3338 | 1 | 0.8346 | 0.2625 | – | 30 | 19085.5 |
| 2. ML | None | 4.3028 | 2.3345 | 1.2051 | 2.0317 | 5.7642 | 1 | 0.8401 | 0.2489 | 310 | 10 | 19040.2 |
| 3. J–C | Hyper-variable | 2.4503 | 1.7691 | 1.0382 | 1.4249 | 4.5630 | 1 | 0.835 | 0.274 | – | – | – |
| – | Convergence at 2 d.p. | – | – | – | – | – | – | 0.84 | 0.25 | | | |
| 4. ML | Hyper-variable | 4.1869 | 2.1711 | 1.1514 | 1.9341 | 5.8155 | 1 | 0.84 | 0.25 | 1620 | 10 | 16577.8 |

* A full heuristic ML search incorporating the parameter estimates.

Fig. 1. Phylogenetic tree of the genus Flavivirus based on the NS5 gene sequence, constructed using an eight parameter model (GTR, Γ and PINVAR). Brackets denote the principal vector clade or the principal vertebrate host clade for the viruses therein. The numbers above each lineage show the percentage bootstrap support for that branch; numbers below 60 are not shown. Unequivocal vertebrate host clades are designated for monophyletic groups containing two or more virus species.

As demonstrated previously (Kuno et al., 1998), the NS5 gene phylogeny defined three major groups comprising the mosquito-borne, tick-borne and NKV flaviviruses. However, both analyses showed that the three NKV bat-associated viruses, ENT, SOK and YOK viruses, were grouped within the mosquito-borne virus clades forming a basal lineage with YF and SEP viruses.

Aedes clades and the Culex monophyly

Mapping epidemiological and disease characteristics of the individual mosquito-borne viruses onto the phylogenetic trees revealed a correlation between the principal vector genera (Culex and Aedes species), the principal vertebrate hosts (birds and/or mammals) and the virus tropisms in humans and livestock (neurotropic versus non-neurotropic) (Figs 1 and 2). The mosquito-borne viruses could be divided into two epidemiologically distinct vector groups, those that were primarily isolated from Aedes species and those that were primarily isolated from Culex species. The 17 flaviviruses that were primarily isolated from Aedes species formed two paraphyletic groups, one containing YF virus and the other containing the DEN viruses. In these two paraphyletic groups, 82% of mosquito-borne flaviviruses are known to be associated with Aedes species (14/17), hereafter denoted as the Aedes clades. The other viruses in these clades, i.e. SEP, POT, BAN and SAB, have been primarily associated with Mansonia species, rodents, Culex species and sandflies, respectively. The major exception is the mosquito species Haemagogus, the sylvatic vector of YF virus in South America, although this does not apply to the urban vector. The flaviviruses primarily isolated from Culex species formed a single clade of 23 viruses, which included JE, SLE and WN viruses as notable examples. The mosquito vector was identified for at least 16 of these viruses (Karabatsos, 1995). Of these 16, 89% of mosquito-borne flaviviruses in the Culex species monophyly are known to be isolated from Culex species (14/16), hereafter denoted as the Culex clade.

Fig. 2. Phylogenetic tree of the genus Flavivirus based on the E gene amino acid sequence, constructed using the JTT substitution model and incorporating an additional two parameters (Γ and PINVAR). Nomenclature is identical to that in Fig. 1.

Figs 1 and 2 show that there is a clear correlation between the virus, mosquito vector species and associated host. Both Aedes clades contained viruses that were maintained in sylvatic primate cycles, namely YF, DEN and ZIK viruses, or other mammals, while birds were not strongly associated with any of the viruses in the Aedes clades. Although Aedes clade viruses may infect birds, they are believed to be 'dead-end hosts'. In contrast, none of the Culex clade viruses are maintained in primate cycles. Moreover, a high proportion of flaviviruses in the Culex clade are associated with mosquito–bird cycles (at least 12/23 viruses) (Figs 1 and 2). However, mammals are additionally involved in the persistence of Culex clade viruses, although many are considered dead-end hosts. For example, pigs play a role in the maintenance of JE viruses, while bats could be involved in WN virus persistence (Monath & Heinz, 1990). Furthermore, AROA, IGU, BSQ and NJL viruses, which form a single clade, could be maintained by rodents.

Mosquito-borne viruses normally associated with neurological disease in humans or livestock, leading to encephalitis in severe cases, were found in the Culex clade and were generally associated with viruses that cycled between mosquitoes and birds. DEN virus from the Aedes clades was the exception, since there are rare cases of DEN encephalitis (Lum et al., 1996; Hommel et al., 1998; Solomon et al., 2000). In contrast, the mosquito-borne flaviviruses that are normally associated with haemorrhagic disease were exclusive to the Aedes clades and were associated with viruses that cycle between mosquitoes and primate hosts.

Clade robustness

The Culex and Aedes clades showed robust quartet support in the partial E gene phylogenetic tree and robust bootstrap support using the partial NS5 gene sequence for a single Aedes clade containing YF virus. The robustness of the distinction between the Aedes clade that contained the DEN viruses and the Culex clade was separately assessed using a Monte Carlo simulation. The observed phylogeny was subject to 100 Monte Carlo simulations, reconstructed using ML (reconnection limit = 1) and manually assessed for alternative topologies. Distinct Culex and Aedes clades were observed in 98% of all simulations, while the SPO and ZIK virus sister group formed a trifurcation with the Aedes clade containing DEN virus and the Culex clade in 9% of these simulations.

There are also significant differences between the relative positions of some flaviviruses, as presented in the NS5-derived tree shown in Fig. 1 and those presented by Kuno et al. (1998). For example, ZIK and SPO viruses are positioned together with other Aedes-associated viruses (Fig. 1); they had previously been placed in different positions among the mosquito-borne viruses by Kuno et al. (1998). Secondly, and in contrast with the previous analysis (Kuno et al., 1998), KED virus showed close phylogenetic relationships with SPO and ZIK viruses (Fig. 1), confirming the published serological data (Karabatsos, 1995).

Tick-borne and NKV flaviviruses

Vertebrate host clades were also observed in the tick-borne and NKV flaviviruses for both the NS5 and E gene trees. In the NS5 gene phylogeny, NKV viruses, for which APOI virus was the basal lineage, were subdivided into rodent and bat clades. The rodent clade showing robust bootstrap support contained CR, JUT, MOD, SV and SP viruses, whereas the bat clade showing robust bootstrap support contained BUK, CI, DK, PPB and RB viruses (Figs 1 and 2). APOI virus is also associated with rodents and, therefore, could be included in the rodent clade. The bat MML virus was also included in the bat clade, despite low bootstrap support. The remaining three NKV (bat) viruses, ENT, SOK and YOK, in the NS5 gene phylogeny form a sister group with the Aedes clade containing YF virus and are maintained by robust bootstrap support. From the point of view of their evolutionary origins, it is important to note that the rodent clade NKV viruses, with the exception of APOI virus, have only been isolated in the New World, whereas bat-associated viruses have been isolated from the Old and the New World, although none has been isolated in both regions of the world.

The TBE complex viruses were primarily associated with ixodid (hard) ticks, mainly Ixodes species, and rodent hosts. The second group, consisting of tick-borne seabird-associated viruses (MEA, TYU and SRE), was most frequently isolated from Ornithodoros species or Ixodes uriae. KAD virus, which is associated with Rhipicephalus appendiculatus in Africa and Hyalomma pravus in Saudi Arabia, and GGY virus, which is associated with seabirds and Ixodes uriae, represent early lineages in the two tick-borne clades and possibly indicate a genetic link between the two tick-borne flavivirus groups.

Geographical distribution

The geographical distribution of the mosquito-borne flaviviruses was also examined to see whether or not virus dispersal correlated with either the Aedes clades or the Culex clade. With the exception of YF virus and DEN virus, which are believed to have originated in the Old World but can now also be found in the New World, all other viruses in the Aedes clades are only found in the Old World. On the other hand, the viruses in the Culex clade show geographical clustering, but genetically closely related viruses in the Culex clade have been widely dispersed to the Americas, Africa, Asia and Australasia, i.e. the Old and the New World.

Discussion

Early attempts to define flavivirus interrelationships and their evolutionary characteristics were based on antigenic cross reactivity in neutralization, complement fixation and haemagglutination inhibition tests (de Madrid & Porterfield, 1974; Porterfield, 1980; Calisher et al., 1989). Classification schemes based on these criteria have proved helpful in understanding the flaviviruses, but many of the viruses have subsequently been shown to be incorrectly assigned within the schemes. Molecular sequencing and phylogenetic reconstructions have largely overcome these problems and have provided important insights into the taxonomy (Heinz et al., 2000) and dispersal of flaviviruses (Gould et al., 1997). The association of specific flaviviruses with particular arthropod vectors and vertebrate hosts has been defined precisely and a list of these characteristics for each virus is available in the International Catalogue of Arboviruses (Karabatsos, 1995). Despite these extensive data, there have been few previous attempts to correlate molecular evolution with epidemiological and ecological features of the flaviviruses. The phylogenetic trees presented here have extended previous analyses of the flavivirus NS5 (Kuno et al., 1998; Billoir et al., 2000) and E gene phylogenetic trees (Marin et al., 1995; Zanotto et al., 1995). By mapping these biological characteristics onto the trees, the phylogenetic analyses presented in this paper demonstrate a striking series of correlations between molecular phylogenetic and ecological/epidemiological characteristics.

It was demonstrated previously (Marin et al., 1995; Kuno et al., 1998) that the Flavivirus genus was monophyletic and three distinct groups of viruses, namely tick-borne, mosquito-borne and NKV viruses, diverge at the deepest nodes. We have now demonstrated that the mosquito-borne viruses are subdivided into the Culex clade and Aedes clades. Moreover, the evolution of the Culex clade appears to have occurred after the separation of the mosquito-borne viruses from the tick-borne and NKV viruses. These observations were supported by the congruence between the NS5 and E gene phylogenies as well as by Monte Carlo simulation and quartet puzzling support.

The dominance of Aedes and Culex species (subfamily Culicinae) in flavivirus transmission is explained by the species prevalence of each of the genera, which contain 975 and 769 species, respectively, and comprise more species than all other mosquito genera combined (1522 species). Aedes and Culex mosquitoes are also among the small number of genera that are globally dispersed. Blood-meal data obtained for Aedes species suggest that mammals are the primary hosts of most species, which could explain the Aedes clades–primate/mammal association (Mitchell, 1988; Christensen et al., 1996; Clements, 1999). The feeding patterns of only relatively few species of Culex mosquito are known, although a small number of bird- or mammal-specific species have been identified. Many Culex species feed indiscriminately on both mammals and birds and they include the principal vectors for several flaviviruses in the Culex clade, such as C. annulirostris (MVE virus), C. tritaeniorhynchus (JE virus), C. tarsalis (SLE virus) and C. univittatus (WN virus) (Robertson et al., 1993; Christensen et al., 1996; Clements, 1999). The difference in feeding behaviour between Aedes and Culex mosquitoes provides a clear explanation for the associations between Aedes-borne flaviviruses and mammals or between Culex-borne flaviviruses and birds. Moreover, it explains why the association between the Aedes clades and mammals appears to be unequivocal, while the association between the Culex clade and birds contains a number of notable exceptions.

The second major correlation was between the type of disease produced and the mosquito clade in which each virus appeared. In general, severe infections caused by some Aedes species viruses result in haemorrhagic disease, whereas many Culex species viruses cause encephalitic disease; however, cases of DEN (Aedes species-associated) encephalitis have been reported, although these seem to be very rare (Lum et al., 1996; Hommel et al., 1998; Solomon et al., 2000). Until the precise basis of flavivirus pathogenicity has been defined at the molecular level, it is not possible to understand why these different disease associations can be seen in the phylogenetic tree. In contrast with the mosquito-borne flaviviruses, different viruses in the tick-borne virus groups produce encephalitic disease, but OHF and KFD viruses may also produce haemorrhagic disease in humans and this does not appear to correlate with either their phylogenetic or their geographical characteristics.

Phylogenetic divisions between Old and New World flaviviruses were seen throughout the NS5 and E gene phylogenies. In some instances, dispersal of flaviviruses could be readily linked with the vertebrate host, providing evidence of the importance of the host in flavivirus evolution. In the case of viruses that established infections in bats, it is easy to imagine dispersal to remote sites, as the Old World bats from which flaviviruses have been isolated are known to migrate hundreds of kilometres (Shilton et al., 1999). On the other hand, individual rodent-associated NKV viruses might be expected to show a more restricted distribution and this is demonstrated by their detection almost exclusively in the New World and by their localized or niche-like distribution.

Virtually all of the tick-borne flaviviruses are exclusively Old World, with the exception of POW virus. The seabird-associated tick-borne viruses were dispersed to geographical areas where they established niches in seabird colonies in both the Northern and the Southern hemispheres (TYU, SRE and MEA viruses) (Chastel et al., 1985). Early in their evolution, the TBE complex viruses appear to have been dispersed either by seabirds or by rodents and their associated ticks. As they reached the forests of Asia, they became established predominantly in Ixodes species, where they continued their clinal evolution into Europe (Gao et al., 1993; Zanotto et al., 1995; Gould et al., 1997).

The earliest evolutionary lineages in the mosquito-borne virus clades appear to have radiated to geographically distant parts of the Old World and to a wide variety of species, i.e. bats, Aedes species, sandflies and large animals, including simians and humans. Only YF virus and the four DEN virus serotypes, which cause human epidemics, are found in the New World. There is strong evidence to support the notion that YF virus was introduced to the Americas from the Old World during the past few centuries when slaves were transported across the Atlantic Ocean (Strode, 1951; Monath & Heinz, 1990; Gould et al., 1997).

There are also reasons to believe that DEN viruses have an African ancestry. The other members of the Aedes clade containing the DEN, ZIK, SPO and KED viruses were all isolated from Africa and formed two paraphyletic lineages to the DEN viruses. In addition, the E gene phylogenies of endemic/epidemic and sylvatic DEN viruses show a basal position for Old World sylvatic lineages of DEN1, DEN2 (Africa and Malaysia) and DEN4 (Wang et al., 2000). The vector of DEN virus, Aedes aegypti, is also believed to have originated in Africa (Tabachnick, 1991). There is no reason to believe that DEN virus could not have been shipped to the Americas from the Old World in the same way as YF virus. Therefore, as most of the other Aedes species-associated viruses are found solely in Africa and since the Culex species-associated viruses appear to be descendants of the Aedes species-associated viruses, the mosquito-borne flaviviruses appear to have evolved out of Africa.

In conclusion, the flaviviruses that are recognized today represent a diverse group of viruses that could have emerged and dispersed during the past 10 000 years, i.e. since the most recent ice age (Zanotto et al., 1996). The characteristic epidemiological groupings of the viruses that are apparent in the phylogenetic trees illustrate the significant influence of the invertebrate vectors, the vertebrate hosts and the particular ecological niches into which these species have evolved.

We gratefully acknowledge the financial support from the NERC and the Wellcome Trust. We thank E. C. Holmes (University of Oxford) for help and advice as well as A. Lilley (CEH), G. Clarke and M. A. Miles (LSHTM). We thank Dr R. Shope (University of Texas) and Dr J. S. Porterfield (previously University of Oxford) for supplying most of the virus stocks used in this study.

References

Billoir, F., de Chesse, R., Tolou, H., de Micco, P., Gould, E. A. & de Lamballerie, X. (2000). Phylogeny of the genus Flavivirus using complete coding sequences of arthropod-borne viruses and viruses with no known vector. Journal of General Virology 81, 781–790.

Calisher, C. H., Karabatsos, N., Dalrymple, J. M., Shope, R. E., Porterfield, J. S., Westaway, E. G. & Brandt, W. E. (1989). Antigenic relationships between flaviviruses as determined by cross-neutralisation tests with polyclonal antisera. Journal of General Virology 70, 37–43.

Chastel, C., Main, A. J., Guiguen, C., le Lay, G., Quillien, M. C., Mannat, J. Y. & Beaucournu, J. C. (1985). The isolation of Meaban virus, a new flavivirus from the seabird tick Ornithodoros (Alectorobius) maritimus in France. Archives of Virology 83, 129–140.

Christensen, H. A., de Vasquez, A. M. & Boreham, M. M. (1996). Host-feeding patterns of mosquitoes (Diptera: Culicidae) from central Panama. American Journal of Tropical Medicine and Hygiene 55, 202–208.

Clements, A. N. (1999). The Biology of Mosquitoes: Sensory Reception and Behaviour, 2nd edn. Wallingford, Oxford: CABI Publishing.

de Madrid, A. T. & Porterfield, J. S. (1974). The flaviviruses (group B arboviruses): a cross-neutralisation study. Journal of General Virology 23, 91–96.

Gao, G. F., Hussain, M. H., Reid, H. W. & Gould, E. A. (1993). Classification of a new member of the TBE flavivirus subgroup by its immunological, pathogenetic and molecular characteristics: identification of subgroup-specific pentapeptides. Virus Research 30, 129–144.

Gaunt, M. W., Jones, L. D., Laurenson, K., Hudson, P. J., Reid, H. W. & Gould, E. A. (1997). Definitive identification of louping ill virus by RT–PCR and sequencing in field populations of Ixodes ricinus on the Lochindorb estate. Archives of Virology 142, 1181–1191.

Gould, E. A., Zanotto, P. M. A. & Holmes, E. C. (1997). The genetic evolution of the flaviviruses. In Factors in the Emergence of Arbovirus Diseases, pp. 51–63. Edited by J.-F. Saluzzo & B. Dodet. Paris: Elsevier.

Gritsun, T. S. & Gould, E. A. (1995). Infectious transcripts of tick-borne encephalitis virus, generated in days by RT–PCR. Virology 214, 611–618.

Heinz, F. X., Collett, M. S., Purcell, R. H., Gould, E. A., Howard, C. R., Houghton, M., Moormann, R. J. M., Rice, C. M. & Thiel, H. J. (2000). Family Flaviviridae. In Virus Taxonomy. Seventh International Committee for the Taxonomy of Viruses, pp. 859–878. Edited by M. H. V. van Regenmortel, C. M. Fauquet & D. H. L. Bishop. San Diego: Academic Press.

Holmes, E. C. (1998). Molecular epidemiology and evolution of emerging infectious diseases. British Medical Bulletin 54, 533–543.

Hommel, D., Talarmin, A., Deubel, V., Reynes, J. M., Drouet, M. T., Sarthou, J. L. & Hulin, A. (1998). Dengue encephalitis in French Guiana. Research in Virology 149, 235–238.

Jones, D. T., Taylor, W. R. & Thornton, J. M. (1992). The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences 8, 275–282.

Karabatsos, N. (1995). International Catalogue of Arboviruses. San Antonio: American Society of Tropical Medicine and Hygiene.

Kuno, G., Chang, G.-J. J., Tsuchiya, K. R., Karabatsos, N. & Cropp, C. B. (1998). Phylogeny of genus Flavivirus. Journal of Virology 72, 73–83.

Lanciotti, R. S., Roehrig, J. T., Deubel, V., Smith, J., Parker, M., Steele, K., Crise, B., Volpe, K. E., Crabtree, M. B., Scherret, J. H., Hall, R. A., MacKenzie, J. S., Cropp, C. B., Panigrahy, B., Ostlund, E., Schmitt, B., Malkinson, M., Banet, C., Weissman, J., Komar, N., Savage, H. M., Stone, W., McNamara, T. & Gubler, D. J. (1999). Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States. Science 286, 2333–2337.

Lum, L. C. S., Lam, S. K., Choy, Y. S., George, R. & Harun, F. (1996). Dengue encephalitis: a true entity? American Journal of Tropical Medicine and Hygiene 54, 256–259.

McGuire, K., Holmes, E. C., Gao, G. F., Reid, H. W. & Gould, E. A. (1998). Tracing the origins of louping ill virus by molecular phylogenetic analysis. Journal of General Virology 79, 981–988.

Marin, M. S., Zanotto, P. M., Gritsun, T. & Gould, E. A. (1995). Phylogeny of TYU, SRE, and CFA virus: different evolutionary rates in the genus Flavivirus. Virology 206, 1133–1139.

Mitchell, C. J. (1988). Occurrence, biology and physiology of diapause in overwintering mosquitoes. In The Arboviruses: Epidemiology and Ecology, vol. 1. Edited by T. P. Monath. Boca Raton: CRC Press.

Monath, T. P. & Heinz, F. X. (1990). Flaviviruses. In Fields Virology, 2nd edn, pp. 763–814. Edited by B. N. Fields & D. M. Knipe. New York: Raven Press.

Porterfield, J. S. (1980). Antigenic characteristics and classification of Togaviridae. In The Togaviruses, pp. 13–46. Edited by R. W. Schlesinger. New York: Academic Press.

Posada, D. & Crandall, K. A. (1998). MODELTEST: testing the model of DNA substitution. Computer Applications in the Biosciences 14, 817–818.

Proutski, V. & Holmes, E. C. (1998). SWAN: a new Macintosh application for the sliding window analysis of nucleotide sequence variability. Computer Applications in the Biosciences 14, 467–468.

Rambaut, A. & Grassly, N. C. (1997). Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Computer Applications in the Biosciences 13, 235–238.

Randolph, S. E., Green, R. M., Peacey, M. F. & Rogers, D. J. (2000). Seasonal synchrony: the key to tick-borne encephalitis foci identified by satellite data. Parasitology 121, 15–23.

Reid, H. W. (1984). Epidemiology of louping ill. In Vectors in Biology, pp. 161–178. London: Academic Press.


Rice, C. M. (1996). Flaviviridae: the viruses and their replication. In Fields Virology, 3rd edn, pp. 931–959. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. Philadelphia: Lippincott–Raven.

Rice, C. M., Lenches, E. M., Eddy, S. R., Shin, S. J., Sheets, R. L. & Strauss, J. H. (1985). Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229, 726–733.

Robertson, L. C., Prior, S., Apperson, C. S. & Irby, W. S. (1993). Bionomics of Anopheles quadrimaculatus and Culex erraticus (Diptera: Culicidae) in the Falls Lake basin, North Carolina: seasonal changes in abundance and gonotrophic status, and host-feeding patterns. Journal of Medical Entomology 30, 689–698.

Shilton, L. A., Altringham, J. D., Compton, S. G. & Whittaker, R. J. (1999). Old World fruit bats can be long-distance seed dispersers through extended retention of viable seeds in the gut. Proceedings of the Royal Society of London Series B Biological Sciences 266, 219–223.

Solomon, T., Dung, N. M., Vaughn, D. W., Kneen, R., Thao, L. T. T., Raengsakulrach, B., Loan, H. T., Day, N. P. J., Farrar, J., Myint, K. S. A., Warrell, M. J., James, W. S., Nisalak, A. & White, N. J. (2000). Neurological manifestations of dengue infection. Lancet 355, 1053–1059.

Strimmer, K. & von Haeseler, A. (1996). Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies. Molecular Biology and Evolution 13, 964–969.

Strode, G. K. (1951). Yellow Fever. New York: McGraw–Hill.

Swofford, D. L. (1999). PAUP*: Phylogenetic Analysis Using Parsimony (* and other methods), version 4.0. Sinauer Associates, Sunderland, MA, USA.

Tabachnick, W. J. (1991). Evolutionary genetics and arthropod-borne disease: the yellow fever mosquito. American Entomologist 37, 14–24.

Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22, 4673–4680.

Wang, E., Ni, H., Xu, R., Barrett, A. D., Watowich, S. J., Gubler, D. J. & Weaver, S. C. (2000). Evolutionary relationships of endemic/epidemic and sylvatic dengue viruses. Journal of Virology 74, 3227–3234.

Zanotto, P. M. A., Gao, G. F., Gritsun, T., Marin, M. S., Jiang, W. R., Venugopal, K., Reid, H. W. & Gould, E. A. (1995). An arbovirus cline across the northern hemisphere. Virology 210, 152–159.

Zanotto, P. M. A., Gould, E. A., Gao, G. F., Harvey, P. H. & Holmes, E. C. (1996). Population dynamics of flaviviruses revealed by molecular phylogenies. Proceedings of the National Academy of Sciences, USA 93, 548–553.

Received 28 February 2001; Accepted 25 April 2001
