Global Dispersal Pattern of HIV Type 1 Subtype CRF01_AE: A Genetic Trace of Human Mobility Related...

Accepted Manuscript

© The Author 2014. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved. For Permissions, please e-mail: [email protected].

Global dispersal pattern of HIV-1 CRF01_AE: A genetic trace of human

mobility related to heterosexual activities centralized in South-East Asia

Konstantinos Angelis1,†, Jan Albert2,3, Ioannis Mamais1,†, Gkikas Magiorkinis1,4,

Angelos Hatzakis1, Osamah Hamouda5, Daniel Struck6, Jurgen Vercauteren7,

Annemarie M.J. Wensing8, Ivailo Alexiev9, Birgitta Åsjö10, Claudia Balotta11, Ricardo J.

Camacho12, Suzie Coughlan13, Algirdas Griskevicius14, Zehava Grossman15, Andrzej

Horban16, Leondios G. Kostrikis17, Snjezana Lepej18, Kirsi Liitsola19, Marek Linka20,

Claus Nielsen21, Dan Otelea22, Roger Paredes23, Mario Poljak24, Elisabeth Puchhammer-Stöckl25,

Jean-Claude Schmit6, Anders Sönnerborg3,26, Danica Staneková27, Maja

Stanojevic28, Charles A.B. Boucher29, Lauren Kaplan3, Anne-Mieke Vandamme7,12 and

Dimitrios Paraskevis1,*

1Department of Hygiene, Epidemiology and Medical Statistics, Medical School, University

of Athens, Greece

2Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm,

Sweden

3Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden

4Department of Zoology, University of Oxford, United Kingdom

5Robert Koch-Institut, Berlin, Germany

6Centre de Recherche Public de la Sante, Luxembourg, Luxembourg

7Clinical and Epidemiological Virology, Rega Institute for Medical Research, Department of

Microbiology and Immunology, KU Leuven, Leuven, Belgium

8Department of Virology, University Medical Center, Utrecht, The Netherlands

9National Center of Infectious and Parasitic Diseases, Sofia, Bulgaria

10University of Bergen, Bergen, Norway

Journal of Infectious Diseases Advance Access published December 15, 2014. Downloaded from http://jid.oxfordjournals.org/ by guest on December 18, 2014.

11University of Milan, Milan, Italy

12Centro de Malária e Outras Doenças Tropicais and Unidade de Microbiologia, Instituto de

Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal

13University College Dublin, Dublin, Ireland

14Lithuanian AIDS Center, Vilnius, Lithuania

15Tel Aviv University, Tel Aviv, Israel

16Hospital of Infectious Diseases, Warsaw, Poland

17University of Cyprus, Nicosia, Cyprus

18Department of Molecular Diagnostics and Flow Cytometry, University Hospital for

Infectious Diseases "Dr. F. Mihaljevic", Zagreb, Croatia

19National Institute for Health and Welfare, Helsinki, Finland

20National Reference Laboratory of AIDS, National Institute of Health, Prague, Czech

Republic

21Statens Serum Institute, Copenhagen, Denmark

22National Institute for Infectious Diseases “Prof. Dr. Matei Bals”, Bucharest, Romania

23IrsiCaixa Foundation, Badalona, Spain

24Slovenian HIV/AIDS Reference Centre, University of Ljubljana, Faculty of Medicine,

Ljubljana, Slovenia

25University of Vienna, Vienna, Austria

26Divisions of Infectious Diseases and Clinical Virology, Karolinska Institute, Stockholm,

Sweden

27Slovak Medical University, Bratislava, Slovakia

28University of Belgrade Faculty of Medicine, Belgrade, Serbia

29Erasmus MC, University Medical Center, Rotterdam, The Netherlands

*Correspondence: Dimitrios Paraskevis, Department of Hygiene, Epidemiology and Medical

Statistics, Medical School, University of Athens, 75 Mikras Asias Street, 115 27 Athens,

Greece, Phone: +30 210 7462119, e-mail: [email protected]

†Contributed equally to the work

Abstract

Background: HIV-1 subtype CRF01_AE originated in Africa and then passed to Thailand

where it established a major epidemic. Despite the global presence of CRF01_AE, little is

known about its subsequent dispersal pattern.

Methods: We assembled a global dataset of 2,736 CRF01_AE sequences by pooling

sequences from public databases and patient-cohort studies. We estimated viral dispersal

patterns using statistical phylogeography run over bootstrap trees estimated by the maximum

likelihood (ML) method.

Results: We show that Thailand has been the source of viral dispersal to most areas

worldwide, including 17 out of 20 sampled countries in Europe. Japan, Singapore, Vietnam

and other Asian countries have played a secondary role in the viral dissemination. In contrast,

China and Taiwan have mainly imported infections from neighbouring Asian countries,

North America and Africa without any significant exporting transmissions.

Discussion: The central role of Thailand in the global spread of CRF01_AE can probably be

explained by the popularity of Thailand as a vacation destination characterized by sex

tourism and by Thai emigration to the Western world. Our study highlights the unique case of

CRF01_AE, the only globally distributed non-B clade whose global dispersal did not

originate in Africa.

Background

HIV-1 strains are divided into four major genetic groups (M, O, N and P), which represent four

separate transmissions of simian strains to humans [1]. Among them, group M is by far the

most prevalent and has established a global epidemic, while the other three are mainly

restricted to West and Central Africa. Group M is further classified into nine major genetic

subtypes named A-D, F-H, J and K and several recombinant forms [2, 3]. If a unique

recombinant form (URF) succeeds in being transmitted to several individuals, the URF is

called a circulating recombinant form (CRF). One of the most prevalent CRFs is

CRF01_AE, which is considered to have originated by recombination of a subtype A variant

and a putative extinct subtype E ancestor [4-6].

The CRF01_AE was first identified in female sex workers in northern Thailand in 1989

[7-9]. In the following years, phylogenetic studies showed that the CRF01_AE originated in

Central Africa, in the 1970s, and then migrated to Thailand, most likely through heterosexual

transmission [10]. Subsequently, CRF01_AE established a large-scale epidemic in South-East

Asia, accounting for 79% of HIV-1 infections there [11], and became one of the

successfully spreading recombinants [12, 13]. Currently, CRF01_AE has formed a nearly

global epidemic with viral strains reported in North America, Europe, Central and West

Africa, Asia and Australia and is estimated to represent approximately 5% of global HIV-1

infections [14]. The dispersal pattern after the transmission of CRF01_AE from Africa to

Thailand is largely unknown. Knowledge of viral dispersal pathways can inform the design of

interventions to prevent onward transmissions [15]. Previously published studies track the

mobility of the epidemic only on a local scale, within a country or within a small group of

neighbouring countries [16-21]. Larger studies, but still not on a global scale, have also been

reported [19]. To our knowledge, no previous study has traced the dispersal pattern of

CRF01_AE on a global scale. Here, we use a global dataset of 2,736 CRF01_AE sequences

aiming to uncover global transmission routes using a statistical phylogeographic analysis.

Methods

Data assembly

We downloaded non-European sequences from the Los Alamos HIV sequence database [22].

Several sequences from the same patient might be available in the database, due to patient

inclusion in follow-up studies. To avoid potential bias due to viral evolution driven by

antiretroviral therapy, we kept only the oldest available sequence per patient. Duplicate

sequences from the same country were discarded using the ElimDupes [23] online tool. The

European sequences come from the SPREAD (Strategy to Control Spread of HIV Drug

Resistance) surveillance programme [24]. Information on this dataset has been previously

published [24, 25]. The sequences come from most of the countries with an established

CRF01_AE epidemic, as shown in Table 1. For the European sample, some demographic

information (gender, transmission route and country of origin), obtained from standardized

questionnaires used in the SPREAD project, is reported in Table 2.
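
The two filtering steps described above (keeping the oldest sequence per patient, then discarding duplicates within a country) can be illustrated with a minimal sketch. It is not the procedure actually run with ElimDupes; the record fields (`patient_id`, `country`, `year`, `sequence`) and the toy data are hypothetical and serve only to show the logic.

```python
from typing import NamedTuple


class Record(NamedTuple):
    # Hypothetical field names, chosen for illustration only.
    patient_id: str
    country: str
    year: int
    sequence: str


def keep_oldest_per_patient(records):
    """Keep, for each patient, the record with the earliest sampling year."""
    oldest = {}
    for rec in records:
        current = oldest.get(rec.patient_id)
        if current is None or rec.year < current.year:
            oldest[rec.patient_id] = rec
    return list(oldest.values())


def drop_duplicates_within_country(records):
    """Discard identical sequences that come from the same country."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec.country, rec.sequence)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Toy data: patient p1 has two samples; p2 duplicates p1's sequence in the same country.
records = [
    Record("p1", "TH", 2001, "ACGT"),
    Record("p1", "TH", 2005, "ACGA"),
    Record("p2", "TH", 2003, "ACGT"),
    Record("p3", "JP", 2004, "ACGT"),
]
filtered = drop_duplicates_within_country(keep_oldest_per_patient(records))
```

In this toy example the identical sequence from Japan is retained, because duplicates are removed only within a country.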

Alignment

The protease (PR) and partial reverse transcriptase (RT) regions of the pol gene of the HIV-1

genome were used, since they are among the most frequently sequenced genomic regions. We

aligned the sequences using the ClustalW algorithm incorporated into the MEGA5 program

[26] and manually corrected the alignment according to the encoded reading frame. We

discarded nucleotides corresponding to certain amino acid positions in order to avoid

potential biases due to convergent evolution caused by antiretroviral therapy (see

Supplementary Information). The final alignment consisted of 2,736 sequences and 777

nucleotides.

Phylogenetic tree reconstruction

Phylogenetic reconstruction was performed under the general time reversible model of

nucleotide substitution with gamma-distributed rate heterogeneity (GTR + Γ4) among sites

[27, 28], using the maximum likelihood (ML) method as implemented in the RAxML

program [29]. To account for phylogenetic uncertainty, 300 ML bootstrap trees were also

inferred using the rapid bootstrapping algorithm of RAxML [30] as implemented in the

CIPRES project cluster [31].

Inference of migration events

For each bootstrap tree we estimated CRF01_AE migration events using the approach

described by Slatkin and Maddison [32] as implemented in the PAUP 4.0 program [33]. All

tips of the bootstrap trees were assigned a character (state) according to their geographic

origin (e.g. K, L, M for Austria, Belgium, Africa, respectively, etc.). The geographical

locations used in our study are described in Table 1. The algorithm reconstructs ancestral

states in the internal nodes of a tree using the parsimony criterion which minimizes the total

number of state changes.

When two branches from the same location (e.g. K and K) join each other the

reconstructed ancestral state is the same (e.g. K) and there is no migration. When the

branches come from different locations (e.g. K and L) the reconstructed state is ambiguous

and is assigned to be the union of the two characters [K, L] (Figure 4A, nodes 3, 6). This

implies a character change and thus a migration event is assigned. The ambiguity and thus the

direction of the migration may sometimes be resolved by the states of earlier nodes (Figure

4B, nodes 3, 6). In our analysis only unambiguous migration events were used.
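
The reconstruction logic described above can be sketched as follows. This is an illustrative Fitch-style parsimony pass on a small rooted binary tree, not the PAUP implementation used in the study (which operated on unrooted bootstrap trees). The nested-tuple tree encoding and the function names are assumptions made for the example. Ambiguous nodes are resolved only when the parent state is among their candidate states, and only unambiguous state changes are counted as migration events.

```python
def downpass(node):
    """Bottom-up pass: return (candidate state set, children) for each node.

    A leaf is a string naming its location; an internal node is a pair of subtrees.
    """
    if isinstance(node, str):
        return ({node}, [])
    left, right = (downpass(child) for child in node)
    common = left[0] & right[0]
    # Intersection if the children agree on a state, otherwise the union (ambiguous).
    return (common if common else left[0] | right[0], [left, right])


def migrations(tree, parent_state=None, events=None):
    """Top-down pass: list unambiguous (origin, destination) state changes."""
    events = [] if events is None else events
    states, children = tree
    if parent_state is not None and parent_state in states:
        state = parent_state             # ambiguity resolved by the earlier node
    elif len(states) == 1:
        state = next(iter(states))       # unambiguous on its own
    else:
        state = None                     # still ambiguous: no event is assigned
    if parent_state is not None and state is not None and state != parent_state:
        events.append((parent_state, state))
    for child in children:
        migrations(child, state, events)
    return events


# Toy tree with locations K and L: a single L tip nested among K tips.
tree = ((("K", "K"), "L"), ("K", "K"))
events = migrations(downpass(tree))

# A cherry of K and L alone is ambiguous, so no direction can be assigned.
ambiguous_events = migrations(downpass(("K", "L")))
```

In the first tree the ambiguous node [K, L] is resolved to K by the root, giving one K to L event; in the second, the direction cannot be determined and no event is recorded.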

Migration matrices

Once the migration events between any two regions were estimated from all the 300

bootstrap trees, they were stored in 300 matrices (one matrix per bootstrap tree), with the rows

and columns representing the area of origin and destination, respectively. In this way, a cell of a

matrix contains the number of estimated migration events from the area of the row to the area

of the column for a bootstrap tree. We summarized the medians of the events across the

bootstrap trees for all pathways in a single matrix (e.g. Supplementary Table 1).
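
As a minimal sketch of this summary step, the cell-wise median across per-tree matrices can be computed as below. The two-area example and the counts are hypothetical; rows index the area of origin and columns the destination, as in the text.

```python
from statistics import median


def median_matrix(matrices):
    """Cell-wise median over a list of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [median(m[i][j] for m in matrices) for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical bootstrap trees, two areas (index 0 = K, index 1 = L).
bootstrap_matrices = [
    [[0, 3], [1, 0]],
    [[0, 5], [0, 0]],
    [[0, 4], [2, 0]],
]
summary = median_matrix(bootstrap_matrices)
```

Here `summary[0][1]` is the median number of K to L events across the three trees.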

Steps of analysis

Migration events were estimated with two geographic grouping strategies. First,

sequences were grouped into large geographic areas (analysis A, see Supplementary

Information).

In a next step (analysis B), we attempted to identify more specific dispersal routes

between countries, rather than between large geographic areas. Consequently, we grouped

together sequences from the same country, with the exception of African sequences which

formed a single group and sequences from the Asian countries Hong Kong, South Korea,

Afghanistan, Iran, Pakistan, Indonesia, Malaysia, Myanmar and Philippines, which formed

another group called "Rest Asia" (see Table 1). This was done because of the small number

of available sequences from these Asian countries. Although for some European countries

(e.g. Cyprus, Serbia) the number of available sequences was small as well, we decided not to

merge them into a single group because transmission routes towards and from European

countries were of major interest.

Statistical phylogeography

In a randomly mixing population, an infected individual would have the same probability

of transmitting HIV-1 to any other uninfected individual. Thus, a random shuffling of taxa at the

tips of any phylogenetic tree would simulate a tree inferred from such a population. To

estimate which transmission pathways of our analyses are significant, we performed a

random shuffling of taxa at the tips of the 300 bootstrap trees, and then inferred the number

of migration events between any two areas. We call these events "expected migration events",

as they are expected to occur under the assumption of random mixing, in contrast to the

"observed events" inferred from the bootstrap trees before shuffling. We then compared the

distribution of the events inferred from the two sets of trees (before and after shuffling) for

any two areas, using the one-sided Mann-Whitney test. The level of significance (α = 5%)

was adjusted according to the Bonferroni correction for multiple comparisons. Pathways with

a statistically significantly higher number of observed events than expected were identified as

significant. In case two pathways connecting the same two areas (i.e. K → L and L → K) are

both significant, a bidirectional relationship is defined between the two areas.
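
The testing step for a single pathway can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the one-sided Mann-Whitney test is implemented with the normal approximation and a tie correction (no continuity correction), the observed and expected counts are invented, and the number of tested pathways used for the Bonferroni adjustment is hypothetical.

```python
import math
import random


def shuffled_labels(tip_labels, rng):
    """Randomly reassign location labels to tips (the random-mixing null)."""
    labels = list(tip_labels)
    rng.shuffle(labels)
    return labels


def mann_whitney_greater(x, y):
    """One-sided p-value for 'x tends to be larger than y'.

    Normal approximation with tie correction, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical event counts for one pathway across 8 bootstrap trees.
observed = [12, 14, 13, 15, 12, 16, 14, 13]   # before shuffling
expected = [5, 6, 4, 7, 5, 6, 5, 4]           # after shuffling tip labels

n_pathways = 6                 # hypothetical: 3 areas, 3 * 2 directed pathways
alpha = 0.05 / n_pathways      # Bonferroni-adjusted significance level
p_value = mann_whitney_greater(observed, expected)
significant = p_value < alpha

# Shuffling only permutes the labels; the label composition is unchanged.
permuted = shuffled_labels(["K", "K", "L", "L", "L"], random.Random(0))
```

With many pathways the Bonferroni-adjusted level becomes very small, so a large number of bootstrap trees is needed for any pathway to reach significance.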

The ratio (r) of the observed versus the expected number of migration events provides

information on the amount of viral flow between different areas that is not attributed to

chance. For example, a ratio r = 1.20 for a pathway would indicate that the observed viral

flow across the pathway is 20% higher than what is expected from chance (random mixing

population). An estimate of this ratio for a pathway can be obtained by dividing the mean of

observed migration events by the mean of expected. Such ratios were estimated for all

pathways, except those having 0 mean of expected migration events. Ratios estimated in this

way are sensitive to the mean of expected events, as a mean of expected events very close to

0 may produce high ratio values (see Supplementary Information).
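
A short sketch of the ratio computation, using invented counts, shows both the worked example (r = 1.20) and the sensitivity to a near-zero expected mean. The function name and the choice to return `None` for a zero expected mean are assumptions made for illustration.

```python
from statistics import mean


def migration_ratio(observed, expected):
    """Ratio of mean observed to mean expected events; None if expected mean is 0."""
    mean_expected = mean(expected)
    if mean_expected == 0:
        return None   # ratio is undefined; such pathways were excluded
    return mean(observed) / mean_expected


# Observed flow 20% higher than expected under random mixing.
r = migration_ratio([6, 6, 6], [5, 5, 5])

# A mean of expected events close to 0 inflates the ratio.
r_inflated = migration_ratio([3, 3], [0, 0.02])

# Pathways with a zero mean of expected events have no ratio.
r_undefined = migration_ratio([1, 2], [0, 0])
```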

Results

Phylogenetic Analysis

The best-scoring maximum likelihood (ML) tree suggests an extensive pattern of viral

dispersal (Figure 1), with Thai sequences highly dispersed across the tree and positioned

basal to several monophyletic clusters. In contrast, the majority of Chinese and Vietnamese

sequences form large monophyletic clusters (four Chinese and two Vietnamese clusters,

Figure 1). This indicates that the CRF01_AE epidemic in those countries originated from a

limited number of viral introductions, which subsequently spread locally. Although Thai

sequences are highly dispersed across the tree, there are only a few Thai strains within the

Chinese and Vietnamese clusters suggesting limited interaction between Thailand and these

countries after the first main introductions.

European sequences are also dispersed in the tree and lie mainly next to Thai sequences,

indicating extensive interaction between European and Thai strains. The Japanese sequences

and those from North America and from other sampled Asian countries (denoted with

"Others" in Figure 1) are also dispersed across the tree, highly mixed with Thai strains. The

majority of African sequences seem to form a single cluster (top left corner of Figure 1).

However, no safe deduction can be made concerning the level of interaction between African

viral strains and strains from the rest of the world due to the small African sample size.

Overall, the above findings indicate a high level of phylogenetic mixing between Thai and

European, Japanese, North American strains and strains from sampled Asian countries

excluding China and Vietnam.

Statistical Phylogeography

To statistically investigate the global mobility pattern of CRF01_AE, we estimated the

number of migration events between different countries and geographic areas. We then

compared them with the expected number of events under the null hypothesis of complete

geographic mixing, in order to infer significant dispersal pathways (see Methods).

Migration patterns with respect to countries

We grouped the sampled sequences with respect to country as described in Table 1

(analysis B). Results of this analysis revealed the striking role of Thailand and the secondary

role of Japan in the global CRF01_AE viral epidemic (Figure 2).

Specifically, Thailand has exported the epidemic to 17 of the 20 European countries

included in our sample, with the exception of Cyprus, Czech Republic and Serbia (Table 3,

Supplementary Table 1). The United Kingdom, Slovenia, Switzerland and Austria are among the

countries with the highest ratios (3.58, 3.26, 3.13, 2.99, respectively) of importing

transmissions from Thailand (Table 3). In addition, Canada and the United States in North

America, as well as Japan, Singapore, and Taiwan in Asia, have constituted migratory targets

of Thai strains. Significant viral spread was also detected from Thailand to other Asian

countries, grouped and named "Rest Asia" in our analysis, including Hong Kong, South

Korea, Afghanistan, Iran, Pakistan, Indonesia, Malaysia, Myanmar and Philippines. Despite

the extensive viral exporting network inferred for Thailand, a single source of viral import

from Africa was detected, in accordance with previous studies [10, 17, 19].

Japan also seems to have had an important role in viral dissemination. Japan has

exported the virus to several European countries (Austria, Denmark, Spain, Sweden and

Switzerland) as well as to Canada (Supplementary Table 1). These dispersal routes are

supported by high migration ratios (Supplementary Table 1). Singapore, Taiwan and "Rest

Asia" are also inferred as migratory targets for Japanese CRF01_AE strains. Our results also

suggest that Japan has imported viral strains from Spain, Singapore, "Rest Asia" and

Thailand (Supplementary Table 1). Consequently, Japan is a country with high bidirectional

transcontinental viral mobility.

Singapore, Vietnam and "Rest Asia" seem to have played a secondary role in viral

dissemination, as they interact (both as exporters and importers of the epidemic) with a smaller

number of countries mainly from Asia and Europe (Table 3, Supplementary Table 1). In

contrast, China and Taiwan have been migratory targets, having imported the epidemic

mainly from neighbouring Asian countries, and the United States and Africa, respectively,

without having any significant exporting pathways. Notably, we found significant migration

from Vietnam to China (Supplementary Table 1).

Africa seems to have been a major exporter, spreading the epidemic to

several European countries, such as Belgium, France, Italy, Sweden, Switzerland and Czech

Republic, to the United States as well as to Asian countries, such as Taiwan and Thailand

(Figure 2, Supplementary Table 1). Notably, no significant viral introduction into Africa was detected.

Viral mobility within Europe

Within Europe we found that Sweden and Finland seem to have played an important role

in disseminating the epidemic. Specifically, Sweden has exported viral strains to the

Netherlands, Norway, Russia and Finland (Supplementary Table 1). Notably, we found

significant export from Finland to Sweden as well, in accordance with previous studies [34, 35].

For Sweden, we found significant viral import from Thailand, Japan and Africa. Almost all

significant migratory pathways of Sweden are supported by high ratios of observed versus

expected number of events (Supplementary Table 1). Significant viral dispersal was also

detected from Germany to Poland and from Bulgaria to Czech Republic. Transmission routes

within Europe are illustrated in Figure 3.

To examine parameters potentially associated with the observed patterns of viral

dispersal, we examined the demographic characteristics of patients infected with CRF01_AE

in Europe. Among those with a known country of origin, 29.7% and 11.6% were from

Thailand and Vietnam, respectively. Importantly, 70.7% of individuals reported heterosexual

contact as the main risk of HIV transmission (Table 2). In contrast, men having sex with men

(MSM) was reported as a transmission risk for only 15.7% of the patients.

Discussion

The origin of the HIV-1 CRF01_AE in Africa several years ago and its spread to

Thailand provided the first events for the current CRF01_AE global epidemic [4]. However,

the subsequent pattern of viral spread has remained largely unknown. Our study provides a

global phylogeographic analysis of the CRF01_AE epidemic using the largest dataset

currently available.

Concerning the transcontinental patterns of epidemic spread, we make five

remarks. First, Africa has acted as a source of viral spread to several areas in Asia, Europe

and North America. Second, Thailand has shown the most extensive network with viral

dispersal to many countries in Europe, Asia and North America. Third, Japan and some other

Asian countries (e.g. Singapore, Vietnam) have had a secondary role in CRF01_AE

dissemination to Europe, Asia and North America. Fourth, China and Taiwan have had

monophyletic CRF01_AE epidemics. Fifth, Europe and North America have acted mainly as

sinks with import of CRF01_AE infections from several different countries.

Our results pinpoint the central role of Thailand in the global CRF01_AE epidemic. This

country has established an extensive network of intra- and transcontinental transmissions,

including neighbouring Asian countries as well as many European countries and North America

(Table 3, Figure 2). Thailand experienced a dramatic increase in HIV-1 infections in the early

nineties, when most infections were linked to sexual transmission from sex workers [36].

The proportion of new infections due to CRF01_AE, named subtype E at that time, increased

rapidly from 2.6% in 1988-1989 to 4% in 1992-1993 [13]. The situation was alarming and

Thailand responded early by implementing the "100 percent condom" program that

successfully reduced the spread of HIV-1 and other sexually transmitted diseases among sex

workers. Specifically, between 1991 and 2001, the number of new HIV-1 infections in

Thailand dropped from 143,000 per year to less than 14,000 [14, 36]. However, the

generalized epidemic of CRF01_AE among heterosexuals and sex workers, the large

numbers of female sex workers operating in the country, Thailand’s sex tourism

industry [5] and human emigration to Europe and North America may explain the extensive

viral export from the country. In addition, Thailand is a popular tourist destination for many

Europeans and has an important geographical position, as it is located at the crossroads of

Africa, Asia and Oceania, attracting many travellers.

Considerable viral mobility was also indicated for Japan with export mainly to Europe,

North America and some neighbouring Asian countries and import mainly from Asian

countries. Trading connections of Japan with many Asian, European and American countries

as well as extensive tourism may have contributed to this pattern. Singapore and Vietnam

seem to have had a secondary role in viral dissemination. Interestingly, CRF01_AE was

introduced in those countries mainly by sexual transmission [37, 38].

In contrast to Thailand and Japan, our results suggest that China and Taiwan are sinks,

importing CRF01_AE strains mainly from other Asian countries. Interestingly, Thailand is

not one of them despite its extensive exporting viral network. This finding differs from the

results of Abubakar et al. (2013) who reported close phylogenetic relationships and

transmissions between Thai, Chinese and Vietnamese CRF01_AE strains [19]. Abubakar et

al. used a limited number of sequences from countries other than Thailand, China and Vietnam (83 out of 1957

sequences, 4.2%) and based their results on a single ML tree and a molecular clock analysis

using BEAST [39]. Here, we used a large number of sequences from countries other than

Thailand, China and Vietnam (736 out of 2736, 26.9%) and used a statistical

phylogeographic approach to estimate viral dispersal. It is likely that this difference between

findings is due to differences in sampling and phylogenetic methodology. On the other hand,

we detected significant viral dissemination from Vietnam and Singapore to China, in

accordance with Abubakar et al. [19].

Despite the high prevalence of the CRF01_AE in almost all South-East and East Asian

countries, those do not show the same pattern of viral dispersal. This may be explained by

differences in human migration, tourism, commerce and factors related to the characteristics

of local epidemics in Asia. For example in Thailand and Japan the CRF01_AE epidemics

were mainly driven by heterosexuals [5, 14], while in China, Vietnam and Taiwan they were driven by

heterosexuals, MSM and injecting drug users (IDUs) [17, 36, 37, 40, 41]. Furthermore, in our sample of HIV-1

infected individuals in Europe, we found 46 of Thai origin. Although it is unknown

whether these individuals became infected before emigrating to Europe, the low prevalence

of CRF01_AE in Europe and the high prevalence in Thailand suggest that most of

these infections probably took place in Thailand. These factors appear to be the most plausible

explanations for the different role of Thailand in the global CRF01_AE epidemic compared

to other Asian countries.

The European CRF01_AE sub-epidemic seems to be a result of viral transmissions

mainly from Thailand, Japan and Africa (Supplementary Table 1). International tourism and

extensive immigration into Europe around the turn of the century might have contributed to

the observed transmission pattern. In particular, in Thailand the number of European tourists

visiting the country rose significantly in the past fifteen years, with travellers from the United

Kingdom, France, Italy, Switzerland, Russia, Denmark and Finland being among the most

frequent in 2002 [16]. Extended commercial and tourist connections between West European

countries and Japan might also favour transmissions. Notably, all European viral movement

from Japan detected in our analysis concerned Western (and not Eastern) European countries

where commercial and tourist relationships with Japan are more intense (Supplementary

Table 1).

Conclusions

Our study describes for the first time the global dispersal pattern of CRF01_AE,

highlighting the central role of Thailand as a major viral exporter to the Western world. The

key factors in this pattern might be Thailand’s popularity as a tourist destination, the

extensive network of commercial sex workers operating in the country and human migration.

The case of CRF01_AE is likely unique regarding endemicity and factors causing its global

dispersal pattern in contrast to other non-B infections mainly associated with immigration

from Africa. Africa provided the source for all globally prevalent non-B subtypes (A, C, D,

F) and CRFs. Currently, South-East Asia provides the only exception, as a non-African

source for the global dissemination of CRF01_AE. Notably, the spread of this CRF beyond

South-East Asia to distant Western countries probably occurred through heterosexual

activities. Our study highlights the value of phylogeographic analyses for a better

understanding of infectious disease epidemics and factors potentially associated with their

spread.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

DP designed the analysis and contributed to writing the manuscript. IM and KA performed

the analysis. KA, DP, GM and JA prepared the manuscript. All co-authors reviewed the

manuscript and contributed to data provision and interpretation of the results.

Funding

This work was supported by the European Community’s Seventh Framework Programme (FP7/2007-2013) under the project “Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN)”. GM is supported by the Medical Research Council. This work was supported in part by the AIDS Reference Laboratory of Leuven, which receives support from the Belgian Ministry of Social Affairs through a fund within the Health Insurance System, and by the Fonds voor Wetenschappelijk Onderzoek – Flanders (FWO) [PDO/11 to K.T., G069214N].

Previously presented information

This study has been accepted as a late-breaker presentation at the 20th International AIDS Conference in Melbourne.

Figure legends

Figure 1. Best-scoring unrooted ML tree estimated from RAxML using 2,736 HIV-1

CRF01_AE sequences from 46 countries around the globe. Tree branches are coloured

according to sampling origin. Different colours were used for sequences from Europe, Africa,

China, Thailand, Vietnam, Japan and from all other regions.

Figure 2. Significant importing and exporting pathways around the globe. Countries and

geographic regions with at least one significant pathway are illustrated with different colours.

The same colour (light blue) was used for all European countries and significant pathways

within Europe are not reported for clarity of the figure. Asian countries clustered as "Rest

Asia" and Singapore are not reported for the same reason. The direction and thickness of the

arrows show the direction and level (based on the ratios of Supplementary Table 1) of

CRF01_AE migration, respectively, across the transmission routes.

Figure 3. Significant importing and exporting pathways within Europe. Sampling countries

are in blue. The direction and thickness of the arrows show the direction and level (based on

the ratios of Supplementary Table 1) of CRF01_AE migration, respectively, across the

transmission routes.

Figure 4. An example of how the parsimony algorithm is used to compute migration events

on an unrooted phylogeny of 8 taxa, sampled from 2 different geographic regions. Tips are

labeled with characters K or L which correspond to the geographic origin and internal nodes

are numbered. Ancestral states are recursively assigned to the internal nodes according to the

majority rule criterion. A: For example, the ancestral state at node 1 is K because the two

joined branches have the same character K. Similarly for node 2. When the branches come

from different regions (K and L), the ancestral state is set to the union of the two characters [K, L], as in node 3. This implies that a character change has occurred, and thus a migration event is assigned. However, because the state of the node is ambiguous (either K or L), the direction of the migration is undetermined. The ambiguity, and thus the direction of migration, may be resolved from the states of neighboring nodes using the parsimony criterion. B: In this case, because the state of node 4 is [K], the ambiguity of node 3 is resolved to [K], and thus the migration is from region K to L (denoted by → in B). In this example, both ambiguities in nodes 3 and 6 are resolved under the parsimony criterion to state K. However, ambiguities cannot always be resolved, especially when taxa are sampled from many different locations.
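The two passes described in the legend of Figure 4 can be sketched in Python as follows. This is a minimal illustration of Fitch-style parsimony, not the authors' implementation. The tree topology, the tip names, the arbitrary rooting used for the traversal, the alphabetical tie-break, and all function names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical rooted binary tree: a leaf is (tip_name, region), an internal
# node is a pair of subtrees. This topology is illustrative only and is not
# the tree drawn in Figure 4.
TREE = (
    ((("t1", "K"), ("t2", "K")), (("t3", "K"), ("t4", "L"))),
    ((("t5", "K"), ("t6", "K")), (("t7", "L"), ("t8", "L"))),
)


def is_leaf(node):
    return isinstance(node[0], str)


def bottom_up(node):
    """First pass: give every node a set of candidate regions."""
    if is_leaf(node):
        return {"states": {node[1]}, "children": []}
    left, right = (bottom_up(child) for child in node)
    shared = left["states"] & right["states"]
    # Children agree: keep the shared region(s). Otherwise keep the union,
    # which marks the node as ambiguous.
    states = shared if shared else left["states"] | right["states"]
    return {"states": states, "children": [left, right]}


def top_down(node, parent_state, events):
    """Second pass: resolve ambiguity from the parent and count migrations."""
    if parent_state in node["states"]:
        state = parent_state
    else:
        # The node cannot keep the parent's region (or it is the root).
        # Assumption: if several candidates remain, pick one alphabetically.
        state = min(node["states"])
    if parent_state is not None and state != parent_state:
        events[(parent_state, state)] += 1  # migration from parent to child
    for child in node["children"]:
        top_down(child, state, events)
    return events


def count_migrations(tree):
    return dict(top_down(bottom_up(tree), None, Counter()))


print(count_migrations(TREE))
```

For this toy tree, the two ambiguous internal nodes are both resolved to K by their parent, so the sketch reports two migrations from K to L. With more than two regions, some nodes can stay ambiguous after the second pass, and the tie-break used here would then assign a direction that the data do not support.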

Tables

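Table 1

Number of sequences per country and grouping strategies

| Country | N | Grouping in A | N_A | Grouping in B | N_B |
|---|---|---|---|---|---|
| Bulgaria (BGR) | 19 | Central-East Europe | 83 | BGR | 19 |
| Cyprus (CYP) | 1 | Central-East Europe | | CYP | 1 |
| Czech Republic (CZE) | 53 | Central-East Europe | | CZE | 53 |
| Poland (POL) | 4 | Central-East Europe | | POL | 4 |
| Russia (RUS) | 3 | Central-East Europe | | RUS | 3 |
| Serbia (SRB) | 1 | Central-East Europe | | SRB | 1 |
| Slovenia (SVN) | 2 | Central-East Europe | | SVN | 2 |
| Austria (AUT) | 21 | West Europe | 255 | AUT | 21 |
| Belgium (BEL) | 14 | West Europe | | BEL | 14 |
| Denmark (DNK) | 28 | West Europe | | DNK | 28 |
| Finland (FIN) | 16 | West Europe | | FIN | 16 |
| France (FRA) | 9 | West Europe | | FRA | 9 |
| Germany (DEU) | 18 | West Europe | | DEU | 18 |
| Italy (ITA) | 10 | West Europe | | ITA | 10 |
| Netherlands (NLD) | 3 | West Europe | | NLD | 3 |
| Norway (NOR) | 8 | West Europe | | NOR | 8 |
| Spain (ESP) | 5 | West Europe | | ESP | 5 |
| Sweden (SWE) | 93 | West Europe | | SWE | 93 |
| Switzerland (CHE) | 28 | West Europe | | CHE | 28 |
| United Kingdom (GBR) | 2 | West Europe | | GBR | 2 |
| United States (USA) | 16 | North America | 20 | USA | 16 |
| Canada (CAN) | 4 | North America | | CAN | 4 |
| Australia (AUS) | 8 | AUS | 8 | AUS | 8 |
| Angola (ANG) | 1 | Africa | 20 | Africa | 20 |
| Cameroon (CMR) | 6 | Africa | | Africa | |
| Central African Republic (CAF) | 6 | Africa | | Africa | |
| Chad (TCD) | 1 | Africa | | Africa | |
| Democratic Republic of the Congo (COD) | 3 | Africa | | Africa | |
| Gabon (GAB) | 1 | Africa | | Africa | |
| Mali (MLI) | 1 | Africa | | Africa | |
| Senegal (SEN) | 1 | Africa | | Africa | |
| China (CHN) | 723 | East Asia | 927 | CHN | 723 |
| Japan (JPN) | 167 | East Asia | | JPN | 167 |
| Taiwan (TWN) | 33 | East Asia | | TWN | 33 |
| Hong Kong (HKG) | 1 | East Asia | | Rest Asia | 19 |
| South Korea (KOR) | 3 | East Asia | | Rest Asia | |
| Afghanistan (AFG) | 1 | Central Asia | 3 | Rest Asia | |
| Iran (IRN) | 1 | Central Asia | | Rest Asia | |
| Pakistan (PAK) | 1 | Central Asia | | Rest Asia | |
| Indonesia (IDN) | 2 | South-East Asia | 1420 | Rest Asia | |
| Malaysia (MYS) | 5 | South-East Asia | | Rest Asia | |
| Myanmar (MMR) | 1 | South-East Asia | | Rest Asia | |
| Philippines (PHL) | 4 | South-East Asia | | Rest Asia | |
| Singapore (SGP) | 131 | South-East Asia | | SGP | 131 |
| Thailand (THA) | 692 | South-East Asia | | THA | 692 |
| Vietnam (VNM) | 585 | South-East Asia | | VNM | 585 |
| Total | 2736 | | 2736 | | 2736 |

Note.− N is the number of available sequences per country in the dataset. N_A and N_B describe the number of sequences in each group in analyses A and B, respectively. In each analysis the grouping of countries was performed as described by the third and fifth columns. Text in brackets refers to countries' 3-letter codes according to the United Nations.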
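The two grouping strategies of Table 1 can be written down as a small lookup structure, which also makes it easy to check that the group totals add up. The sketch below is only a transcription aid under stated assumptions: the counts are copied from Table 1, while the names `GROUPS_A`, `POOLED_B`, `totals_a` and `totals_b` are hypothetical.

```python
# Sequence counts per country, keyed by the analysis-A group (from Table 1).
GROUPS_A = {
    "Central-East Europe": {"BGR": 19, "CYP": 1, "CZE": 53, "POL": 4,
                            "RUS": 3, "SRB": 1, "SVN": 2},
    "West Europe": {"AUT": 21, "BEL": 14, "DNK": 28, "FIN": 16, "FRA": 9,
                    "DEU": 18, "ITA": 10, "NLD": 3, "NOR": 8, "ESP": 5,
                    "SWE": 93, "CHE": 28, "GBR": 2},
    "North America": {"USA": 16, "CAN": 4},
    "AUS": {"AUS": 8},
    "Africa": {"ANG": 1, "CMR": 6, "CAF": 6, "TCD": 1, "COD": 3,
               "GAB": 1, "MLI": 1, "SEN": 1},
    "East Asia": {"CHN": 723, "JPN": 167, "TWN": 33, "HKG": 1, "KOR": 3},
    "Central Asia": {"AFG": 1, "IRN": 1, "PAK": 1},
    "South-East Asia": {"IDN": 2, "MYS": 5, "MMR": 1, "PHL": 4,
                        "SGP": 131, "THA": 692, "VNM": 585},
}

# Countries pooled in analysis B; every other country forms its own group.
POOLED_B = {
    "Africa": set(GROUPS_A["Africa"]),
    "Rest Asia": {"HKG", "KOR", "AFG", "IRN", "PAK",
                  "IDN", "MYS", "MMR", "PHL"},
}


def totals_a():
    """Number of sequences per group in analysis A."""
    return {group: sum(counts.values()) for group, counts in GROUPS_A.items()}


def totals_b():
    """Number of sequences per group in analysis B."""
    pooled = {code: group for group, codes in POOLED_B.items() for code in codes}
    totals = {}
    for counts in GROUPS_A.values():
        for code, n in counts.items():
            group = pooled.get(code, code)  # unpooled countries keep their code
            totals[group] = totals.get(group, 0) + n
    return totals


print(totals_a())
print(totals_b()["Rest Asia"], sum(totals_b().values()))
```

Analysis A pools countries into broad regions, whereas analysis B keeps most countries separate and pools only the African countries and the sparsely sampled Asian countries.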

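Table 2

Characteristics of European patients with known demographic record

| Characteristic | Subgroup | Category | N | % |
|---|---|---|---|---|
| Gender | | Male | 100 | 64.10 |
| Gender | | Female | 56 | 35.90 |
| Country of origin | West Europe | Austria | 13 | 8.39 |
| Country of origin | West Europe | Belgium | 3 | 1.94 |
| Country of origin | West Europe | Denmark | 8 | 5.16 |
| Country of origin | West Europe | Finland | 13 | 8.39 |
| Country of origin | West Europe | Germany | 9 | 5.81 |
| Country of origin | West Europe | Netherlands | 1 | 0.65 |
| Country of origin | West Europe | Norway | 1 | 0.65 |
| Country of origin | West Europe | Sweden | 24 | 15.48 |
| Country of origin | Central-East Europe | Bulgaria | 1 | 0.65 |
| Country of origin | Central-East Europe | Czech Republic | 4 | 2.58 |
| Country of origin | Central-East Europe | Poland | 2 | 1.29 |
| Country of origin | Central-East Europe | Serbia | 2 | 1.29 |
| Country of origin | South America | Colombia | 1 | 0.65 |
| Country of origin | Africa | Cameroon | 1 | 0.65 |
| Country of origin | Africa | Ethiopia | 1 | 0.65 |
| Country of origin | Africa | Nigeria | 1 | 0.65 |
| Country of origin | Africa | Tunisia | 1 | 0.65 |
| Country of origin | Asia | China | 2 | 1.29 |
| Country of origin | Asia | Thailand | 46 | 29.68 |
| Country of origin | Asia | Vietnam | 18 | 11.61 |
| Country of origin | Asia | Myanmar | 1 | 0.65 |
| Country of origin | Asia | Indonesia | 1 | 0.65 |
| Country of origin | Asia | Iraq | 1 | 0.65 |
| Infection route | Patients with any origin | Hete^a | 99 | 70.71 |
| Infection route | Patients with any origin | MSM^b | 22 | 15.71 |
| Infection route | Patients with any origin | IDU^c | 19 | 13.57 |
| Infection route | Patients with Thai origin | Hete | 36 | 78.27 |
| Infection route | Patients with Thai origin | MSM | 6 | 13.04 |
| Infection route | Patients with Thai origin | IDU | 1 | 2.17 |
| Infection route | Patients with Thai origin | Nwa^d | 3 | 6.52 |

Note.− Country of origin refers to the actual origin of the patient and not to the sampling country. The infection route refers to the cause of infection, according to patients' responses. ^a heterosexual; ^b men who have sex with men; ^c injection drug users; ^d not willing to answer.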

Table 3

Medians of observed (outside brackets) and expected (in brackets) migration events and ratio of mean observed over mean expected events for Thailand as an exporter, inferred from analysis B.

| To (from THA) | Median | Ratio |
|---|---|---|
| AUS | 2 (1) | 1.14 |
| AUT | 11 (4) | 2.99 |
| BEL | 7 (2) | 2.96 |
| BGR | 6 (3) | 1.77 |
| CAN | 2 (1) | 3.17 |
| CHE | 15 (4) | 3.13 |
| CHN | 53 (71) | 0.74 |
| CYP | 0 (0) | 0.01 |
| CZE | 8 (9) | 0.91 |
| DEU | 9 (3) | 2.72 |
| DNK | 14 (5) | 2.82 |
| ESP | 1 (1) | 1.39 |
| FIN | 7 (3) | 2.40 |
| FRA | 4 (1) | 2.64 |
| GBR | 1 (0) | 3.58 |
| ITA | 3 (2) | 1.82 |
| JPN | 65 (27) | 2.42 |
| NLD | 1 (0) | 2.66 |
| NOR | 3 (1) | 2.51 |
| POL | 1 (1) | 1.73 |
| RUS | 1 (0) | 2.33 |
| SGP | 59 (21) | 2.74 |
| SVN | 1 (0) | 3.26 |
| SRB | 0 (0) | 1.18 |
| SWE | 35 (15) | 2.29 |
| TWN | 15 (6) | 2.71 |
| USA | 8 (3) | 2.75 |
| VNM | 22 (66) | 0.33 |
| Rest Asia | 8 (3) | 2.53 |
| Africa | 4 (3) | 1.21 |

Note.− Cells in bold indicate statistically significant pathways under the null hypothesis of random mixing population. Countries' codes are according to Table 1. This table is part of Supplementary Table 1 and summarizes results for Thailand as an exporter.
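Table 3 reports medians next to a ratio of means, so the ratio cannot be obtained by dividing the two medians (GBR, for example, shows 1 (0) with a ratio of 3.58). The short sketch below illustrates the difference. It assumes that each pathway has a list of observed counts and a list of expected counts over some set of replicates; the replicate scheme is not described in this section, and the function name and the numbers are invented for illustration.

```python
from statistics import mean, median


def summarize_pathway(observed, expected):
    """Return (median observed, median expected, mean observed / mean expected)."""
    mean_expected = mean(expected)
    # Guard against division by zero when no event is ever expected.
    ratio = mean(observed) / mean_expected if mean_expected > 0 else float("nan")
    return median(observed), median(expected), ratio


# Hypothetical per-replicate event counts, invented for illustration only.
observed_counts = [1, 1, 2, 0, 1]
expected_counts = [0, 0, 1, 0, 0]

med_obs, med_exp, ratio = summarize_pathway(observed_counts, expected_counts)
print(f"{med_obs} ({med_exp}) ratio={ratio:.2f}")
```

In this toy case the medians are 1 and 0, yet the ratio of means is well above 1, which is the same pattern as the low-count rows of Table 3.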
