Molecular and Structural Characterization of Dissolved Organic Matter from the Deep Ocean by...

8
Molecular and structural characterization of HIV-1 subtype B Brazilian isolates with GWGR tetramer at the tip of the V3-loop Élcio Leal , Wilson P. Silva, Maria C. Sucupira, L. Mário Janini, Ricardo S. Diaz Federal University of São Paulo, Rua Pedro de Toledo, 781-16° andar, Vila Clementino.São PauloSP.CEP 04039 032, Brazil abstract article info Article history: Received 6 June 2008 Returned to author for revision 30 June 2008 Accepted 19 August 2008 Available online 23 September 2008 Keywords: HIV-1 V3 loop GWGR Coreceptor CCR5 CXCR4 3D-structure One of most intriguing features of the HIV-1 subtype B epidemic in Brazil is the high frequency of isolates exhibiting tryptophan (W) in the tetramer (GWGR) at the tip of the V3 loop. We observed that the frequencies of glutamic and aspartic acids at site 25 of the V3 loop are quite distinct in GWGR isolates compared with viruses with other tetramers. The basic amino acids at sites 11 and 25 of V3 are strongly linked with CCR5-to-CXCR4 coreceptor shift. We therefore predicted phenotype usage and found that GWGR isolates are exclusively CCR5-using. Further evidence of this came from intrahost sequences, where basic amino acid substitutions at sites 11 and 25 emerged only in isolates presenting a tryptophan-to-glycine replacement at the tetramer of the V3. In addition, modeled 3D-structures of the V3 loop of GWGR and GGGR in intrahost viruses differ essentially in the binding region of the coreceptor. © 2008 Elsevier Inc. All rights reserved. Introduction Infection with HIV-1 is primarily seen in CD4+ T cells and macrophages, although other cell types can also be infected. The cellular tropism is essentially determined by a series of interactions between the viral envelope glycoprotein (Env) and cell receptors. To enter target cells, HIV-1 typically needs to recognize and bind two membrane receptors. These events are perhaps the most crucial for HIV-1 infection, initially involving the interaction of the envelope with the main receptor (CD4) and then with chemokine coreceptors (Berger et al., 1999; Boyd et al., 1993; Wu et al., 1996). The envelope is a noncovalently associated trimeric glycoprotein composed of surface (gp120) and transmembrane (gp41) subunits (Roux and Taylor, 2007). The envelope interaction with the cell surface causes conformational changes in the viral glycoprotein. It is thought that, following the binding of gp120 to CD4, conformational modications enable the gp41 subunit to interact with the cell membrane, thereby allowing the virus to enter the host cell (Wei et al., 2003). The V3 region of gp120 plays a fundamental role in various characteristics of HIV. For example, upon HIV infection, there is an immunodominant antibody response directed primarily against the V3 region (Frost et al., 2005). In addition to viruscell binding, the V3 region also determines which of two chemokine coreceptors (CCR5 or CXCR4) will be exploited for entry (Choe et al., 1996; Coakley et al., 2005; Rizzuto et al., 1998). In roughly 50% of all infected individuals, the virus changes its chemokine receptor usage during the progression of HIV-1 infection (Boyd et al., 1993; Schuite- maker et al., 1992). In the early phase of HIV-1 infection, the viruses isolated are typically the R5 variants, which present tropism for the CCR5 receptors, whereas the X4 variants, which preferentially use the CXC4 receptors, emerge later in the infection. In the late phase of HIV-1 infection, the intrahost viral population can present a variety of compositions; it can be composed exclusively of X4 variants, it can consist of variants capable of using both coreceptors (R5X4 variants), or it can present equal numbers of the R5 and X4 variants (Kuiken, 1999; Scarlatti et al., 1997; Shioda et al., 1991). In European and North American countries, HIV-1 infection has been well characterized and involves a predominance of subtype B strains. In Brazil, subtype B also prevails in the HIV-1 epidemic, although nearly half of the infections caused by the subtype B strain are dueto viruses with the unusual GWGR motif in the V3 loop of the env gene. Although GW viruses have sporadically been observed in other countries, such as China, France, the Czech Republic, the Philippines, and Cuba, high frequencies of GW isolates are seen only in Brazil. This suggests that the presence of tryptophan (W) at the tip of the V3 loop, although rare in other countries, is not deleterious to HIV-1. Previous studies have demonstrated that the GW and GP viruses are not phylogenetically distinguishable (Kuiken, 1999; Meng et al., 2002; Morgado et al., 1994; Potts et al., 1993; Santoro-Lopes et al., 2000), whereas the HIV-1 subtype B found in Thailand differs from the subtype B isolates of other regions in terms of their tetramers (Kuiken, 1999). Nevertheless, the evolutionary mechanisms responsible for the fact Virology 381 (2008) 222229 Corresponding author. Fax: +55 11 5081 5394/5571 2130/5579 8226. E-mail address: [email protected] (É. Leal). 0042-6822/$ see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.virol.2008.08.029 Contents lists available at ScienceDirect Virology journal homepage: www.elsevier.com/locate/yviro

Transcript of Molecular and Structural Characterization of Dissolved Organic Matter from the Deep Ocean by...

Virology 381 (2008) 222–229

Contents lists available at ScienceDirect

Virology

j ourna l homepage: www.e lsev ie r.com/ locate /yv i ro

Molecular and structural characterization of HIV-1 subtype B Brazilian isolates withGWGR tetramer at the tip of the V3-loop

Élcio Leal ⁎, Wilson P. Silva, Maria C. Sucupira, L. Mário Janini, Ricardo S. DiazFederal University of São Paulo, Rua Pedro de Toledo, 781-16° andar, Vila Clementino.São Paulo–SP.CEP 04039 032, Brazil

⁎ Corresponding author. Fax: +55 11 5081 5394/5571E-mail address: [email protected] (É. Leal).

0042-6822/$ – see front matter © 2008 Elsevier Inc. Aldoi:10.1016/j.virol.2008.08.029

a b s t r a c t

a r t i c l e i n f o

Article history:

One of most intriguing feat Received 6 June 2008Returned to author for revision30 June 2008Accepted 19 August 2008Available online 23 September 2008

Keywords:HIV-1V3 loopGWGRCoreceptorCCR5CXCR43D-structure

ures of the HIV-1 subtype B epidemic in Brazil is the high frequency of isolatesexhibiting tryptophan (W) in the tetramer (GWGR) at the tip of the V3 loop. We observed that thefrequencies of glutamic and aspartic acids at site 25 of the V3 loop are quite distinct in GWGR isolatescompared with viruses with other tetramers. The basic amino acids at sites 11 and 25 of V3 are stronglylinked with CCR5-to-CXCR4 coreceptor shift. We therefore predicted phenotype usage and found that GWGRisolates are exclusively CCR5-using. Further evidence of this came from intrahost sequences, where basicamino acid substitutions at sites 11 and 25 emerged only in isolates presenting a tryptophan-to-glycinereplacement at the tetramer of the V3. In addition, modeled 3D-structures of the V3 loop of GWGR and GGGRin intrahost viruses differ essentially in the binding region of the coreceptor.

© 2008 Elsevier Inc. All rights reserved.

Introduction

Infection with HIV-1 is primarily seen in CD4+ T cells andmacrophages, although other cell types can also be infected. Thecellular tropism is essentially determined by a series of interactionsbetween the viral envelope glycoprotein (Env) and cell receptors. Toenter target cells, HIV-1 typically needs to recognize and bind twomembrane receptors. These events are perhaps the most crucial forHIV-1 infection, initially involving the interaction of the envelope withthe main receptor (CD4) and thenwith chemokine coreceptors (Bergeret al., 1999; Boyd et al., 1993; Wu et al., 1996). The envelope is anoncovalently associated trimeric glycoprotein composed of surface(gp120) and transmembrane (gp41) subunits (Roux and Taylor, 2007).The envelope interaction with the cell surface causes conformationalchanges in the viral glycoprotein. It is thought that, following thebinding of gp120 to CD4, conformationalmodifications enable the gp41subunit to interact with the cell membrane, thereby allowing the virusto enter the host cell (Wei et al., 2003). The V3 region of gp120 plays afundamental role in various characteristics of HIV. For example, uponHIV infection, there is an immunodominant antibody response directedprimarily against the V3 region (Frost et al., 2005). In addition to virus–cell binding, the V3 region also determines which of two chemokinecoreceptors (CCR5 or CXCR4) will be exploited for entry (Choe et al.,1996; Coakley et al., 2005; Rizzuto et al., 1998). In roughly 50% of all

2130/5579 8226.

l rights reserved.

infected individuals, the virus changes its chemokine receptor usageduring the progression of HIV-1 infection (Boyd et al., 1993; Schuite-maker et al., 1992). In the early phase of HIV-1 infection, the virusesisolated are typically the R5 variants, which present tropism for theCCR5 receptors, whereas the X4 variants, which preferentially use theCXC4 receptors, emerge later in the infection. In the late phase of HIV-1infection, the intrahost viral population can present a variety ofcompositions; it can be composed exclusively of X4 variants, it canconsist of variants capable of using both coreceptors (R5X4 variants), orit can present equal numbers of the R5 and X4 variants (Kuiken, 1999;Scarlatti et al., 1997; Shioda et al., 1991).

In European and North American countries, HIV-1 infection hasbeen well characterized and involves a predominance of subtype Bstrains. In Brazil, subtype B also prevails in the HIV-1 epidemic,although nearly half of the infections caused by the subtype B strain aredue to viruses with the unusual GWGR motif in the V3 loop of the envgene. Although GW viruses have sporadically been observed in othercountries, such as China, France, the Czech Republic, the Philippines,and Cuba, high frequencies of GW isolates are seen only in Brazil. Thissuggests that the presence of tryptophan (W) at the tip of the V3 loop,although rare in other countries, is not deleterious to HIV-1. Previousstudies have demonstrated that the GW and GP viruses are notphylogenetically distinguishable (Kuiken, 1999; Meng et al., 2002;Morgado et al., 1994; Potts et al., 1993; Santoro-Lopes et al., 2000),whereas theHIV-1 subtype B found inThailanddiffers from the subtypeB isolates of other regions in terms of their tetramers (Kuiken, 1999).Nevertheless, the evolutionary mechanisms responsible for the fact

Fig. 1. Amino acid frequencies in the V3 loop of HIV-1. Consensus sequences of GW viruses (panel A) and viruses with other motifs (i.e., GP, GF, GL, etc) at the V3 loop (panel B) arewithin the dark grey area, and the V3 loop tetramer is underlined. Below each nonconserved site, the specific frequencies of the most common residues are depicted. The alternativeamino acids are indicated above the sites of consensus sequences; the number above each residue represents its occurrence in the data set. The numbers inside the parenthesesindicate sites 11 and 25, which are important for the R5-to-X4 coreceptor shift. Arrows indicate sites where the two consensuses differ.

able 1requencies of aspartic (D) and glutamic (E) acids at site 25 of the V3 loop of HIV-1

Fisher's two-tailed probability calculated in 2×2 contingency tables.These sequences together with their coreceptor preference status were obtained frome HIV Sequence Database (http://hiv-web.lanl.gov).

223É. Leal et al. / Virology 381 (2008) 222–229

that GW and GP viruses are found in equal proportions in Brazilianepidemics have yet to be explored. Therefore, we focused on diverseaspects of the evolution of the GWand GP isolates of subtype B virusesco-circulating in Brazil during a specific period (1983–2000).

We found that the frequencies of amino acids at sites of the V3 looplinked with CCR5-to-CXCR4 shift are quite distinct between the GWand GP viruses. Additionally, GW viruses may use the CCR5 coreceptorexclusively, according to the predicted phenotype based on the V3sequences. Lastly, the modeled 3-D structure of the V3 loop of the GWand GG variants from intrahost samples differs mostly in the bindingregion of the coreceptor.

Results

Amino acid composition at the V3 loop

In order to characterize HIV-1 variants, we analyzed the amino acidsubstitutions in the V3 loop on a site-by-site basis. Fig. 1 depicts theconsensus sequences and the alternative amino acids obtained fromthe alignments of GWGR viruses and subtype B viruses with othertetramers at the tip of V3 loop (e.g., GPGR, GFGR, GGGR, etc). Inaddition to the second V3 tetramer site, consensuses also differed atsites 14 and 25 (arrows in Fig. 1). Although a limited number ofsequences were analyzed, both viruses presented an extensivenumber of amino acid substitutions throughout the V3 region. Thiswas expected, since the V3 region has been characterized as one of themost diverse regions of the HIV-1 genome. Most of the sites werehighly variable, although a great number of the substitutions observedwere due to singleton mutations. Equally, in GWGR viruses andsubtype B isolates with other tetramers (GXGX), many V3 sitescontained a variety of single amino acids. Notably, some amino acidspresented at high frequencies, whereas others were unique substitu-tions. For example, at site 29, the aspartic acid-to-asparaginesubstitution reached a high frequency in both viruses, while other

substitutions, such as aspartic acid-to-glutamic acid, were lesscommon. This is probably due to the genetic barrier of the ancestralsequence, since replacement of aspartic acid to asparagine involvesonly one G-to-A transition while the replacement to glutamic acidrequires at least one transversion (i.e., GAT-to-GGA or GAC-to-GAA/GAG). In contrast, at site 14, the proportions of methionine, leucine,and isoleucine were, respectively, 61%, 29%, and 10% in GWGR viruses,compared with 29%, 13%, and 54% in GXGX viruses. Although thesubstitution methionine-to-isoleucine involves only one transition atthe third site of the triplet (i.e., ATG-to-ATA), the genetic barriercannot be evoked to explain why the most frequent amino acid at site14 of the V3 loop of GWGR viruses is methionine, while in GXGXviruses, it is isoleucine (Fig. 1).

Proportions of aspartic and glutamic acid at site 25 of the V3 loop

The most noticeable dissimilarity between the GWGR and GXGXviruses was in the proportions of aspartic acid (D) and glutamic acid

TF

#⁎

th

Fig. 2. Prediction of coreceptor usage of HIV-1. The medians of PSSM scores of the GWGR (n=41), GXGX (n=78), R5 (n=268), and X4 (n=56) data sets are depicted as filled squares.Boxes depict 25–75% intervals of PSSM scores values. Higher PSSM scores are indicative of CXCR4-using viruses. The R5 and X4 sequences and their coreceptor preference status wereobtained from the HIV Sequence Database (http://hiv-web.lanl.gov).

224 É. Leal et al. / Virology 381 (2008) 222–229

(E) at site 25 of the V3 loop (Table 1). The presence of negative aminoacids at site 25 has been implicated in the better replicative fitness ofHIV-1 R5-using strains (da Silva, 2006; Hung et al., 1999). Conse-quently, for comparative purposes we also analyzed 268 sequences ofR5-using viruses and 56 sequences of X4-using viruses with knowncoreceptor status obtained from the Los Alamos HIV database (http://hiv-web.lanl.gov). Likewise, a great variety of amino acids wereobserved at site 25 of the HIV-1 V3 loop, and, in most viruses, asparticacid and glutamic acid was present in high proportions. Equivalentquantities of the ratio of aspartic acid to glutamic acid (D:E) wereobserved between the GXGX and X4 viruses (p=0.629, Table 1).However, the overall quantity of aspartic acid plus glutamic acid (D+E)in the GXGX viruses was roughly 73%, whereas in the X4 viruses itcomprised only 41% of the amino acids found at site 25. The fact thatthe quantities of aspartic acid and glutamic acid observed in the GXGXvirus and the X4 variant were similar (1:1) might indicate that the twoacids have a comparable impact on the fitness of HIV-1. On the otherhand, in the GWGR and R5 viruses there is an imbalance in theproportion of aspartic and glutamic acids at site 25 of the V3 loop. Forexample, the quantity of glutamic acid was disproportionately high(23/41) in the GWGR viruses compared to the proportion of glutamic(26/78) in the GXGX viruses (p=0.007, Table 1), indicating that thealternative amino acids might be detrimental to the GWGR viruses.Conversely, in the R5 viruses the quantity of aspartic acid at site 25was high (182/268) in comparisonwith the proportion of aspartic acid(11/56) observed in the X4 viruses (p=0.001, Table 1).

Predicted coreceptor usage

We noticed some differences regarding the frequencies of aminoacids at sites 11 and 25 of the V3 loop region between the GWGR andGXGR viruses since these sites are linked with CCR5-to-CXCR4coreceptor shift. We were then interested in determining whetherthis distinction between the two viruses was also observed in thecontext of corereceptor usage, as predicted by the position-specificscoring matrices (PSSM) method. To that end, we determined thePSSM score for each sequence of the GWGR and GXGR viruses using

the x4r5 matrix (Jensen et al., 2003). We also determined the PSSMscores of 268 sequences of R5 viruses and 56 sequences of X4 viruses.The overall PSSM scores of the GWGR, GXGX, R5, and X4 data sets areplotted in Fig. 2. The medians, 25–75% intervals, and range of non-outliers for the GWGR, GXGX, and R5 viruses have negative values thatdid not differ significantly between these data sets. On the other hand,the X4 viruses have positive values that are significantly distinct fromthe other data sets. This is expected because high PSSM scores areindicative of CXCR4-using viruses. In this context, we detected sixisolates (6/78) of the GXGX data set with high PSSM scores (opendiamonds in Fig. 2) that are within the non-outlier range for the X4viruses, therefore suggesting that these isolates are CXCR4-usingviruses.

Intrahost analysis of the V3 loop

The above analyses were performed on sequences from unlinkedindividuals (interhost) and therefore reveal the evolution of GWviruses at the population level. To further our understanding of theevolution of GW viruses during intrahost infection, we studiedsequences of a region of the env gene from an HIV-1-infected patient.These sequences were from molecular clones sampled duringantiretroviral therapy, and again at 2 weeks and 12 weeks afterstructured therapy interruption. In a previous study, we noted thatcomplete discontinuation of therapy-induced recrudescence of wild-type drug-sensitive viruses (Silva et al., 2006). Specifically, drug-resistant viruses with a GGGR tetramer at the tip of the V3 looppredominated during antiretroviral therapy, whereas at 12 weeksafter the interruption of therapy, the intrahost population wascompletely replaced by viruses presenting the GWGR tetramer atthe tip of the V3 loop. However, the less glycosylated pattern of X4variants has been associated with structural modification of the V3loop region during coreceptor shift (Ogert et al., 2001). Our intrahostanalysis indicated that the GW sequences presented a lowerproportion of N-linked glycosylation sites than did the GG sequences.In the GW sequences, the proportions of N-linked glycosylation,specifically at two sites prior to the V3 loop, were 60% and 50%,

225É. Leal et al. / Virology 381 (2008) 222–229

respectively, comparedwith 100% at both sites in the GG sequences. Inaddition, the GG sequences also presented basic amino acids at sites 11and 25 of the V3 loop that typically confer the status of CXCR4-using toHIV-1. In contrast, the GW sequences had an uncharged (serine) andnegative amino acid at V3 loop sites 11 and 25, respectively.

Intrahost V3 loop evolution

For the intrahost HIV-1 population, the phylogenetic analysisrevealed a clearer scenario concerning the dynamics of the GW andGG variants. As can be seen in Fig. 3, the maximum a posteriori (MAP)phylogenetic tree shows that the GW sequences formed two clusterslocated near the root, whereas the GG sequences formed a unique andwell-defined cluster distant from the root. We also used themaximumlikelihood approach to reconstruct the ancestral sequences of thisintrahost population. This reconstruction (based on a midpoint rootedMAP tree) showed that the most recent common ancestral (MRCA)virus of the intrahost population most likely had a GWGR tetramer atthe tip of the V3 loop (Fig. 3). In addition, the reconstructed ancestralstates showed an extensive number of amino acid substitutions,including basic amino acids, at sites 11 and 25 (sites 30 and 44,respectively, in our alignment), that emerged only in the lineage of GGviruses. Notably, most of the changes in the GG lineage, including an

Fig. 3. Phylogenetic tree of intrahost viruses. Maximum a posteriori tree of HIV-1 from sequeThe numbers above the branches indicate the posteriori probability support for each clustereconstruction of amino acid substitutions that occurred in the lineages are indicated by one-The dotted area indicates amino acid changes with residues inside the V3 loop region. The sitereplacement in the V3 loop tetramer. The sequence B.US.98.1058_11 (AY331295) was used a

increase in the number of positive charged amino acids, occurredwithin the V3 loop.

Modeled structure of GW and GG V3 loop

To gain additional insights about the changes associated with theGWGR to GGGR shift, we evaluated the characteristics of the threedimensional (3D) structure of the V3 loop predicted from thesevariants. To construct the 3Dmodels, we used the ancestral sequencespreviously reconstructed with ML analysis. Initially, a blast search wasperformed in order to identify the best template structure. The best-scored structure was from Gp120 of the HIV-1 JR-FL variant(PDB=2b4c). Next, we extracted the V3 loop region from the 2b4cstructure and used it as a backbone template to model the side chainsof the GW and GG ancestral sequences. Our modeled structures werequite similar to the 2b4c template, and the quality of thesemodels wasevaluated by comparing the Ramachandran plot of the V3 loop of thetemplate with plots of the GW and GG structures (data not shown).The V3 loop structure of the GW and GG variants indicated that theywere similar to each other, and differences in the structure were dueto side chains of some amino acids, such as the tetramer at the tip ofthe V3 tetramer (grey region in Fig. 4). Additionally, the most strikingdifference between the V3 structures of the GW and GG variants was

nces of an individual infected by GWGR lineage that later was replaced by a GGGR virus.r of the tree. Grey areas indicate the main clusters. (A) Maximum likelihood ancestralletter amino acid code, along with the position in the alignment and the charge residues.s 11 and 25 of V3 are underlined. The arrow indicates the tryptophan (W)-to-glycine (G)s outgroup.

Fig. 4. Structures of the V3 loop of GWGR and GGGR variants. Models of 3-D structures of the V3 loop of ancestral sequences of GWGR and GGGR variants from intrahost samples aredepicted (panels A and B, respectively). The models were based on the Gp120 structure of a HIV-1 subtype B isolate (PDB=2b4c). The tetramer at the tip of the V3 loop is depicted ingrey, and the hydrophilic base of coreceptor-binding region (N6N7) is depicted in green in the structures. The replacement from asparagine (N) to lysine (K) at site 7 of V3 is alsodepicted inmagenta (panel B). Likewise, replacement of Glycine (G) to Glutamic (E) at site 24 of the V3 loop in the GG variants is depicted in cyan (Panel B) in the binding pocket of theV3 structure. Sites 11 and 25 of V3 usually linked with CCR5-to-CXCR4 coreceptor shift are also depicted (yellow) in the structures.

226 É. Leal et al. / Virology 381 (2008) 222–229

seen in the coreceptor-binding region of the V3 loop (stem region ofthe V3 loop in Fig. 4). This region has a hydrophilic base formed byuncharged asparagines at sites 6 and 7 of the V3 loop (depicted ingreen in Fig. 4), where CCR5 binds and interacts predominantly withthe asparagine at site 7 (N7) (Huang et al., 2007). In our case, the sidechains of amino acids found in the V3 loop of the GG variants had agreat impact on the structure of the pocket region. For example, theside chain of glutamic acid (E) at site 24 of the V3 loop found in the GGvariants partially occluded the binding pocket (cyan in panel B of Fig.4). Interestingly, a replacement from asparagine (N) to lysine (K) at site7 of the V3 loop was observed in the GG variants (magenta in panel Bof Fig. 4), which could completely abrogate CCR5 binding.

Discussion

Our study indicates that the GWGR and GXGX viruses were quitedistinct in terms of the frequencies of amino acids at V3 sites 14 and 25(Fig. 1). The amino acid most frequently identified at site 14 of theGWGR viruses was methionine (61%), whereas, in the GXGX viruses,that distinction was held by isoleucine (54%). This disparity betweenthe GWGR and GXGX viruses in terms of the proportions ofmethionine and isoleucine might be related to the founder effect,and the impact that such a disparity has on the fitness of HIV-1 mightbe irrelevant, since the two amino acids share the same hydrophobiccharacteristics. However, the differences at site 25 might have asignificant impact on viral replicative fitness. Specifically, the fitnesseffect of different amino acids at site 25 of the V3 loop of HIV-1subtype B has been demonstrated by site-directed mutagenesis in R5variants (Hung et al., 1999). This study showed that, at site 25, theresidues that were the most advantageous for virus infectivity wereaspartic acid and glutamic acid, whereas hydrophobic amino acids

drastically reduced the fitness of HIV-1. In addition, a strong relation-ship has been demonstrated between viral infectivity and thefrequency of certain amino acids at site 25 of the V3 loop in R5variants of HIV-1 (da Silva, 2006). Furthermore, the most commonamino acid at site 25 of the V3 region (aspartic acid) is associated withthe highest level of viral infectivity (da Silva, 2006). Consequently,aspartic acid provides the best replicative fitness for R5 variants ofHIV-1. Moreover, the second most common amino acid at site 25(glutamic acid) is also associated with high levels of viral infectivity;however, its selection coefficient, relative to aspartic acid, is 0.18 (daSilva, 2006). This means that viruses presenting glutamic acid at site25 of the V3 loop in R5 variants are 18% less fit than those presentingaspartic acid at the same site.

The high variability of the V3 region has been extensivelydescribed, and positive selection has been implicated in the main-tenance of such diversity, in individuals as well as at the populationlevel (Leal et al, 2007; Lemey et al., 2007; Shankarappa et al., 1999;Yang et al, 2003). The source of this selective pressure has also beenextensively studied. It is likely that the principal force driving theevolution of the V3 region of HIV-1 is cell receptor usage, escape fromhost immune response, or a combination of the two (Ross and Rodrigo,2002; Williamson, 2003). In particular, coreceptor usage (i.e., CXCR4usage) has been implicated as the main cause of the high diversityobserved at site 25 of the V3 loop (da Silva, 2006). In GWGR viruses,the proportions at site 25 are 56% (23/41) for glutamic acid and only17% (7/41) for aspartic acid, clearly indicating that glutamic acid is theamino acid preferred by these viruses. In contrast, in R5 viruses,aspartic acid is by far the most frequent amino acid (68%; 182/268) atsite 25 of the V3 loop. Equal balance between the proportion ofaspartic acid and glutamic (D:E) acid was observed in the GXGXviruses and X4 variants (Table 1). Interestingly, the proportion of

227É. Leal et al. / Virology 381 (2008) 222–229

aspartic acid plus glutamic acid (D+E) in the GXGX viruses was 73%(57/78) of the total amount of amino acids observed at site 25 of theV3 loop, but in the X4 viruses, the proportion was 41% (23/56). Thelower proportion of aspartic and glutamic acids in the X4 variant is inagreement with the observation that the R5-to-X4 shift is accom-panied by basic mutations at sites 11 and 25 of the V3 loop, the so-called 11/25 rule. However, if the glutamic acid–aspartic acidequilibrium is principally driven by selection based on coreceptorusage, then both acids are equally advantageous at site 25. Therefore,the different replicative fitness reported by da Silva (2006) might beneglected in HIV-1, since glutamic and aspartic acid are maintained athigh frequencies in GXGX viruses and X4 variants. In addition, theimbalance in favor of glutamic acid at site 25 in GWGR viruses mightindicate distinct preferences in coreceptor usage by viruses with W atthe tip of V3 loop. Alternatively, immune pressure exerted on GWGRviruses could differ drastically from that observed in GXGR viruses;however, since the GWGR and GXGX viruses are co-circulating in thesame host population, it is unlikely that these viruses would elicitdistinctive host responses to the infection.

The intrahost samples showed that HIV-1 variants containingbasic mutations at sites 11 and 25 also replaced the larger tryptophanwith the tiny glycine. Although the GG viruses were not empiricallycharacterized by their coreceptor usage, the presence of basic aminoacids at both sites typically confers HIV-1 X4 phenotype status (DeJong et al., 1992). An additional feature that is considered a strongselective pressure is the variation outside epitopes at N-linkedglycosylation sites mediated by humoral immune escape antibodyattack: the glycan shield mechanism (Wei et al., 2003). During theinitial phase of HIV-1 infection, the virus needs to escape theantibody attack, and this is achieved by acquisition of mutations atN-linked glycosylation sites, resulting in structural modifications atthe surface of sugar moieties on the viral envelope. These mutationspermit the HIV to avoid the neutralizing antibodies in order to bindto the viral surface proteins, thus allowing the virus to escape fromthe humoral immune surveillance (Dacheux et al., 2004). Althoughthis mechanism is not directly associated with viral diversification, itis strongly linked to host immune pressure. In addition, we becameinterested in this issue because the reduced glycosylation in the X4variants had been associated with the structural modification of theV3 loop during the coreceptor shift (Ogert et al., 2001). Theproportion of N-linked glycosylation sites also changed with theemergence of GG viruses. It has been demonstrated that this sort ofmodification within or near variable regions of the env gene affectsHIV-1 coreceptor usage and cell tropism (Ogert et al., 2001).Consequently, the amino acid modifications observed in our analysismight be linked to the R5-to-X4 shift.

Binding of gp120 to CD4 also causes conformational modificationsthat project the V3 loop to interact with the cell coreceptor, and thisprotrusion renders the V3 loop flexible in order to interact with thecoreceptor (Hung et al., 1999; Kwong et al., 1998). It has been shownthat intrahost evolution of the V3 loop structure is highly diverse andstructurally flexible throughout the infection (Watabe et al., 2006).Therefore, structural modifications seem to be related to immunepressures and interactions with the cell coreceptor (Sander et al.,2007). Indeed, monoclonal antibodies directed against the D19epitope within the V3 region neutralize only the X4 variants (Lussoet al., 2005). Consequently, inaccessibility of this antibody to R5 mightindicate that there are significant V3 loop conformational differencesbetween these two variants (Sander et al., 2007). In fact, it has beendemonstrated that intricate electrostatic interactions occur amongamino acid residues in order to preserve the proper tertiary structureof the V3 loop (Rosen et al., 2006). This sort of structural modificationwas also observed in our intrahost analysis, where the 3D-structuresof V3 in the GW and GG variants differ primarily in the coreceptor-binding pocket region. Furthermore, it has been demonstrated thatCCR5 interacts with the conserved sequence (P4N5N6N7) of the V3

loop, and that binding of this coreceptor is blocked when N7 isreplaced by charged amino acid (Huang et al., 2007). Similarly, ourintrahost analysis also showed that the polar asparagine (N) at site 7 ofthe V3 loop was replaced by a positive lysine (K) in the GG variants.

The results of the present study show that the composition ofcharged amino at sites 11 and 25 of the V3 loop is quite distinctbetween the GWGR and GXGX viruses. Consequently, the coreceptorpreference of these viruses might be equally affected. In addition, ourintrahost analysis showed that basic residues were present at sites 11and 25 of the V3 loop only when tryptophan at the GWGR tetramer(GW viruses) was replaced by glycine in the GG viruses. GWGR isolateswere characterized by a shortage of basic amino acids at sites 11/25 ofthe V3 loop and by the absence of X4 variants based on the phenotypeprediction. In addition, 3D models of intrahost viruses showedstructural changes in the CCR5 binding pocket region when thetryptophan at the tip of V3was replaced by a glycine in variants withinthe host. Although we have analyzed a limited number of sequencesand the tryptophan replacement was observed in only one individual,our findings suggest that there are constraints on the envelope of HIV-1 to use the CXCR4 coreceptor in strains with the GWGR tetramer atthe tip of the V3 loop. Consequently, one way to maintain the propertertiary conformation and to use the CXCR4 coreceptor would be thereplacement of the tryptophan.

Finally, our results belie the concept that GWGR viruses are lesspathogenic, based on the rate of disease progression, than are othersubtype B strains in Brazil (Santoro-Lopes et al., 2000). Since X4variants are generally associated with a rapid decline in CD4+ cellcounts and accelerated progression to AIDS (Schuitemaker et al., 1992;Shankarappa et al., 1999), any comparisons between the GWGR andGPGR viruses should incorporate the profile of coreceptor usage. Thepotential pitfall resides in the fact that GWGR viruses might only haveR5 variants, whereas GPGR viruses might have R5 and X4 variantsindiscriminately. Consequently, there will be a bias promoting theincorrect notion that infectionwith a GPGR virus results in lower CD4+cell counts or more rapid progression to AIDS than does infectionwitha GWGR virus.

In summary, we thoroughly analyzed sequences of HIV-1 subtype Bwith the GWGR tetramer at the tip of the V3 loop from virusescirculating in Brazil. We find a considerable difference between theGWGR and GXGX viruses in terms of the amino acid profile in the V3region, particularly at sites 11 and 25, which are associated with theR5-to-X4 coreceptor shift. Notably, the scarcity of basic amino acids atsites 11 and 25 within the V3 loop of GWGR viruses might indicatelimited usage of the alternative coreceptor. Further evidence of theconstrained use of the CXCR4 coreceptor by the GWGR viruses camefrom our intrahost analysis, in which the tryptophan in the tetramerwas found to be replaced byglycine only in isolateswith X4 amino acidprofiles (i.e., basic amino acids at sites 11 and 25 of the V3 loop). In viewof these facts, we hypothesize that the V3 loop of GWGR viruses mightprevent the electrostatic interaction of charged amino acids necessaryto maintain the proper V3 loop structure after coreceptor binding.

Materials and methods

Samples

The informed consent for collecting blood samples and the protocolof this study was approved by the Federal University of São Paulo. Wegenerated 41 sequences from GWGR viruses and 78 sequences fromGXGR viruses, all taken from a 219-bp fragment of the V3 region of theenv gene (gp120) of HIV-1. These sequences were all taken fromBrazilian individuals clinically asymptomatic, naïve to drug therapy,and with CD4 levels above 200 cells/mm3. The samples were collectedbetween1983 and2000. The remaining sequenceswere obtained fromthe HIV Sequence Database (http://hiv-web.lanl.gov). We also ana-lyzed 32 sequences; these were obtained over time from a single

228 É. Leal et al. / Virology 381 (2008) 222–229

individual under therapywith CD4 levels below 200 cells/mm3, from a270-bp fragment of the V3 region of HIV-1 subtype B, sampled duringantiretroviral therapy and again at 2 weeks after and 12 weeks afterstructured therapy interruption. These fragments of the V3 region ofthe env gene (gp120) of HIV-1 were amplified and sequenced usingprimers and conditions previously described (Silva et al., 2006).

Sequence alignment

The sequences were initially aligned using the ClustalX program(Thompson et al., 1997). All sites with deletions and insertions werethen excluded in order to preserve the reading frames of the genes.After this editing process, the sequences were manually aligned usingthe SE-AL program, version 2.0 (http://evolve.zoo.ox.ac.uk/software/).

Prediction of coreceptor usage

In addition to the presence of basic amino acids at sites 11 and 25 ofthe V3 loop (11/25 rule), we also used the position-specific scoringmatrices (PSSM)method to analyze our sequences (Jensen et al., 2003).The PSSM detects nonrandom distributions of amino acids in a set ofsequences that may be associated with some empirically determinedproperty (for example, coreceptor usage) and constructs a matrix. Thematrix is thenused to compare the amino acids of query sequences andPSSM assigns scores to each query sequence. Consequently, thesescores can be used to predict the likelihood of a query sequence havingthe property of interest. We predicted the coreceptor usage of oursequences using a PSSMmatrix (x4r5) constructed with the V3 regionof HIV-1 subtype B with a known X4 or R5 phenotype (http://mullinslab.microbiol.washington.edu/computing/pssm/). In this con-text, high scoring sequences are more likely to use the CXCR4coreceptor, whereas low scoring sequences are more likely to use theCCR5 coreceptor.

Proportion of N-linked glycosylation sites

The N-linked glycosylation sites were determined by identifyingthe NX[ST]Y pattern (where X can be any amino acid). Amino acids,such as proline, in the X or Y position can be important determinantsof N-linked glycosylation efficiency (Zhang et al., 2004). The analyseswere performed using the web interface available at http://www.hiv.lanl.gov/content/hiv-db/GLYCOSITE/glycosite.html.

Phylogenetic inference

To infer the intrahost phylogenetic tree, we used Bayesianinference, assuming GTR and a gamma correction model, as isimplemented in the MrBayes program, version 3.1.2 (Huelsenbeckand Ronquist, 2001). For this analysis, two independent Markov chainMonte Carlo (MCMC) were run for 107 generations with the initial 10%of each run discarded as burn-in. The convergencewas evaluated usingthe TRACER software, version 1.2 (http://beast.bio.ed.ac.uk/), and runswere acceptedwhenall parameters presented the effective sample sizenumber (ESS) greater than 100. The ancestral reconstruction of the V3loop sequence used the model 2 (M2: selection) that allows differentproportions of conserved sites (dN/dS (ω)=0), neutral sites (ω=1), andan additional class of sites, with itsω ratio (which can be N1) estimateddirectly from the data. We then mapped the inferred sites on thephylogenetic intrahost tree. This reconstruction was performed usingthe CODEMLprogram from the PAMLv. 3.14 package (Yanget al., 2000).

3-D structure prediction

Ancestral reconstructed sequences from intrahost samples wereused to build the V3 structures of the GWGR and GGGR viruses. In orderto identify structural homologues in the protein data bank (http://www.

pdb.org), we used sequences of 35 amino acids (V3) and performed psi-BLASTsearches using theMeta-Sever at Centre de Biochimie Structurale(http://bioserv.cbs.cnrs.fr/HTML_BIO/frame_meta.html). Based onthese searches, the structure 2b4c, corresponding to the proteinGp120 of a HIV-1 CCR5-using variant, was retrieved, and from thisstructurewe then extracted theV3 loop region in theG chain andused itas a template to build themodels. The C-α backbone of the 2b4cV3 loopstructure was used to model the side chain coordinates of the ancestralsequences of the GWGR and GGGR variants from intrahost samples ofHIV-1 using the SCRWL software (Canutescu et al, 2003). Visualizationand editing of the structures were done using the PyMOL software(http://www.pymol.org).

Acknowledgments

Wewould like to thank Hirohisa Kishino for his valuable commentsand suggestions on this work. We also would like to thank twoanonymous reviewers for their criticisms and comments that havegreatly improved this work. EL is a postdoctoral researcher supportedby FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo,Foundation for the Support of Research in the State of São Paulo; grantsno. 04/10372-3, 07/52841-8).

References

Berger, E.A., Murphy, P.M., Farber, J.M., 1999. Chemokine receptors as HIV-1 coreceptors:roles in viral entry, tropism, and disease. Annu. Rev. Immunol. 17, 657–700.

Boyd,M.T., Simpson, G.R., Cann, A.J., Johnson, M.A.,Weiss, R.A.,1993. A single amino acidsubstitution in the V1 loop of human immunodeficiency virus type 1 gp120 alterscellular tropism. J. Virol. 67 (6), 3649–3652.

Canutescu, A.A., Shelenkov, A.A., Dunbrack Jr., R.L., 2003. A graph-theory algorithm forrapid protein side-chain prediction. Protein. Sci. 12 (9), 2001–2014.

Choe, H., Farzan, M., Sun, Y., Sullivan, N., Rollins, B., Ponath, P.D., Wu, L., Mackay, C.R.,LaRosa, G., Newman, W., Gerard, N., Gerard, C., Sodroski, J., 1996. The beta-chemokine receptors CCR3 and CCR5 facilitate infection by primary HIV-1 isolates.Cell 85 (7), 1135–1148.

Coakley, E., Petropoulos, C.J., Whitcomb, J.M., 2005. Assessing chemokine co-receptorusage in HIV. Curr. Opin. Infect. Dis. 18 (1), 9–15.

da Silva, J., 2006. Site-specific amino acid frequency, fitness and the mutationallandscape model of adaptation in human immunodeficiency virus type 1. Genetics174 (3), 1689–1694.

Dacheux, L., Moreau, A., Ataman-Onal, Y., Biron, F., Verrier, B., Barin, F., 2004.Evolutionary dynamics of the glycan shield of the human immunodeficiencyvirus envelope during natural infection and implications for exposure of the 2G12epitope. J. Virol. 78 (22), 12625–12637.

De Jong, J.J., De Ronde, A., Keulen, W., Tersmette, M., Goudsmit, J., 1992. Minimalrequirements for the human immunodeficiency virus type 1 V3 domain tosupport the syncytium-inducing phenotype: analysis by single amino acidsubstitution. J. Virol. 66 (11), 6777–6780.

Frost, S.D., Wrin, T., Smith, D.M., Kosakovsky Pond, S.L., Liu, Y., Paxinos, E., Chappey, C.,Galovich, J., Beauchaine, J., Petropoulos, C.J., Little, S.J., Richman, D.D., 2005.Neutralizing antibody responses drive the evolution of human immunodeficiencyvirus type 1 envelope during recent HIV infection. Proc. Natl. Acad. Sci. U. S. A. 102(51), 18514–18519.

Huang, C.C., Lam, S.N., Acharya, P., Tang, M., Xiang, S.H., Hussan, S.S., Stanfield, R.L.,Robinson, J., Sodroski, J., Wilson, I.A., Wyatt, R., Bewley, C.A., Kwong, P.D., 2007.Structures of the CCR5 N terminus and of a tyrosine-sulfated antibody with HIV-1gp120 and CD4. Science 317 (5846), 1930–1934.

Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogenetic trees.Bioinformatics 17 (8), 754–755.

Hung, C.S., Vander Heyden, N., Ratner, L., 1999. Analysis of the critical domain in the V3loop of human immunodeficiency virus type 1 gp120 involved in CCR5 utilization.J. Virol. 73 (10), 8216–8226.

Jensen, M.A., Li, F.S., van 't Wout, A.B., Nickle, D.C., Shriner, D., He, H.X., McLaughlin, S.,Shankarappa, R., Margolick, J.B., Mullins, J.I., 2003. Improved coreceptor usageprediction and genotypic monitoring of R5-to-X4 transition by motif analysis ofhuman immunodeficiency virus type 1 env V3 loop sequences. J. Virol. 77 (24),13376–13388.

Kuiken, C.L., Leitner, T., 1999. HIV-1 subtyping. In: a. G. R. L. Rodrigo A.G (Ed.),Computational and evolutionary analysis of HIV molecular sequences, Volume 1.Kluwer Academic Publisher, Norwell, Massachusetts, USA, pp. 27–53.

Kwong, P.D., Wyatt, R., Robinson, J., Sweet, R.W., Sodroski, J., Hendrickson, W.A., 1998.Structure of an HIV gp120 envelope glycoprotein in complex with the CD4 receptorand a neutralizing human antibody. Nature 393 (6686), 648–659.

Leal, E., Janini, M., Diaz, R.S., 2007. Selective pressures of human immunodeficiencyvirus type 1 (HIV-1) during pediatric infection. Infect. Genet. Evol. 7 (6), 694–707.

Lemey, P., Kosakovsky Pond, S.L., Drummond, A.J., Pybus, O.G., Shapiro, B., Barroso, H.,Taveira, N., Rambaut, A., 2007. Synonymous substitution rates predict HIV diseaseprogressionas a result of underlying replicationdynamics. PLoS. Comput. Biol. 3 (2), e29.

229É. Leal et al. / Virology 381 (2008) 222–229

Lusso, P., Earl, P.L., Sironi, F., Santoro, F., Ripamonti, C., Scarlatti, G., Longhi, R., Berger, E.A.,Burastero, S.E., 2005. Cryptic nature of a conserved, CD4-inducible V3 loopneutralization epitope in the native envelope glycoprotein oligomer of CCR5-restricted, but not CXCR4-using, primary human immunodeficiency virus type 1strains. J. Virol. 79 (11), 6957–6968.

Meng, G., Wei, X., Wu, X., Sellers, M.T., Decker, J.M., Moldoveanu, Z., Orenstein, J.M.,Graham, M.F., Kappes, J.C., Mestecky, J., Shaw, G.M., Smith, P.D., 2002. Primaryintestinal epithelial cells selectively transfer R5 HIV-1 to CCR5+ cells. Nat. Med. 8(2), 150–156.

Morgado, M.G., Sabino, E.C., Shpaer, E.G., Bongertz, V., Brigido, L., Guimaraes, M.D.,Castilho, E.A., Galvao-Castro, B., Mullins, J.I., Hendry, R.M., 1994. V3 regionpolymorphisms in HIV-1 from Brazil: prevalence of subtype B strains divergentfrom North American/European prototype and detection of subtype F. AIDS. Res.Hum. Retroviruses. 10 (5), 569–576.

Ogert, R.A., Lee, M.K., Ross, W., Buckler-White, A., Martin, M.A., Cho, M.W., 2001.N-linked glycosylation sites adjacent to and within the V1/V2 and the V3 loops ofdualtropic human immunodeficiency virus type 1 isolate DH12 gp120 affectcoreceptor usage and cellular tropism. J. Virol. 75 (13), 5998–6006.

Potts, K.E., Kalish, M.L., Lott, T., Orloff, G., Luo, C.C., Bernard, M.A., Alves, C.B., Badaro, R.,Suleiman, J., Ferreira, O., 1993. Genetic heterogeneity of the V3 region of the HIV-1envelope glycoprotein in Brazil. Brazilian Collaborative AIDS Research Group. Aids 7(9), 1191–1197.

Rizzuto, C.D., Wyatt, R., Hernandez-Ramos, N., Sun, Y., Kwong, P.D., Hendrickson, W.A.,Sodroski, J., 1998. A conserved HIV gp120 glycoprotein structure involved inchemokine receptor binding. Science 280 (5371), 1949–1953.

Rosen, O., Sharon, M., Quadt-Akabayov, S.R., Anglister, J., 2006. Molecular switch foralternative conformations of the HIV-1 V3 region: implications for phenotypeconversion. Proc. Natl. Acad. Sci. U. S. A. 103 (38), 13950–13955.

Ross, H.A., Rodrigo, A.G., 2002. Immune-mediated positive selection drives humanimmunodeficiency virus type 1 molecular variation and predicts disease duration.J. Virol. 76 (22), 11715–11720.

Roux, K.H., Taylor, K.A., 2007. AIDS virus envelope spike structure. Curr. Opin. Struct.Biol. 17 (2), 244–252.

Sander, O., Sing, T., Sommer, I., Low, A.J., Cheung, P.K., Harrigan, P.R., Lengauer, T.,Domingues, F.S., 2007. Structural descriptors of gp120 V3 loop for the prediction ofHIV-1 coreceptor usage. PLoS. Comput. Biol. 3 (3), e58.

Santoro-Lopes, G., Harrison, L.H., Tavares, M.D., Xexeo, A., Dos Santos, A.C., Schechter, M.,2000. HIV disease progression and V3 serotypes in Brazil: is B different from B-Br?AIDS Res. Hum. Retroviruses. 16 (10), 953–958.

Scarlatti, G., Tresoldi, E., Bjorndal, A., Fredriksson, R., Colognesi, C., Deng, H.K., Malnati,M.S., Plebani, A., Siccardi, A.G., Littman, D.R., Fenyo, E.M., Lusso, P., 1997. In vivo

evolution of HIV-1 co-receptor usage and sensitivity to chemokine-mediatedsuppression. Nat. Med. 3 (11), 1259–1265.

Schuitemaker, H., Koot, M., Kootstra, N.A., Dercksen, M.W., de Goede, R.E., van Steenwijk,R.P., Lange, J.M., Schattenkerk, J.K., Miedema, F., Tersmette, M., 1992. Biologicalphenotype of human immunodeficiency virus type 1 clones at different stages ofinfection: progression of disease is associated with a shift from monocytotropic toT-cell-tropic virus population. J. Virol. 66 (3), 1354–1360.

Shankarappa, R., Margolick, J.B., Gange, S.J., Rodrigo, A.G., Upchurch, D., Farzadegan, H.,Gupta, P., Rinaldo, C.R., Learn, G.H., He, X., Huang, X.L., Mullins, J.I., 1999. Consistentviral evolutionary changes associated with the progression of human immunode-ficiency virus type 1 infection. J. Virol. 73 (12), 10489–10502.

Shioda, T., Levy, J.A., Cheng-Mayer, C., 1991. Macrophage and T cell-line tropisms of HIV-1 are determined by specific regions of the envelope gp120 gene. Nature 349(6305), 167–169.

Silva, W.P., Santos, D.E., Leal, E., Brunstein, A., Sucupira, M.C., Sabino, E.C., Diaz, R.S.,2006. Reactivation of ancestral strains of HIV-1 in the gp120 V3 env region inpatients failing antiretroviral therapy and subjected to structured treatmentinterruption. Virology 354 (1), 35–47.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. TheCLUSTAL_X windows interface: flexible strategies for multiple sequence alignmentaided by quality analysis tools. Nucleic Acids Res. 25 (24), 4876–4882.

Watabe, T., Kishino, H., Okuhara, Y., Kitazoe, Y., 2006. Fold recognition of the humanimmunodeficiency virus type 1 V3 loop and flexibility of its crown structure duringthe course of adaptation to a host. Genetics 172 (3), 1385–1396.

Wei, X., Decker, J.M.,Wang,S., Hui,H., Kappes, J.C.,Wu,X., Salazar-Gonzalez, J.F., Salazar,M.G.,Kilby, J.M., Saag, M.S., Komarova, N.L., Nowak,M.A., Hahn, B.H., Kwong, P.D., Shaw, G.M.,2003. Antibody neutralization and escape by HIV-1. Nature 422 (6929), 307–312.

Williamson, S., 2003. Adaptation in the env gene of HIV-1 and evolutionary theories ofdisease progression. Mol. Biol. Evol. 20 (8), 1318–1325.

Wu, L., Gerard, N.P., Wyatt, R., Choe, H., Parolin, C., Ruffing, N., Borsetti, A., Cardoso, A.A.,Desjardin, E., Newman, W., Gerard, C., Sodroski, J., 1996. CD4-induced interaction ofprimary HIV-1 gp120 glycoproteins with the chemokine receptor CCR-5. Nature384 (6605), 179–183.

Yang, Z., Nielsen, R., Goldman, N., Pedersen, A.M., 2000. Codon-substitution models forheterogeneous selection pressure at amino acid sites. Genetics 155 (1), 431–449.

Yang, W., Bielawski, J.P., Yang, Z., 2003. Widespread adaptive evolution in the humanimmunodeficiency virus type 1 genome. J. Mol. Evol. 57 (2), 212–221.

Zhang, M., Gaschen, B., Blay, W., Foley, B., Haigwood, N., Kuiken, C., Korber, B., 2004.Tracking global patterns of N-linked glycosylation site variation in highly variableviral glycoproteins: HIV, SIV, and HCV envelopes and influenza hemagglutinin.Glycobiology 14 (12), 1229–1246.