Analysis of Genomic Sequences of 95 Papillomavirus Types: Uniting Typing, Phylogeny, and Taxonomy

S. Y. Chan, H. Delius, A. L. Halpern, and H. U. Bernard. Analysis of genomic sequences of 95 papillomavirus types: uniting typing, phylogeny, and taxonomy. J. Virol. 1995, 69(5):3074. http://jvi.asm.org/content/69/5/3074

JOURNAL OF VIROLOGY, May 1995, p. 3074–3083, Vol. 69, No. 5. 0022-538X/95/$04.00+0. Copyright © 1995, American Society for Microbiology

Analysis of Genomic Sequences of 95 Papillomavirus Types: Uniting Typing, Phylogeny, and Taxonomy

SHIH-YEN CHAN,1 HAJO DELIUS,2 AARON L. HALPERN,3 AND HANS-ULRICH BERNARD1*

Laboratory for Papillomavirus Biology, Institute of Molecular and Cell Biology, National University of Singapore, Singapore 0511, Republic of Singapore (1); German Cancer Research Center, 69120 Heidelberg, Germany (2); and Theoretical Biology and Biophysics, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545 (3)

Received 26 September 1994/Accepted 16 February 1995

Our aim was to study the phylogenetic relationships of all known papillomaviruses (PVs) and the possibility of establishing a supratype taxonomic classification based on this information. Of the many detectably homologous segments present in PV genomes, a 291-bp segment of the L1 gene is notable because it is flanked by the MY09 and MY11 consensus primers and contains highly conserved amino acid residues which simplify sequence alignment. We determined the MY09-MY11 sequences of human PV type 20 (HPV-20), HPV-21, HPV-22, HPV-23, HPV-24, HPV-36, HPV-37, HPV-38, HPV-48, HPV-50, HPV-60, HPV-70, HPV-72, HPV-73, ovine (sheep) PV, bovine PV type 3 (BPV-3), BPV-5, and BPV-6 and created a database which now encompasses HPV-1 to HPV-70, HPV-72, HPV-73, seven yet untyped HPV genomes, and 15 animal PV types. Three additional animal PVs were analyzed on the basis of other sequence data. We constructed phylogenies based on partial L1 and E6 gene sequences and distinguished five major clades that we call supergroups. One of them unites 54 genital PV types, which can be further divided into eleven groups. The second supergroup has 24 types and unites most PVs that are typically found in epidermodysplasia verruciformis patients but also includes several types typical of other cutaneous lesions, like HPV-4. The third supergroup unites the six known ungulate fibropapillomaviruses, the fourth includes the cutaneous ungulate PVs BPV-3, BPV-4, and BPV-6, and the fifth includes HPV-1, HPV-41, HPV-63, the canine oral PV, and the cottontail rabbit PV. The chaffinch PV and two rodent PVs, Micromys minutus PV and Mastomys natalensis PV, are left ungrouped because of the relative isolation of each of their lineages. Within most supergroups, groups formed on the basis of cladistic principles unite phenotypically similar PV types. We discuss the basis of our classification, the concept of the PV type, speciation, PV-host evolution, and estimates of their rates of evolution.

During the last decade, papillomaviruses (PVs) have attracted increasing scientific attention, as they are quantitatively the most important group of viruses associated with benign and malignant neoplasia in humans (40, 73). Currently, more than 70 different human PV (HPV) types are known (15), and additional evidence from partial sequences indicates the existence of a further 13 which would qualify as novel HPV types (2, 5). Other PV types have also been identified in mammals and birds (61). The large number of PV types and poorly defined lower- (e.g., subtypes) and higher-order classifications (e.g., mucosal HPVs) have become a potential source of confusion, and it seems desirable to investigate the possibility of giving them some internal taxonomic organization. Before this is done, it is necessary to have a clear definition of the objects one is classifying and reliable hypotheses about their phylogeny.

PV types are the objects to be classified, and they are defined by genomic properties rather than serology; thus, the term serotype is not used. The original definition was by DNA hybridization criteria (9), but this method had a serious limitation in that the values determined for genomic similarities were frequently very different from and inconsistently related to nucleotide sequence similarities. In fact, PVs which show little or no relatedness by hybridization could still have a greater than 50% nucleotide similarity. This is due to the scattered distribution of short sequence homologies. To overcome this inconsistency, an HPV genome is now defined as a new HPV type when it shows a more than 10% dissimilarity in the combined nucleotide sequences of the E6, E7, and L1 genes when compared with those of any previously known type (15). This requires the sequencing of about 2.4 kb, or roughly one-third, of the genome of all new isolates and then sequence alignment and the calculation of dissimilarity according to certain operational criteria.

Phylogenetic research uses homologous features such as nucleotide sequences to reconstruct evolutionary relationships which are graphically represented as trees. As a result of large systematic sequencing efforts (for a review, see reference 13), the PVs have become the best-known DNA virus family and an important resource for the study of virus evolution (4, 7, 31, 50, 65). Since the first extensive phylogenetic studies of PVs were done (7, 65), there has been a doubling of the number of PV types for which complete or partial sequence information is known. There are indications that the rate at which new genital HPV types are detected is slowing down (5, 66), while only small efforts to detect novel PVs in animals are presently made. However, many new HPV genomes related to the epidermodysplasia verruciformis (EV) HPVs have recently been found in skin cancers from renal transplant patients (2, 57). Because of this increase in the number of known sequences, the advent of fresh analytical approaches, and the need for taxonomic investigation, we felt it would be helpful to undertake a revision of PV phylogeny.

Given the objects to be classified and their phylogeny, there are broadly three main taxonomic approaches one can take.

* Corresponding author. Mailing address: Laboratory for Papillomavirus Biology, Institute of Molecular and Cell Biology, National University of Singapore, Singapore 0511, Republic of Singapore. Fax: (65) 7791117.

The differences arise mainly from the different philosophical perceptions of what a taxonomic scheme should represent and what qualities it should possess. The cladistic approach uses evolutionary relationships deduced from shared derived characters, and it establishes taxonomic groups that must be monophyletic groups (clades) (60, 70). Pheneticists use an overall similarity measure (or its complement, the dissimilarity) of reliably measured phenotypic characters (59). Finally, evolutionary systematists use a combination of both criteria (43), differing from that of cladists in that they also accept nonmonophyletic groups as taxonomic groups. Genetic sequence data may be used by all the schools because the DNA sequence is both inherited material suitable for phylogenetic evaluation and also a phenotypic feature that can be determined reliably. There exist other phenotypic features such as virion morphology, protein structure, disease pathology, and immune responses that could conceivably enter into a taxonomic scheme. However, many of these either are poorly understood, are not distinctive enough to provide a foundation for a taxonomy, or may also have resulted from sharing an ancestral trait or from independent evolution of the trait. They, however, may play an adjunct role, and as knowledge of these features increases, it will be interesting to see how well they correlate to the existing groups. We adopted the cladistic methodology, as it is a particularly consistent and useful approach for classifying PVs.

MATERIALS AND METHODS

Origin of PV clones and sequences. Original references to the isolation of all the HPV types numbered 1 to 69 are contained in three reviews (13, 14, 66). These reviews and reference 48 also cite all publications of the complete genomic nucleotide sequences of 43 HPV types. We cite only those concerning specific statements made in this study. The sequences of all HPV types and all animal PVs not mentioned separately below were obtained from GenBank (Release 83.0) and have been recently published as part of a compendium (48). This compendium also contains partial sequences of seven HPV genomes which will probably attain type status after complete isolation and DNA sequencing. These seven clones were identified during several major epidemiological studies summarized in reference 5, and their codes (CP, IS, LV, and MM) refer to the respective epidemiological studies. Three additional genomes found during this latter study were assumed to be novel HPV types but have since then become identified as variants of HPV-70 (CP141), HPV-72 (CP4173-LVX100), and HPV-73 (MM9) (68). Since the genomic sequences of the reference clones of these three types were not yet available, we used the sequences of these three variants to represent these types.

The L1 sequences of HPV-20, HPV-21, HPV-22, HPV-23, HPV-24, HPV-36, HPV-37, HPV-38, HPV-48, HPV-50, HPV-60, and ovine PV were determined by one of us (H.D.) during an ongoing project to establish the complete genomic sequences of most PVs. The sequences of bovine PV type 3 (BPV-3) and BPV-6 were obtained with sequencing primers modeled after the nucleotide sequence of their presumed nearest relative, BPV-4 (34), and that of BPV-5 was obtained by amplifying it with PCR primers modeled after BPV-1. These sequences are available by anonymous file transfer protocol from the HPV database ([email protected]). The clones of HPV-20, HPV-21, HPV-22, HPV-23, HPV-24, HPV-36, HPV-37, HPV-38, HPV-48, HPV-50, and HPV-60 were obtained from E.-M. de Villiers at the Reference Center for Human Pathogenic Papillomaviruses, Heidelberg, Germany. The clone of ovine PV was supplied by Phillip Baird, Sydney, Australia, and the clones of BPV-3, BPV-5, and BPV-6 DNA were furnished by Saveria M. Campo, Glasgow, United Kingdom.

Data analysis. Sequence alignments, distances, and manipulations were done with the Genetics Computer Group Sequence Analysis Software Package version 7.3.1-UNIX (23) and the Multiple Aligned Sequence Editor (20). The pairwise simple distance (dissimilarity) was determined on aligned sequences, excluding gaps, and is the proportion of nucleotides that differed for a pair of sequences. Linear correlation analysis, with the best-fit line additionally constrained to pass through the origin, was performed with the FIT and REGRESS programs in Mathematica version 2.2 (Wolfram Inc., Champaign, Ill.). The difference between constrained and unconstrained fits was negligible. Distance matrix and maximum-likelihood phylogenies and distance matrix bootstrap analyses were done with the Phylogeny Inference Package version 3.5 (21, 22) and the fastDNAml version 1.0 program (49). Maximum and weighted parsimony analyses were conducted with the Phylogenetic Analysis Using Parsimony package version 3.1.1 (62). The weighted parsimony analyses (29) used an inverse frequency substitution matrix generated with the help of MacClade version 3.1.1 (41). Multiple Aligned Sequence Editor, Genetics Computer Group Software, and Phylogeny Inference Package runs were performed on a Silicon Graphics server running IRIX V.4 and a Tatung Sparc 5 model 85 running Solaris 2.3. Phylogenetic Analysis Using Parsimony and MacClade were run on a Macintosh Quadra 800 running System Software 7.1. Mathematica was run on an IBM PC-compatible 486 running OS/2 2.2.
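
The pairwise simple distance defined under Data analysis can be sketched in a few lines. This is an illustrative Python sketch only, not the Genetics Computer Group routine used in the study; the function name `simple_distance`, the gap character "-", and the two toy fragments are assumptions introduced here for illustration.

```python
def simple_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing nucleotides over aligned, gap-free positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Columns with a gap in either sequence are excluded, as in the text.
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return mismatches / compared


# Hypothetical aligned fragments (not real PV sequences).
toy_a = "ATGGC-TTAGC"
toy_b = "ATGACGTT-GC"
print(simple_distance(toy_a, toy_b))
```

Two of the eleven columns contain a gap and are skipped, so the distance is computed over the remaining nine positions.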

RESULTS

Alignment of homologous partial L1 sequences of 92 PV types. Strict amino acid conservation allows the alignment of several genomic segments among all presently known PV genomes. Specifically, good alignment is possible for the central portion of the E6 gene (10, 67), the DNA binding and the transcription activation domains of the E2 gene (24), and several gene segments of the E1, L2, and L1 genes (7, 48, 64). Alignment is not possible for several segments that have accumulated extensive substitutions, deletions, and insertions, and such segments include the overlap of the E2 hinge region and the E4 gene (16) and the E5 gene (1). The conserved segments are useful for phylogenetic analysis, provided that they are large enough to be sufficiently informative. In previous studies of two 152-bp and 132-bp segments of the E1 and L1 genes, we found similar relationships among 48 PV types irrespective of the nature of the evaluated genomic segment (7). Most of these affinities were in agreement with those found during another study of a 384-bp segment of the E6 gene (65).

Figure 1 shows the alignment of the amino acid sequences that correspond to part of the L1 gene of 92 PV types and presently represents the most extensive database for any PV segment. It resulted from the extension of research on the detectability of unknown genital HPV genomes (5) with the PCR primer pair MY09-MY11 (42). These primers amplify a segment of approximately 460 bp of the L1 gene of most genital HPV types. A 291-bp subsegment of this amplimer can be confidently aligned because of highly conserved amino acid residues which are indicated in Fig. 1. This particular subsegment excludes sequences corresponding to those of the MY09-MY11 primers and the 5′ portion of the segment which is of low conservation, with numerous small insertions and deletions (5). In the following, we refer to these sequences as the 291-bp L1 segments. The database did not include sequences for the colobus monkey PV (CgPV) (53), the only avian (chaffinch) PV (FPV) (46) analyzed so far, and the Micromys minutus (mouse) PV (MmPV) (67), because the relevant L1 sequences and the cloned genomes were not available. However, in addition to all the types whose full sequences are published, we did evaluate about 30 partially sequenced HPVs; BPV-3, BPV-4, and BPV-6, which do not have E6 genes (34); and animal PVs that have not attracted much sequencing effort.

The 291-bp L1 segment and E6 gene contain sufficient sequence information for typing. Given the large amount of data, the ready accessibility of the 291-bp L1 segment by PCR, and its utility for phylogenetic studies, we studied how it compared with the combined E6-E7-L1 segment for typing purposes. To the extent that the 291-bp L1 segment could be used to reliably replicate the definition of types determined with the full E6, E7, and L1 sequences, this would give us confidence that the typing of novel sequences with the 291-bp segment would yield the same result as would typing based on the current standard. A potential concern in this regard, and in using the 291-bp segment for phylogenetic purposes, was that such a small well-conserved segment might lack the resolving power required for these analyses.

Using the alignments from the HPV database compendium (48), we constructed a combined alignment of E6, E7, and L1 for all types for which the sequences of all three genes were available. After removing all columns from the combined alignment which contained gaps in one or more sequences, we calculated dissimilarities for each pair of sequences as the number of mismatches divided by the number of aligned positions compared (e.g., 25 mismatches out of 200 positions = 12.5%). After doing the same for the 291-bp L1 segment and for a 399-bp E6 segment (see below), we attempted a linear correlation analysis. The results plotted in Fig. 2 indicate that both the 291-bp L1 and the 399-bp E6 segment distances are highly linearly correlated with those of the E6-E7-L1 sequences used for typing (r = 0.99 in both cases), with gradients of 0.92 and 1.28, respectively. The gradients suggest that the 291-bp L1 segment sequence is on the average more conserved and that the 399-bp E6 segment sequence is less conserved than the E6-E7-L1 sequences. Thus, the current 10% criterion for typing is roughly equivalent to 9.2% and 12.8% criteria, respectively, if these other segments are used.

We further checked the distances between several of the most closely related pairs of HPV types to see if using the smaller 291-bp L1 distances would frequently lead to a disagreement with the current standard on the type status. Table 1 shows that even the most closely related types have distances that significantly exceed 0.092 for the 291-bp L1 segment, and therefore we conclude that the shorter 291-bp L1 distances do not normally lead to errors about type status. However, there are two pairs with clearly lower 291-bp distances, HPV-34 and HPV-64 (0.050) and HPV-44 and HPV-55 (0.057). The preliminary genomic sequence data for HPV-64 (15) in fact suggest that the HPV-34–HPV-64 pair may not be separate types. The verdict on the HPV-44–HPV-55 pair awaits more sequence information.
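
The threshold conversion and the Table 1 check described above amount to simple arithmetic, which the following Python sketch reproduces. The gradients, the 10% criterion, and all distances are copied from the text and Table 1; the names `segment_threshold` and `below_threshold` are hypothetical helpers, and the sketch is not the procedure used by the authors.

```python
# Typing criterion on the combined E6-E7-L1 sequences, as stated in the text.
STANDARD_THRESHOLD = 0.10
# Gradients of the best-fit lines reported for Fig. 2.
GRADIENTS = {"291-bp L1": 0.92, "399-bp E6": 1.28}


def segment_threshold(segment: str) -> float:
    """Scale the standard criterion by the segment's reported gradient."""
    return STANDARD_THRESHOLD * GRADIENTS[segment]


# Table 1: pair -> (E6-E7-L1 distance, 291-bp L1 distance).
TABLE_1 = {
    "HPV-2/HPV-27": (0.091, 0.100),
    "HPV-2/HPV-57": (0.153, 0.154),
    "HPV-3/HPV-10": (0.176, 0.143),
    "HPV-4/HPV-65": (0.151, 0.158),
    "HPV-5/HPV-12": (0.207, 0.186),
    "HPV-6/HPV-11": (0.152, 0.115),
    "HPV-7/HPV-40": (0.137, 0.129),
    "HPV-11/HPV-13": (0.231, 0.190),
    "HPV-18/HPV-45": (0.157, 0.136),
    "HPV-19/HPV-25": (0.137, 0.143),
}
# The two pairs singled out in the text (291-bp L1 distances only).
LOW_L1_PAIRS = {"HPV-34/HPV-64": 0.050, "HPV-44/HPV-55": 0.057}


def below_threshold(l1_distances: dict, threshold: float) -> list:
    """Return the pairs whose 291-bp L1 distance falls below the threshold."""
    return sorted(pair for pair, d in l1_distances.items() if d < threshold)


l1_cutoff = segment_threshold("291-bp L1")
table_1_l1 = {pair: l1 for pair, (_, l1) in TABLE_1.items()}
print(round(l1_cutoff, 3), round(segment_threshold("399-bp E6"), 3))
print(below_threshold(table_1_l1, l1_cutoff))
print(below_threshold(LOW_L1_PAIRS, l1_cutoff))
```

Under these numbers none of the Table 1 pairs falls below the scaled 291-bp L1 criterion, while both pairs named in the text do.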

FIG. 1. Alignment of the amino acid sequences corresponding to the highly conserved 291-bp L1 segments of 92 PV types. The upper part of the figure indicates the genomic position of this sequence in the L1 gene as well as those of the MY09-MY11 PCR primer pair. The solid arrows indicate amino acid residues that are strictly conserved in all PVs. These and many other highly conserved amino acid residues are useful for aligning the sequences and indicating the PV origin of partial genomic segments (5). Striped arrows indicate amino acid residues that strictly covary between genital and EV HPVs, i.e., the sequence MXYXH of genital PVs becomes LXQXN in EV HPVs. HPV-68, HPV-70, HPV-72, and HPV-73 sequences may have slight nucleotide sequence differences from the reference isolates. Abbreviations: LCR, long control region; BPV, bovine PV; CgPV, colobus monkey PV; COPV, canine oral PV; CRPV, cottontail rabbit PV; DPV, deer PV; EPV, European elk PV; FPV, chaffinch (avian) PV; HPV, human PV; MmPV, Micromys minutus (mouse) PV; MnPV, Mastomys natalensis (South African mouse) PV; OvPV, ovine (sheep) PV; PCPV, pygmy chimpanzee PV; RhPV, rhesus monkey PV. Note the absence of HPV-46, which is now considered to be a subtype of HPV-20 (15).

FIG. 2. Linear correlation analyses of all available 291-bp L1 (A) and 399-bp E6 (B) distances against the combined E6, E7, and L1 gene distances, the latter being derived from published alignments (48). Over 2,000 data points were plotted for each analysis. The best-fit line was constrained to pass through the origin by adjusting the model parameters within Mathematica.
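
A best-fit line constrained to pass through the origin has the closed-form least-squares slope b = Σxy / Σx². The Python sketch below illustrates that formula together with Pearson's r; it is not the Mathematica FIT/REGRESS procedure used for Fig. 2, and the five data points are hypothetical values chosen only to resemble a gradient near 0.92.

```python
import math


def slope_through_origin(xs, ys):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical distances: x = E6-E7-L1 distance, y = segment distance.
x_full = [0.10, 0.20, 0.30, 0.40, 0.50]
y_segment = [0.093, 0.183, 0.277, 0.368, 0.459]

gradient = slope_through_origin(x_full, y_segment)
print(gradient, pearson_r(x_full, y_segment))
# A 10% criterion on x corresponds to roughly 0.10 * gradient on y.
print(0.10 * gradient)
```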

The phylogeny of 92 PV types as determined on the basis of partial L1 sequences. Figure 3 shows a maximum-likelihood phylogeny of 92 PV types as determined on the basis of nucleotide sequences of the 291-bp L1 segments (Fig. 1). Neighbor-joining analysis gave results very similar to those given by maximum likelihood, regardless of distance corrections (results not shown). A maximum-parsimony analysis led to a large number of equally parsimonious trees. This finding, in conjunction with the highly divergent character of the sequences, suggests that there is a lot of noise due to back mutation and homoplasy and prompted the use of the weighted parsimony approach, which led to the tree shown in Fig. 4.

In both trees, the same two large branches that have been observed before (5, 7, 8, 65) are clearly present. One unites HPV types associated with genital lesions (genital HPVs, referred to in the following as supergroup A), and the other unites HPV types associated with EV (supergroup B). On branches positioned between them are most animal PVs and three HPVs associated with cutaneous lesions, namely, HPV-1, HPV-41, and HPV-63. One of these branches unites all fibropapillomaviruses from ungulates, and we call this supergroup C. A fourth branch unites BPV-3, BPV-4, and BPV-6 (supergroup D), and the fifth unites HPV-1, HPV-41, HPV-63, canine oral PV, and cottontail rabbit PV (supergroup E). Table 2 summarizes the PV types within these supergroups and further proposes groups within these supergroups formed of types with relatively close relationships. Details of the rationale behind this classification will be discussed below.

Both phylogenies are very similar, but we note the following
differences (see Table 2 for the group identities). (i) There are internal topology differences for supergroup A, e.g., the various positions of groups A1, A3, A5, and A6. (ii) There are internal differences for group B1. (iii) Supergroups C and D (all BPVs) together form a monophyletic group in Fig. 3 but not in Fig. 4 or in a bootstrapped consensus neighbor-joining tree (data not shown). (iv) Supergroup E, although apparently a clade, has members that are very distantly related.

Other causes of uncertainty arise from the large number of types, making the search for the optimal tree necessarily far from exhaustive, and from the further possibility of alternative weighting schemes. Nevertheless, an informal comparison of the many near-optimal trees and trees based on different weighting schemes (data not shown) does confirm that many of the groups of interest are stable. Such a comparison also reveals the more uncertain aspects of the phylogeny. Instability is particularly noteworthy for CP8061, LVX82, HPV-54, and rhesus monkey PV, although all still definitely belong to supergroup A. Neighbor-joining and bootstrapping analyses revealed a low value of confidence for group A9, which was occasionally dismembered in trees with different weighting schemes. Within supergroup B, instability was high for HPV-49. Finally, the inclusion of Mastomys natalensis PV within either supergroup C, D, or E was an alternative to a placement on a separate lineage.

FIG. 3. A phylogenetic tree of 92 PV types based on a maximum-likelihood evaluation of the 291-bp L1 segment. In addition, FPV, CgPV, and MmPV are included in this tree, but their relationships are indicated by dashed branches, because they were derived from different sequence data. For abbreviations, see the legend to Fig. 1.

FIG. 4. A phylogenetic tree of 92 PV types based on a weighted parsimony evaluation of the 291-bp L1 segment. Branch lengths are proportional to the numbers of weighted steps between reconstructed bifurcations of lineages. An initial unweighted analysis involving 10 random sequence addition replicates and nearest neighbor interchange searching yielded 217 equally parsimonious trees of 4,021 steps. An inverse-frequency substitution weight matrix was generated with MacClade 3.1.1 as follows. The average number over all 217 trees of each of the 12 possible nucleotide substitutions was calculated, counting only unambiguously reconstructed changes. Weights for each type of substitution were then calculated as the total number of changes of all types over the number of changes of that particular type. Weighted parsimony analyses were performed with these weights. Nearest neighbor interchange replicates (n = 75) yielded a single best tree with a weighted length of 35,564. Additional ad hoc searches involving subtree pruning-regrafting on user-defined starting trees yielded a better tree with a weighted length of 35,536, presented here. The abbreviations are as defined in the legend to Fig. 1.

TABLE 1. Comparison of pairwise distances for 10 very closely related HPV types

HPV pair         E6-E7-L1 distance   291-bp L1 distance
HPV-2–HPV-27     0.091               0.100
HPV-2–HPV-57     0.153               0.154
HPV-3–HPV-10     0.176               0.143
HPV-4–HPV-65     0.151               0.158
HPV-5–HPV-12     0.207               0.186
HPV-6–HPV-11     0.152               0.115
HPV-7–HPV-40     0.137               0.129
HPV-11–HPV-13    0.231               0.190
HPV-18–HPV-45    0.157               0.136
HPV-19–HPV-25    0.137               0.143

We included in Fig. 3 the relative positions of FPV, CgPV,
and MmPV. The 291-bp L1 sequences of these viruses were not known, and their relationships with the 92 PV types listed in Fig. 1 were derived from an E1 database in the case of CgPV (7), from E6 sequences in the case of MmPV (Fig. 4) (67), and from partial E1 and L1 sequences in the case of FPV (46).

The phylogeny of 57 PV types as determined on the basis of a 399-bp E6 segment. To compare our findings with those of a previous study of E6 gene sequences (65), we used the same segment of the expanded HPV database (13) to construct phylogenies. We refer to these data as the 399-bp E6 segment, because addition of some PV types required the assumption of insertions in the original 384-bp alignment. Figure 5 shows a neighbor-joining phylogeny done with these E6 sequences. It shows all previously sequenced HPV types in positions similar to those that have been previously published (65) and displays both the stable features and instabilities mentioned before, such as those for supergroup A, the distinction between group B1 and B2, and supergroup C. Supergroup D is missing, as these BPV types do not have E6 genes. On the basis of E6 sequences, supergroup E is very diverse. Cottontail rabbit PV was excluded from supergroup E because its E6 gene could not be confidently aligned. The overall similarities between the E6- and L1-based phylogenies support previous findings that PV gene trees are a good reflection of trees of PV types because of the apparent absence of intertype recombinations.

Distances are inadequate as an operational tool for higher
classification. The proliferation of PV types over the past years has led to the need for more general taxonomic categories. Given the success of the distance criteria in operationally defining types, it is tempting to assume that additional levels of classification may also be defined on the basis of larger distance thresholds. However, we found that it is impossible to establish criteria for supratype classification of PVs that are based purely on distance (or its equivalent, similarity) and that give results which are compatible with reconstructions derived by phylogenetic methods or which are operationally consistent.

The problem can be illustrated by the use of simple distances. As most nucleotide sequence distances between types of the same supergroup are less than 37% and those between types belonging to different supergroups are larger, one might consider defining a taxonomic category using this distance score as a cutoff point. When one examines the 291-bp L1 segments, one finds that HPV-33 and HPV-16 (both genital HPVs in supergroup A) are 24% dissimilar but that HPV-33 and HPV-38 (a cutaneous HPV in supergroup B) are 30% dissimilar and that HPV-16 and HPV-38 are 38% dissimilar. Given the 37% cutoff point, the distances from HPV-33 indicate that all three HPV types should be in the same supergroup, and yet given the pairwise distances of HPV-16 and HPV-38, these two types would have to be assigned to different supergroups. Similar conflicts can be found for many different types and at many different thresholds. Furthermore, this problem is not resolved under various distance corrections (data not shown).

TABLE 2. Classification of PVs on the basis of cladistic relationships

PV supergroup (a)

Supergroup A (genital HPVs)
  Group A1: HPV-32 and HPV-42 (100%)
  Group A2: HPV-3, HPV-10, HPV-28, and HPV-29 (100%)
  Group A3: HPV-61, HPV-62, HPV-72, CP6108, CP8304, and MM8 (94%)
  Group A4: HPV-2, HPV-27, and HPV-57 (100%)
  Group A5: HPV-26, HPV-51, HPV-69, ISO39, and MM4 (97%)
  Group A6: HPV-30, HPV-53, HPV-56, and HPV-66 (100%)
  Group A7: HPV-18, HPV-39, HPV-45, HPV-59, HPV-68, and HPV-70 (78%)
  Group A8: HPV-7, HPV-40, and HPV-43 (100%)
  Group A9: HPV-16, HPV-31, HPV-33, HPV-35, HPV-52, HPV-58, HPV-67, and RhPV (34%)
  Group A10: HPV-6, HPV-11, HPV-13, HPV-44, HPV-55, and PCPV (100%)
  Group A11: HPV-34, HPV-64, and HPV-73 (100%)
  Poorly resolved lineages within supergroup A: CgPV (291-bp L1 sequences unavailable), CP8061, LVX82, and HPV-54

Supergroup B (EV HPVs)
  Group B1: HPV-5, HPV-8, HPV-9, HPV-12, HPV-14, HPV-15, HPV-17, HPV-19, HPV-20, HPV-21, HPV-22, HPV-23, HPV-24, HPV-25, HPV-36, HPV-37, HPV-38, HPV-47, and HPV-49 (100%)
  Group B2: HPV-4, HPV-48, HPV-50, HPV-60, and HPV-65 (73%)

Supergroup C (ungulate fibropapillomaviruses)
  Group C1: BPV-1 and BPV-2 (94%)
  Group C2: DPV, EPV, and OvPV
  Isolated lineage within supergroup C with likely group rank: BPV-5

Supergroup D (BPVs causing true papillomas)
  Group D1: BPV-3, BPV-4, and BPV-6 (86%)

Supergroup E (animal and human cutaneous PVs)
  Group E1: HPV-1 and HPV-63
  Isolated lineages within supergroup E with likely group rank: COPV, CRPV, and HPV-41

Poorly resolved lineages which likely represent taxa of supergroup rank
  FPV (291-bp L1 sequences unavailable)
  MmPV (291-bp L1 sequences unavailable)
  MnPV

(a) The cladistic relationships were determined on the basis of the phylogenies depicted in Fig. 3 and 4. Supergroups A, B, C, D, and E are clades with well-recognized differences in biology. Simple distances based on the 291-bp L1 segment generally exceeded 0.37 for types from different supergroups. Within a supergroup, types from different groups generally had distances ranging from 0.30 to 0.37. Types within groups generally had distances ranging from 0.10 to 0.35. The large value for the upper boundary of this range is due to the large distances within group B1, which should be split into two or more smaller groups when the internal phylogeny is better known. Where applicable, the bootstrap values (100 replicates, neighbor-joining method) for the groups are indicated in parentheses as percentages. The abbreviations are the same as those defined in the legend to Fig. 1.

Figure 6 shows a distribution of the simple distances for
intragroup, between-group, and between-supergroup (including the isolated types) comparisons for the specific taxonomic groupings presented in Table 2. It shows that while the modal distances are clearly different, there is a considerable overlap of the distributions such that no consistent set of classificatory thresholds can be nominated as an operational criterion. While this set of groupings does not give the minimal overlap of distances, the problems mentioned above preclude the definition of virtually any useful grouping on the basis of distances alone.

Phylogenetic classification. Cladistic phylogenetic classifications reflect hypotheses about the evolutionary history of organisms (69, 70), and given the best hypothesis, all acceptable classifications must reflect monophyletic relationships (28). The same phylogeny may be used to establish alternative classifications which differ in the number of taxa and ranks, the decision being a matter of simplicity of use and consensus among taxonomic researchers. Table 2 reflects our proposal for a phylogenetic classification of the PVs which is aimed at minimizing category names while introducing supratype groupings at levels we believe will be biologically interesting. The choice of cladistic groups was guided by phenetic similarities in biology, disease pathology, and high bootstrap values; the results in large part reflect traditional groupings that PV biologists have found useful. Occasionally, the distance criterion was used to determine the taxonomic organization, such as that between and within supergroups C to E.

We have adopted informal taxon names such as group and
supergroup, which are useful for taxa within families or genera (70), but have avoided the use of formal taxon names, including the names of specific viruses. The use of alphanumeric names maintains the stability of the nomenclature in the face of possible changes to the phylogeny and group membership because it keeps the group nomenclature independent of its constituent taxon names.

By definition, a clade is formed by all species with shared
derived ancestral characters and their most recent common ancestor. Because of the lack of a fossil record, viral phylogenies cannot be firmly rooted, and so we cannot be sure of the temporal direction of homologous character changes, which is required to recognize monophyly. We used the rooting of the weighted parsimony analysis (62) and possible outgroups such as the BPVs to root the HPVs, as suggested by the evidence of virus-host coadaptation with strict host specificity (6, 7, 56), to partially overcome this uncertainty. Remarks about many of the groupings have been addressed in the HPV database compendium (48), and additional points will be raised in the Discussion.

FIG. 5. A phylogenetic tree of 57 PV types based on a neighbor-joining (Kimura two-parameter distances) evaluation of a 399-bp E6 gene alignment. The abbreviations are as defined in the legend to Fig. 1.
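
For readers unfamiliar with the distance measure named in the legend to Fig. 5, the Kimura two-parameter distance is obtained from the proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The sketch below is illustrative only: the function name `k2p_distance` and the skipping of gapped or ambiguous sites are assumptions made here, and it is not the software used to produce the tree.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguity code are skipped
    (an assumption of this sketch, not a rule taken from the paper).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # The logarithms are undefined for saturated sequences (1 - 2P - Q <= 0);
    # math.log then raises ValueError, which is left to the caller.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

For example, `k2p_distance("ACGTACGTAC", "GCGTACGTAC")` compares ten sites with a single transition and gives a distance of about 0.112, slightly larger than the uncorrected dissimilarity of 0.10.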

FIG. 6. A histogram of simple distances (dissimilarities) between pairs of taxa for the 291-bp L1 segment, separated into within-group, across-group but within-supergroup, and across-supergroup comparisons. Note the overlap in the distributions of the distances, indicating the impossibility of using distance thresholds to unambiguously define the groupings of interest (see text for details).
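
The conflict described in the text for a 37% cutoff can be reproduced with the three quoted dissimilarities. The sketch below is a minimal illustration, not the procedure used in this study; the function names and the choice of single-linkage grouping are assumptions introduced here.

```python
from itertools import combinations

# Pairwise dissimilarities (percent) for the 291-bp L1 segment, as quoted in the text.
DIST = {
    frozenset(("HPV-33", "HPV-16")): 24.0,
    frozenset(("HPV-33", "HPV-38")): 30.0,
    frozenset(("HPV-16", "HPV-38")): 38.0,
}


def threshold_groups(taxa, dist, cutoff):
    """Group taxa by linking every pair closer than the cutoff (single linkage)."""
    groups = {t: {t} for t in taxa}
    for a, b in combinations(taxa, 2):
        if dist[frozenset((a, b))] < cutoff and groups[a] is not groups[b]:
            merged = groups[a] | groups[b]
            for t in merged:
                groups[t] = merged
    # Several taxa share one set object; keep each distinct group once.
    unique = []
    for g in groups.values():
        if g not in unique:
            unique.append(g)
    return unique


def conflicts(groups, dist, cutoff):
    """Pairs placed in one group although their own distance is at or above the cutoff."""
    flagged = []
    for g in groups:
        for a, b in combinations(sorted(g), 2):
            if dist[frozenset((a, b))] >= cutoff:
                flagged.append((a, b))
    return flagged


TAXA = ["HPV-16", "HPV-33", "HPV-38"]
GROUPS = threshold_groups(TAXA, DIST, cutoff=37.0)
CONFLICTS = conflicts(GROUPS, DIST, cutoff=37.0)
print(GROUPS)
print(CONFLICTS)
```

With these values, linking through HPV-33 places all three types in a single group, while the HPV-16 and HPV-38 pair is flagged because its own distance exceeds the cutoff.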

DISCUSSION

This paper investigates how nucleotide sequence information may be used to reconstruct the genealogies of PVs as a prerequisite for a phylogenetic taxonomy. The genomic segment that we chose meets the strict criteria for establishing phylogenies, is sufficiently discriminating for typing purposes, and has the advantage over other conserved genomic segments in being easily amplified by PCR. In the following, we describe our view of the taxonomy of PVs at the levels of variants, subtypes, types, groups, and supergroups, which correspond to increasing phenetic levels of diversity. For completeness, we have included in this discussion variants and subtypes, although the analysis below the type level was not a part of this study.

Variants of PV types. Variants of a PV type differ from the
reference type (the clone sequenced to describe that type) by distances of up to 0.020 within genes and slightly more outside genes, a statement which reflects empirical observations rather than a definition. Sequence variants from HPV-5 and HPV-8 (11, 11a), HPV-6 and HPV-11 (27, 37), HPV-16 (19, 31, 32, 52), HPV-18, and HPV-45 (50) have been systematically studied. While the finding of sequence diversity among independent isolates is somewhat trivial, it is remarkable that no larger diversities were observed, i.e., the continuous process of substitutional alterations that has linked PV types in their evolutionary history has no complete record in today’s HPV isolates.

Old and new concepts about subtypes. The term subtype was
originally used for certain HPV isolates of a type that had restriction patterns different in Southern blots from those of the reference type. Point mutations in restriction sites can alter such patterns dramatically, and it was thought that these isolates might differ significantly elsewhere in their genomes. Subsequent sequence analysis of subtypes of HPV-5 (71) and HPV-6 (27, 37) revealed a sequence diversity of only about 1%, i.e., as little as that between variants. The long control regions of independent isolates of HPV-6 and HPV-11 can differ by small insertions and deletions (27, 37) which are probably derived from sequences with a propensity for generating slippage (27), but these isolates do not differ significantly elsewhere in the genome. Against this background, we recommend that the term subtype be abandoned for these isolates. However, it can be useful for describing isolates with distances of 0.020 to 0.100 from their reference PV types, i.e., a diversity that is lower than that among types and higher than that among variants. Surprisingly, subtypes in this new sense of the word have been very rarely encountered, and we are aware of only three examples at the nucleotide level, the pairs HPV-34–HPV-64 and HPV-44–HPV-55 and the HPV genome in the ME180 cell line in comparison with that for HPV-68 (15, 54).

PV types as species. The well-known “biological species”
concept of Mayr (43, 44) applies exclusively to sexually reproducing species. Less well-known is the fact that it is a special case of a broader concept, the “evolutionary species” (45), which explicitly includes asexual organisms. We take the definition of an evolutionary species to be a single lineage of individuals separated from other extant lineages and from their common ancestors by substantial genetic differences (58, 69).

Why is it important to identify the evolutionary species, and
to which level of the organization of PVs does it correspond? The species is the highest taxonomic level (i.e., the largest group of individuals) subject to evolutionary selectional pressures through direct competition and as such the highest group that will tend to evolve as a unit. Higher taxa are mere groupings of species that exist solely by virtue of being named (43).

We have previously implied that the PV type is the evolutionary species (3, 4). As such, the type identifies a population of viruses which may show minor internal variation and a large genetic discontinuity between types.

PV speciation and the basis of classification from molecular
sequences. Sexually replicating species achieve cohesiveness through a vertical bond between parents and offspring and a horizontal bond between mates. In the face of evolutionary novelties, cohesiveness of the species can be maintained, provided that mating bonds are not disrupted and gene flow occurs among mating populations. Asexual species lack the mating bond, and species cohesiveness depends on a mixture of stabilizing selection, stable environments, and genetic cohesiveness, whose molecular basis is the high fidelity of DNA replication and the low frequency of mutations (55). While evolutionary processes in sexual and asexual lineages undoubtedly differ, both reproductive modes result in lineages of organisms and hence are amenable to classification by the same principles (58).

It is a characteristic of today’s PVs, as with many other
organisms, that one observes related genetic entities (the species or types) separated from each other by genetic discontinuities. This may in principle result from either separation of lineages into distinct populations via geographic segregation or adoption of different ecological niches or, if mutation rates are high enough and levels of competition between lineages are low enough, random mutation and drift of lineages which eventually accumulate high levels of diversity. As an ancestral type diverged into distinct lineages, intermediate variants would tend to become extinct, either because they were not as fit or simply as the result of chance, leaving behind relatively isolated lineages. The many host-specific PVs and the adaptation to cutaneous or mucosal epithelial environments support the role of selection for lineages specifically adapted to different ecological niches, but the number of types occupying similar ecological niches with similar geographic distributions suggests that mutation and random extinction of lineages must have played a part in the development of the current range of PV types as well.

The operational definition of a PV type was arbitrary but
apparently felicitous, insofar as few borderline cases and no genome less than 10% from both of two previously identified types have been confirmed. In part, this observation can be explained on a statistical basis, because the evolution of sequences is, to a first approximation, governed by a molecular clock, so that mutations accumulate at roughly equal rates in different lineages. The consequence is that, in a system evolving in this fashion, distances back to a common ancestor will all be approximately equal, and if two sequences are more than a certain distance apart, the chance of a third sequence being less than that distance from both of the two previous sequences is vanishingly small. As this assumption seems to apply to PVs, it is likely that PV classifications at the type level will remain stable after future additions of novel genomes. As to the apparent scarcity of subtypes (defined as sequences between 2 and 10% different from a prototype sequence), it would be of interest to investigate whether the number of subtypes is less than would be expected under various models of evolution and population genetics.

PV groups and supergroups. Below the taxonomic level of
the subfamily, we have avoided the term genus, which might eventually be introduced after further study of PV genomic diversity. Instead, we use the neutral term supergroup, and in the following, specific characteristics of these supergroups and some isolated lineages will be addressed. Phenotypically, the recognition of supergroups A and B is supported by well-known differences in natural disease biology
and by gene regulation differences, like similar cis-responsive elements that are shared by types of the same supergroup (18, 63). However, both supergroups do contain some anomalies, which are discussed below.

Supergroup A. All HPV types traditionally called genital and
mucosal HPVs belong to supergroup A. Neither term is strictly correct, because typical genital HPVs (e.g., HPV-6) are also found in oral and laryngeal mucosas, and at genital sites, they also infect the cutaneous epithelia, e.g., the penile foreskin. Presently, it is not known whether any gene functions are unique to the genital HPV types. However, they do exhibit a common architecture in the long control region which is absent from unrelated PVs and which probably correlates with similar regulatory properties. Most conspicuously, there are an SP1 and two E2 binding sites immediately upstream of the TATA box that is involved in E6 transcription (63).

Within supergroup A, we distinguish 11 groups on the basis
of monophyly, high bootstrap scores (generally over 90%), and biological similarity. This classification highlights several interesting features of the supergroup’s evolution; in particular, there are three groups associated with cutaneous lesions. Members of group A2 are normally found in skin warts, and group A4 viruses exhibit a dual predilection for both cutaneous and mucosal surfaces (4, 15). Group A8 is an unusual mixture of the cutaneous butcher’s wart virus HPV-7 and two other mucosal types. These peculiar associations are also supported by nucleotide alignments of the E6, E1, and different parts of the L1 genes and therefore do seem to reflect natural evolutionary groupings (6, 7). The members of these groups may hold the secrets of how the mucosal-cutaneous distinction arose.

The groups A6, A7, and A9 include the most commonly
found high-risk types (4, 39), and these are the groups that have attracted the attention of most researchers. However, it is unfortunate that A7 and A9 have rather low bootstrap scores. As members of these groups are among the most widely studied, there is hope that a future synthesis of molecular functional data coupled with a comparative approach will help to make the relationships clearer. Most members of the remaining groups have been isolated from benign genital or oral lesions or from latent infections, but on occasion, they can also be found in malignant disease (15).

Previously, hierarchical associations of genital HPVs reflecting similar biologies have been proposed (65). The cutaneous-genital HPVs (groups A2 and A4) were claimed to associate with the HPV-6 lineage to form the low-risk types, while most other HPVs formed the high-risk types. Our reexamination of the same E6 data set does not support this model. E6 phylogenies constructed by distance matrix, parsimony, and maximum-likelihood methods and evaluated by bootstrapping reveal that the cutaneous-genital HPVs, the low-risk genital HPVs (e.g., group A10), and the high-risk genital HPVs form independent groups within supergroup A. There is no reliable nested hierarchy of clades for these groups and consequently no evidence of a nested hierarchy of virulence.

Supergroup B. EV is a rare skin disorder characterized by
disseminated wartlike lesions and skin cancers (for reviews, see references 25 and 33). The 17 HPV types originally isolated from EV patients form the majority in supergroup B. Two groups, B1 and B2, are well-defined, and they both also contain types that have not been found in EV-associated lesions. Group B2 is more atypical, as only HPV-50 is considered to be an EV virus, while the rest have been isolated from non-EV patients and from such lesions as common warts, flat warts, and squamous cell carcinomas (for a review, see reference 15). Nevertheless, the grouping also has the support of amino acid
sequence similarities throughout the entire genome (13). Group B1 is a large group of mainly EV HPVs, except for HPV-37, HPV-38, and HPV-49. The phylogenies are conflicting as to the internal group structure but do give a hint of subdivisions partially consistent with previous hybridization studies (38). In particular, there is agreement that HPV-9, HPV-15, HPV-17, HPV-22, HPV-23, HPV-37, and HPV-38 form a clade which also has a high bootstrap value of 93%, and the remaining types, excluding HPV-49, form another clade (bootstrap value, 91%). Because of the large distances between types in this group, it will be a prime candidate for splitting into smaller groups when the phylogenetic relationships are better understood. Recently, six new HPV genomes (2) have been detected by PCR in cutaneous cancers of renal transplant patients. A neighbor-joining analysis of a segment overlapping our 291-bp L1 region shows that they belong in group B1. Finally, it should be pointed out that EV patients frequently have flat warts associated with HPV-3 and HPV-10, two types that phylogeny assigns to the genital HPVs, although pathologically they are sometimes included among the EV HPVs.

Supergroups C and D. Each of these two supergroups reliably unites ungulate PVs, which have always been considered as distinct because of their unique genomic organization and distinctive pathologies, i.e., fibropapillomas (BPV-1, BPV-2, BPV-5, European elk PV, deer PV, and ovine PV) versus true papillomas (BPV-3, BPV-4, and BPV-6) (for references, see reference 34).

Supergroup E. In this supergroup are united the two fairly
closely related viruses HPV-1 and HPV-63 (17), together with the very distantly related type HPV-41. On the level of very large distances, these three HPV types form a clade together with canine oral PV and cottontail rabbit PV. We have excluded the most distantly related Mastomys natalensis PV from this supergroup. Future animal PV sequences may better define these relationships.

FPV. PVs appear to be widespread among vertebrate hosts
other than mammals, but only one PV sequence from a bird, the chaffinch, has been reported (46). To estimate the relationship of FPV to other PVs, we aligned the published sequence against homologous segments of typical representatives of the five supergroups, i.e., HPV-1, HPV-5, HPV-16, BPV-1, BPV-4, and Mastomys natalensis PV (data not shown). Nearly all dissimilarities between the FPV sequence and those of the other six PVs exceeded those among the six PVs. We take this to indicate that in spite of the limited interpretability of simple distances, FPV is less related to any of the mammalian PVs than they are related to one another.

Taxonomic flexibility. This taxonomic scheme is an outline
and does not involve the full spectrum of activities traditionally associated with erecting a taxonomy, e.g., provision of identification keys. However, it does show how one may organize PVs by cladistic principles while giving recognition to levels of biomedical interest. It is an expression of our best phylogenetic hypotheses and should not be seen as an unchangeable convention but rather as a tool to inspire research to change it for the better. The proposed names do not mean that other collective names (e.g., high-risk group) are discouraged; the taxonomic convention permits these so long as they are useful and it is understood that they have no formal taxonomic status within a cladistic classification. Additional classificatory levels such as genus and subgroup may be erected when needed. We hope that an increasingly comparative approach aided by the HPV database (48) will speed us toward the goal of an increasingly informative taxonomy.

Host-PV evolution. All known PVs have been detected in a
single mammalian or bird host species (61), and no interspecies transfer has yet been reported. This host specificity and the inapparent or benign nature of most PV infections suggest that these viruses are extremely well-adapted parasites. We had previously suggested that this could indicate host-PV coevolution (7). A reexamination of the phylogeny and arguments derived from plant-insect studies (36) now indicates that it does not accord with coevolution in the strictest sense of the word (26, 35) for the following reasons. (i) The virus phylogeny does not mirror the host phylogeny accurately (Fahrenholz’s rule), e.g., in supergroups A and C and in the dissimilarity between supergroups A and B compared with the other supergroups. (ii) In hosts harboring many different types of well-adapted parasites, e.g., humans and cattle and their PVs, the selection pressure exerted by any one kind of parasite on the host is negligible by comparison with what the host can exert on the parasites. Thus, PV evolution is more likely to be dominated by unilateral host selection (sequential evolution [55]) rather than by mutual coevolution, as it has to adjust to the molecular mechanism of the host cell (56).

The presence of three monkey PVs closely related to HPVs
of supergroup A is an interesting observation that may be explained as a consequence of infection of monkey-human ancestors by the PV ancestors of supergroup A. Given that HPV-16 and HPV-18 variants indicate a very slow rate of evolution (as little as one mutation in 300 bp over several thousands of years) and assuming a molecular clock (72), supergroup A radiation could have overlapped that of its primate hosts. Alternatively, interpretation of nonsynonymous- to synonymous-change plots, especially with reference to the HPV-13–pigmy chimpanzee PV comparison, suggests that the closeness could also be the result of cross-species transmission (48). PV evolution would then have parallels to the evolutionary history of the human immunodeficiency virus-simian immunodeficiency virus group (47). Although reports are lacking, it may be possible to test for PV viability and gene function following experimental infection of heterologous hosts. Should there be no cross-species viability, one might then attribute an even greater age to the affinities between animal PVs and HPVs in supergroup E and all other PVs, an age possibly similar to that of land vertebrates, i.e., about 200 million years. But whatever the opinions, there is certainly a need for the sampling and study of many more animal PVs.

Future directions. The establishment of the HPV sequence
database at the National Laboratory in Los Alamos by the National Institutes of Health (48) should give a boost to comparative studies. Research done there will be published annually in the form of a compendium that will be sent to interested scientists and that will provide updates on the relationships of newly typed PVs. One interesting project will be a search for correlations between amino acid and nucleotide sequence elements common to subgroups of PVs which may shed light on the molecular basis of PV biology and pathology.

In the search for novel HPVs, one probably should not
expect many more genital PVs, as most of the members of this group may have been found (5). However, the present phylogenetic isolation of HPV-1, HPV-41, and HPV-63 may indicate that many cutaneous HPV types (17, 30) are still to be found, and the recent identification of new EV-related viruses (2, 57) seems to prove this. Aside from this, it is apparent that future progress in the understanding of PV evolution will require the discovery of large numbers of novel animal PV genomes, e.g., genital PVs from primates or nonprimate mammals.

On a potentially even deeper level of phylogenetic branching, it would be of interest to examine the relationship between PVs and polyomaviruses, which form subfamilies within the family Papovaviridae. They had been grouped together because
they are the only viruses with double-stranded, covalently closed, circular DNA and with similar capsid morphologies. A monophyletic origin is rather doubtful because of the vast differences in genome organization and lack of sequence similarities. Only two short amino acid sequence similarities, which may be homologous, are known: (i) the binding site for the retinoblastoma protein of the simian virus 40 T antigen and of some HPV E7 proteins and (ii) a segment of the simian virus 40 T antigen that can be aligned with PV E1 proteins (8). These similarities possibly emerged through convergence, but even if they did reflect a common ancestry of protein domains, they may be taxonomically irrelevant, as they may be ancestral characters shared with eukaryotic or adenovirus proteins (12, 51). Evidence for shared derived characters could come from studies of three-dimensional structures. For example, one could speculate that the PV capsid proteins may have domains that interact with one another and with DNA in a manner similar to that of the domains of polyomavirus capsid proteins or that the E1 and E2 proteins might have DNA binding domains homologous to those of the polyoma T antigen. But on the strength of the present evidence, polyomaviruses and PVs should be considered separate families.
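
The molecular-clock argument made earlier in this Discussion for the stability of type-level classification can be stated more explicitly. The following is a sketch under the idealized assumption of a strict clock, in which pairwise distances are ultrametric; the symbols $d$, $a$, $b$, $c$, and the threshold $t$ are notation introduced here for illustration and do not appear in the original analysis.

```latex
% Under a strict molecular clock, pairwise distances are ultrametric:
\[
  d(x,z) \;\le\; \max\{\, d(x,y),\; d(y,z) \,\}
  \qquad \text{for all sequences } x, y, z .
\]
% Let a and b be two established types with d(a,b) > t (for example t = 0.10),
% and suppose a new genome c satisfied d(a,c) < t and d(b,c) < t. Then
\[
  d(a,b) \;\le\; \max\{\, d(a,c),\; d(c,b) \,\} \;<\; t ,
\]
% which contradicts d(a,b) > t. Hence no such genome c can exist.
```

Real PV distances are only approximately clock-like, so the conclusion holds as a strong tendency rather than a strict rule, which is consistent with the statement that few borderline cases have been confirmed.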

ACKNOWLEDGMENTS

We thank Shih-Ping Chan, Department of Mathematics, National University of Singapore, for computer assistance; Chiew-Hoon Tan for technical support; Gerald Myers, Theoretical Division, Los Alamos National Laboratory, for enlightening discussions; and the two reviewers for their invaluable suggestions.

A.L.H. is a postdoctoral fellow sponsored by the Division of Microbiology and Infectious Diseases of the National Institute of Allergy and Infectious Diseases through an interagency agreement with the Los Alamos National Laboratory.

REFERENCES

1. Banks, L., and G. Matlashewski. 1993. Cell transformation and the HPV E5 gene. Papillomavirus Rep. 4:1–4.

2. Berkhout, R. J. M., L. M. Tieben, H. L. Smits, J. N. Bouwes Bavinck, B. J. Vermeer, and J. ter Schegget. 1995. Nested PCR approach for detection and typing of epidermodysplasia verruciformis-associated human papillomavirus types in cutaneous cancers from renal transplant recipients. J. Clin. Microbiol. 33:690–695.

3. Bernard, H. U. 1994. Coevolution of papillomaviruses with human populations. Trends Microbiol. 2:140–143.

4. Bernard, H. U., S. Y. Chan, and H. Delius. 1994. Evolution of papillomaviruses. Curr. Top. Microbiol. Immunol. 186:33–54.

5. Bernard, H. U., S. Y. Chan, M. M. Manos, C. K. Ong, L. L. Villa, H. Delius, C. L. Peyton, H. M. Bauer, and C. M. Wheeler. 1994. Identification and assessment of known and novel human papillomaviruses by PCR amplification, restriction fragment length polymorphism, nucleotide sequence, and phylogenetic algorithms. J. Infect. Dis. 170:1077–1085.

6. Chan, S.-Y. 1992. Ph.D. thesis. National University of Singapore, Singapore.

7. Chan, S.-Y., H.-U. Bernard, C.-K. Ong, S.-P. Chan, B. Hofmann, and H. Delius. 1992. Phylogenetic analysis of 48 papillomavirus types and 28 subtypes and variants: a showcase for the molecular evolution of DNA viruses. J. Virol. 66:5714–5725.

8. Clertant, P., and I. Seif. 1984. A common function of polyoma virus large-T and papillomavirus E1 proteins. Nature (London) 311:276–279.

9. Coggin, J. R., and H. zur Hausen. 1979. Workshop on papillomaviruses and cancer. Cancer Res. 39:545–546.

10. Danos, O., and M. Yaniv. 1987. E6 and E7 gene products evolved by amplification of a 33-amino-acid peptide with a potential nucleic-acid-binding structure, p. 145–150. In B. M. Steinberg, J. L. Brandsma, and L. B. Taichman (ed.), Papillomaviruses (cancer cells 5). Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

11. Deau, M.-C., M. Favre, S. Jablonska, L.-A. Rueda, and G. Orth. 1993. Genetic heterogeneity of oncogenic human papillomavirus type 5 (HPV5) and phylogeny of HPV5 variants associated with epidermodysplasia verruciformis. J. Clin. Microbiol. 31:2918–2926.

11a. Deau, M. C., M. Favre, and G. Orth. 1991. Genetic heterogeneity among human papillomaviruses (HPV) associated with epidermodysplasia verruciformis: evidence for multiple allelic forms of HPV-5 and HPV-8 E6 genes. Virology 184:492–503.

12. Defeo-Jones, D., P. S. Huang, R. E. Jones, K. M. Haskell, G. A. Vuocolo, M. G. Hanobik, H. E. Huber, and A. Oliff. 1991. Cloning of cDNAs for cellular proteins that bind to the retinoblastoma gene product. Nature (London) 352:251–254.

13. Delius, H., and B. Hofmann. 1994. Primer-directed sequencing of human papillomavirus types. Curr. Top. Microbiol. Immunol. 186:13–31.

14. de Villiers, E.-M. 1989. Heterogeneity of the human papillomavirus group. J. Virol. 63:4898–4903.

15. de Villiers, E.-M. 1994. Human pathogenic papillomavirus types: an update. Curr. Top. Microbiol. Immunol. 186:1–12.

16. Doorbar, J. 1991. An emerging function for E4. Papillomavirus Rep. 2:145–147.

17. Egawa, K., H. Delius, T. Matsukura, M. Kawashima, and E. M. de Villiers. 1993. Two novel types of human papillomavirus, HPV 63 and HPV 65: comparisons of their clinical and histological features and DNA sequences to other HPV types. Virology 194:789–799.

18. Ensser, A., and H. Pfister. 1990. Epidermodysplasia verruciformis associated human papillomaviruses present a subgenus-specific organisation of the regulatory genome region. Nucleic Acids Res. 18:3919–3922.

19. Eschle, D., M. Durst, J. ter Meulen, J. Luande, H. C. Eberhardt, M. Pawlita, and L. Gissmann. 1992. Geographical dependence of sequence variation in the E7 gene of human papillomavirus type 16. J. Gen. Virol. 73:1829–1832.

20. Faulkner, D. V., and J. Jurka. 1988. Multiple Aligned Sequence Editor (MASE). Trends Biochem. Sci. 13:321–322.

21. Felsenstein, J. 1982. Numerical methods for inferring evolutionary trees. Q. Rev. Biol. 57:379–404.

22. Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

23. Genetics Computer Group Wisconsin USA. 1991. Program manual of the GCG package, version 7.

24. Giri, I., and M. Yaniv. 1988. Structural and mutational analysis of E2 trans-activating proteins of papillomaviruses reveals three distinct functional domains. EMBO J. 7:2823–2829.

25. Grußendorf-Conen, E. I. 1987. Papillomavirus-induced tumors of the skin: cutaneous warts and epidermodysplasia verruciformis, p. 158–181. In K. Syrjaenen, L. Gissmann, and L. G. Koss (ed.), Papillomaviruses and human disease. Springer Verlag, Berlin.

26. Hafner, M. S., and S. A. Nadler. 1988. Phylogenetic trees support the coevolution of parasites and hosts. Nature (London) 332:258–259.

27. Heinzel, P., S.-Y. Chan, L. Ho, M. O’Connor, P. Balaram, M. Saveria-Campo, K. Fujinaga, N. Kiviat, J. Kuypers, H. Pfister, B. M. Steinberg, S.-K. Tay, L. L. Villa, and H. U. Bernard. Variation of human papillomavirus type 6 (HPV-6) and HPV-11 genomes sampled throughout the world. Submitted for publication.

28. Hennig, W. 1966. Phylogenetic systematics. University of Illinois Press, Urbana.

29. Hillis, D. M., J. P. Huelsenbeck, and C. W. Cunningham. 1994. Application and accuracy of molecular phylogenies. Science 264:671–677.

30. Hirt, L., A. Hirsch-Behnam, and E. M. de Villiers. 1990. Nucleotide sequence of human papillomavirus (HPV) type 41: an unusual HPV type without a typical binding site consensus sequence. Virus Res. 18:179–190.

31. Ho, L., S.-Y. Chan, R. D. Burk, B. C. Das, K. Fujinaga, J. P. Icenogle, T. Kahn, N. Kiviat, W. Lancaster, P. Mavromara-Nazos, V. Labropoulou, S. Mitrani-Rosenbaum, B. Norrild, M. R. Pillai, J. Stoerker, K. Syrjaenen, S. Syrjaenen, S.-K. Tay, L. L. Villa, C. M. Wheeler, A.-L. Williamson, and H.-U. Bernard. 1993. The genetic drift of human papillomavirus type 16 is a means of reconstructing prehistoric viral spread and the movement of ancient human populations. J. Virol. 67:6413–6423.

32. Icenogle, J. P., P. Sathya, D. L. Miller, R. A. Tucker, and W. E. Rawls. 1991. Nucleotide and amino acid sequence variation in the L1 and E7 open reading frames of human papillomavirus type 6 and type 16. Virology 184:101–107.

33. Jablonska, S., and S. Majewski. 1994. Epidermodysplasia verruciformis: immunological and clinical aspects. Curr. Top. Microbiol. Immunol. 186:157–175.

34. Jackson, M. E., W. D. Pennie, R. E. McCaffery, K. T. Smith, J. Grindlay, and M. S. Campo. 1991. The B group bovine papillomaviruses lack an identifiable E6 open reading frame. Mol. Carcinog. 4:382–387.

35. Janzen, D. H. 1980. When is it coevolution? Evolution 34:611–612.

36. Jermy, T. 1984. Evolution of insect/host relationships. Am. Nat. 124:609–630.

37. Kitasato, H., H. Delius, H. zur Hausen, K. Sorger, F. Rosl, and E.-M. de Villiers. 1994. Sequence rearrangements in the upstream regulatory region of human papillomavirus type 6: are these involved in malignant transition? J. Gen. Virol. 75:1157–1162.

38. Kremsdorf, D., S. Jablonska, M. Favre, and G. Orth. 1982. Biochemical characterization of two types of human papillomaviruses associated with epidermodysplasia verruciformis. J. Virol. 43:436–447.

39. Lorincz, A. T., R. Reid, A. B. Jenson, M. D. Greenberg, W. D. Lancaster, and R. J. Kurman. 1992. Human papillomavirus infection of the cervix: relative risk associations of 15 common anogenital types. Obstet. Gynecol. 79:328–337.

40. Lowy, D. R., R. Kirnbauer, and J. T. Schiller. 1994. Genital human papillomavirus infection. Proc. Natl. Acad. Sci. USA 91:2436–2440.

41. Maddison, W. P., and D. R. Maddison. 1992. MacClade: analysis of phylogeny and character evolution. Sinauer, Sunderland, Mass.

42. Manos, M. M., Y. Ting, D. K. Wright, A. J. Lewis, T. R. Broker, and S. M. Wolinsky. 1989. The use of polymerase chain reaction amplification for the detection of genital human papillomaviruses. Cancer Cells 7:209–214.

43. Mayr, E. 1942. Systematics and the origin of species. Columbia University Press, New York.

44. Mayr, E. 1963. Animal species and evolution. Harvard University Press, Cambridge, Mass.

45. Meglitsch, P. A. 1954. On the nature of the species. Syst. Zool. 3:49–65.

46. Moreno-Lopez, J., H. Ahola, A. Stenlund, A. Osterhaus, and U. Pettersson. 1984. Genome of an avian papillomavirus. J. Virol. 51:872–875.

47. Moriyama, E. N., and T. Gojobori. 1991. Molecular evolution of human and simian immunodeficiency viruses, p. 291–301. In M. Kimura and N. Takahata (ed.), New aspects of the genetics of molecular evolution. Japan Scientific Societies Press, Tokyo.

48. Myers, G., H. U. Bernard, H. Delius, M. Favre, J. Icenogle, M. van Ranst,and C. Wheeler (ed.). 1994. Human papillomaviruses 1994. A compilationand analysis of nucleic and amino acid sequences. Los Alamos NationalLaboratory, Los Alamos, N.Mex.

49. Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. Personal communication.

50. Ong, C.-K., S.-Y. Chan, M. S. Campo, K. Fujinaga, P. Mavromara-Nazos, V. Labropoulou, H. Pfister, S.-K. Tay, J. ter Meulen, L. L. Villa, and H.-U. Bernard. 1993. Evolution of human papillomavirus type 18: an ancient phylogenetic root in Africa and intratype diversity reflect coevolution with human ethnic groups. J. Virol. 67:6424–6431.

51. Phelps, W. C., C. L. Lee, K. Munger, and P. M. Howley. 1988. The human papillomavirus type 16 E7 gene encodes transactivation and transformation functions similar to those of adenovirus E1A. Cell 53:539–547.

52. Pushko, P., S. Toshiyuki, J. Cuzick, and L. Crawford. 1994. Sequence variation in the capsid protein genes of human papillomavirus type 16. J. Gen. Virol. 75:911–916.

53. Reszka, A. A., J. P. Sundberg, and M. E. Reichmann. 1991. In vitro transformation and molecular characterization of colobus monkey venereal papillomavirus DNA. Virology 181:787–792.

54. Reuter, S., H. Delius, T. Kahn, B. Hofmann, H. zur Hausen, and E. Schwarz. 1991. Characterization of a novel papillomavirus DNA in the cervical carcinoma cell line ME180. J. Virol. 65:5564–5568.

55. Ridley, M. 1993. Evolution. Blackwell Scientific, Cambridge, Mass.

56. Shadan, F. F., and L. P. Villareal. 1993. Coevolution of persistently infecting small DNA viruses and their hosts linked to host-interactive regulatory domains. Proc. Natl. Acad. Sci. USA 90:4117–4121.

57. Shamanin, V., M. Glover, C. Rausch, C. Proby, I. M. Leigh, H. zur Hausen, and E.-M. de Villiers. 1994. Specific types of human papillomavirus found in benign proliferations and carcinomas of the skin in immunosuppressed patients. Cancer Res. 54:4610–4613.

58. Simpson, G. G. 1961. Principles of animal taxonomy. Columbia University Press, New York.

59. Sneath, P. H. A., and R. R. Sokal. 1973. Numerical taxonomy. Freeman, San Francisco.

60. Stewart, C.-B. 1993. The powers and pitfalls of parsimony. Nature (London) 361:603–607.

61. Sundberg, J. P. 1987. Papillomavirus infections in animals, p. 40–103. In K. Syrjaenen, L. Gissmann, and L. G. Koss (ed.), Papillomaviruses and human disease. Springer Verlag, Berlin.

62. Swofford, D. L. 1993. Phylogenetic Analysis Using Parsimony: PAUP, version 3.1. Illinois Natural History Survey, Champaign, Ill.

63. Tan, S.-H., L. E.-C. Leong, P. A. Walker, and H.-U. Bernard. 1994. The human papillomavirus type 16 E2 transcription factor binds with low cooperativity to two flanking sites and represses the E6 promoter through displacement of Sp1 and TFIID. J. Virol. 68:6411–6420.

64. van den Brule, A. J. C., P. J. F. Snijders, C. J. L. M. Meijer, and J. M. M. Walboomers. 1993. PCR-based detection of genital HPV genotypes: an update and future perspectives. Papillomavirus Rep. 4:95–99.

65. van Ranst, M., J. B. Kaplan, and R. D. Burk. 1992. Phylogenetic classification of human papillomaviruses: correlation with clinical manifestations. J. Gen. Virol. 73:2653–2660.

66. van Ranst, M. A., R. Tachezy, and R. D. Burk. 1994. Human papillomavirus sequences: what is in stock? Papillomavirus Rep. 5:65–75.

67. van Ranst, M. A., R. Tachezy, J. Pruss, and R. D. Burk. 1992. Primary structure of the E6 protein of Micromys minutus papillomavirus and Mastomys natalensis papillomavirus. Nucleic Acids Res. 20:2889.

68. Wheeler, C. M. Personal communication.

69. Wiley, E. O. 1978. The evolutionary species concept reconsidered. Syst. Zool. 27:17–26.

70. Wiley, E. O. 1981. Phylogenetics. John Wiley & Sons, New York.

71. Yabe, Y., A. Sakai, T. Hitsumoto, H. Kato, and H. Ogura. 1991. A subtype of human papillomavirus 5 (HPV-5b) and its subgenomic segment amplified in carcinoma: nucleotide sequences and genomic organizations. Virology 183:793–798.

72. Zuckerkandl, E., and L. Pauling. 1962. Molecular disease, evolution, and genetic heredity, p. 189–225. In M. Kasha and B. Pullman (ed.), Horizons in biochemistry. Academic Press, New York.

73. zur Hausen, H. 1991. Viruses in human cancers. Science 254:1167–1173.

