Molecular population genetic analysis of the enn subdivision of group A streptococcal emm-like...


Molecular Microbiology (1995) 15(6), 1039-1048

Molecular population genetic analysis of the enn subdivision of group A streptococcal emm-like...

Adrian M. Whatmore,1† Vivek Kapur,2‡ James M. Musser2 and Michael A. Kehoe1*
1Department of Microbiology, University of Newcastle upon Tyne, Medical School, Framlington Place, Newcastle upon Tyne NE2 4HH, UK.
2Section of Molecular Pathobiology, Department of Pathology, Baylor College of Medicine, Houston, Texas 77030, USA.

Summary

The group A streptococcal emm-like genes, which encode the cell-surface M and M-like proteins, are divided into distinct mrp, emm and enn subdivisions and are clustered together in a region of the chromosome called the vir regulon. In order to understand the mechanisms involved in the evolution of emm-like genes, a 180 bp fragment of the 5' variable region of the enn gene was characterized in 31 strains for which emm sequences and multilocus enzyme electrophoretic profiles have been previously determined. The results demonstrate that nucleotide polymorphisms at the enn locus are generated predominantly by point mutations and short deletions or insertions, and that variation among enn and emm genes has arisen by similar mechanisms. However, diversity at the enn locus is restricted in comparison to the emm locus. Moreover, there is strong evidence for intragenic recombination at the enn locus and the pattern of distribution of emm and enn alleles among strains suggests that these genes may be independently acquired by horizontal transfer and recombination from distinct donor strains, thereby generating a mosaic structure for the vir regulon. The results add to a growing body of evidence that horizontal gene transfer has played a major role in the evolution of Streptococcus pyogenes vir regulons.

Received 30 August, 1994; revised 28 November, 1994; accepted 30 November, 1994. Present addresses: †Research Division, CAMR, Porton Down, Salisbury SP4 0JG, UK; ‡Department of Veterinary Pathobiology, University of Minnesota, St. Paul, Minnesota 55108, USA. *For correspondence. Tel. (091) 2228143; Fax (091) 2227736.

Introduction

The emm-like genes of group A streptococci (Streptococcus pyogenes) encode a family of structurally related proteins that form cell-wall-associated fibrils with their N-terminal ends extending outwards from the cell (reviewed by Kehoe, 1994). Published emm-like gene sequences share highly conserved 5' and 3' regions, corresponding, respectively, to N-terminal signal peptides and to the C-terminal wall-associated regions of M-like proteins. However, the sequences corresponding to the protruding N-terminal halves of the mature proteins are highly variable. Absorbed typing-sera against these cell-surface proteins divide S. pyogenes strains into c. 100 distinct M types, and immunity to S. pyogenes infections in humans has been associated with the production of opsonic, type-specific anti-M protein antibodies (Lancefield, 1962; Robinson and Kehoe, 1992). For many decades it was assumed that individual M types of S. pyogenes express a single type-specific protective M protein antigen, with fibrinogen-binding and antiphagocytic properties (Lancefield, 1962; Fischetti, 1989). However, it is now clear that this was an oversimplification. Most strains possess multiple emm-like genes, located adjacent to each other at a chromosomal locus called the vir regulon, where they are flanked by the virR (mry) and scpA genes (Fig. 1). S. pyogenes M types are traditionally divided into OF+ and OF− groups based on expression of a poorly characterized serum-opacity factor (OF) (Johnson and Kaplan, 1993). Polymerase chain reaction (PCR) and hybridization studies have found that OF+ strains possess a triplet of emm-like genes between virR and scpA, but the structure of the vir regulon varies among OF− strains (Fig. 1).

The emm-like genes are divided into distinct mrp, emm and enn subdivisions, reflecting characteristic differences in their conserved 5' and 3' regions and their relative positions in vir regulons (Fig. 1). In strains that contain only a single emm-like gene, this gene invariably belongs to the emm subdivision (Fig. 1), and cloned emm5 and emm6 genes have been demonstrated directly to express antiphagocytic M proteins in streptococci (Poirier et al., 1989; Scott et al., 1986). However, even among the limited number of strains that have been studied in detail

1040 A. M. Whatmore, V. Kapur, J. M. Musser and M. A. Kehoe

[Figure 1: diagram of vir regulon structures in OF-positive and OF-negative M types, showing the arrangement of emm-like genes (emm, enn).]

Fig. 1. vir regulon structures. Subdivisions of emm-like genes are described by the generic terms mrp, emm and enn, following the simplified nomenclature previously proposed instead of the wide variety of alternative gene designations that have been used in the literature for alleles of these genes (Whatmore et al., 1994). The three basic vir regulon structures depicted in the diagram are based on a compilation of sequence data for a limited number of well-characterized strains, but PCR and hybridization studies have indicated that these structures are representative of the vast majority of strains examined to date. However, it should be noted that minor variations on these structures, such as increased spacing between emm and enn genes, between the enn and scpA genes, or duplications of particular genes, have been found in a minority of strains. The data are compiled from the following papers: Hollingshead et al., 1986; Robbins et al., 1987; Miller et al., 1988; Mouw et al., 1988; Frithz et al., 1989; Heath and Cleary, 1989; Haanes and Cleary, 1989; Gomi et al., 1990; Simpson et al., 1990; Manjula et al., 1991; Podbielski et al., 1991; Bessen and Fischetti, 1992; Jeppson et al., 1992; O'Toole et al., 1992; Heden and Lindahl, 1993; Podbielski, 1993; Podbielski et al., 1993a; Podbielski et al., 1993b; Hollingshead et al., 1993; Hollingshead et al., 1994; Whatmore and Kehoe, 1994. References cited above include direct evidence that the emm gene products from M types 1, 2, 5, 6, 12, 24, 49 and 57 are recognized by the corresponding M typing sera, but this has not yet been confirmed for other M types. The figure is adapted from Kehoe (1994).

to date, there are differences in the functional properties of proteins encoded by the same subdivision of emm-like genes. For example, emm4 (arp4) encodes an IgA Fc-binding protein that does not bind fibrinogen, emm12 encodes a protein that binds both fibrinogen and human IgG Fc-domains, and the emm5 gene product binds fibrinogen but neither non-immune IgG nor IgA (Lindahl and Akerstrom, 1989; Retnoningrum et al., 1993; Whatmore and Kehoe, 1994). Furthermore, although there is direct evidence that the M type-specificity of several serotypes is determined by their emm gene products (Fig. 1), partial 5' emm sequences from a number of strains designated as distinct M types have been found to be identical (Whatmore et al., 1994). This raises the possibility that in some strains type-specificity, and consequently type-specific immunity, might be determined by an mrp or enn gene product, or by more than one M-like protein.

We have recently studied variation in the 5' regions of emm genes in 79 different M types of S. pyogenes and compared this to estimates of overall genomic relationships among these strains (Whatmore et al., 1994). These studies found a pattern of extensive variation among emm genes and revealed that many of these genes have been transferred horizontally between divergent S. pyogenes strains. The evolution of vir regulons would be accelerated if strains could acquire different combinations of emm-like genes by horizontal transfer of mrp or enn, as well as emm genes, from different donor strains. A recent report of non-congruency in the hybridization of different mrp and enn gene probes to DNA from 21 different M types suggested that this might be the case (Bessen and Hollingshead, 1994). However, in the absence of information on phylogenetic relationships among the hybridizing sequences and overall chromosomal relationships among the strains studied, the observed hybridization patterns could be explained equally well by vertical divergence (Bessen and Hollingshead, 1994) and/or convergence in response to selective pressures. Therefore, to test the above hypothesis, we characterized the DNA sequences of the 5' regions of enn genes from 31 distinct M types of S. pyogenes and compared these to variation in 5' emm sequences and overall chromosomal divergence among these strains.

Results and Discussion

Isolation and sequencing of enn genes

PCR reactions employing primers corresponding to conserved 5' and 3' sequences that are characteristic of the enn subdivision of emm-like genes (see the Experimental procedures) amplified a product from all 35 OF+ M types tested and from 25/35 OF− M types (Table 1). Sequences of 200-300 bp at the 5' ends of PCR products from 22 OF+ and nine OF− strains were determined (Table 1) and in all cases these sequences were distinct from the previously described emm genes in these strains (Whatmore et al., 1994). Moreover, the PCR products from the M2 and M4

Streptococcus pyogenes enn genes 1041

Table 1. Group A streptococcal strains* and GenBank accession numbers of corresponding enn sequences.

M type       Strain        Source(a)     Accession number
2            NCTC-8322     E. Beachey    X61276 (e)
4            SS-241        E. Beachey    [illegible] (e)
14           NCTC8199      E. Beachey    U20834
15           32-1          E. Beachey    U20833
22           NCTC-8330     NCTC          U20831
25           NCTC-8306     NCTC          U20851
27           NCTC-8328     NCTC          U20850
28           NCTC-8308     E. Beachey    U20849
36           NCTC-8227     E. Beachey    U20847
44           M44-PHLS      PHLS          U20826
46           NCTC-8230     E. Beachey    U20825
47           NCTC-8232     NCTC          U20839
50           NCTC-10868    NCTC          U20838
58           M58-PHLS      PHLS          U20836
60           NCTC-10378    NCTC          U20835
61           M61-PHLS      PHLS          U20844
62           NCTC-10880    NCTC          U20843
64           M64-PHLS      PHLS          U20841
65           M65-PHLS      PHLS          U20840
68           R80/2076      PHLS          U20856
72           M72-PHLS      PHLS          U20855
74           M74-PHLS      PHLS          U20854 (enn74.1), U20853 (enn74.2)
76           M76-PHLS      PHLS          U20852
77           M77-PHLS      PHLS          U20846
80           JV180-PHLS    PHLS          U20845
PT4245 (d)   R90/3355      PHLS          U20827
PT180 (d)    180-PHLS      PHLS          U20832
PT2841 (d)   2841-PHLS     PHLS          U20848
PT3875 (d)   3875-PHLS     PHLS          U20829
TR2233 (d)   2233-PHLS     PHLS          U20830
PT5757 (d)   5757-PHLS     PHLS          U20837

*Among the strains described by Whatmore et al. (1994), enn PCR products were isolated, but not sequenced, from the strains that correspond to M types 11, 13, 18, 30, 33, 41, 43, 52, 53, 54, 56, 48, 59, 66, 73, 75, 78, 81, PT4931, Potter C, TR2612, TR2631, and PT1658. No enn PCR product was isolated from the strains that correspond to M types 3, 5 (Manfredo), 6, 12, 17, 19, 26, 29, 37, 39, 55.

a. Source NCTC refers to strains purchased from the National Collection of Type Cultures, Colindale, UK, and PHLS to strains generously provided by Ms A. Tanna, Streptococcal Reference Laboratory, Public Health Laboratory Service, Colindale, UK. Other strains were provided by the late Professor Ed Beachey and are from the Memphis VA Hospital culture collection, Memphis, USA.
b. The M serotypes of these strains were confirmed by Ms A. Tanna, PHLS, Colindale, UK.
c. These strains were designated solely as a particular M type.
d. Provided as provisional new M types.
e. These accession numbers are for previously published sequences that are identical to the sequences determined in this study.
f. These sequences were determined by direct sequencing of gel-purified PCR products.

strains are identical to the previously described enn2 (emmL2.2; Bessen and Fischetti, 1992) and enn4 (Jeppson et al., 1992) genes, respectively. Thus, the PCR products isolated in this study were designated ennX, where 'X' represents the M type of the parent strain.

In most cases, sequences were determined directly on gel-purified PCR products, but for some OF+ and most OF− strains this resulted in ambiguous data and PCR products were cloned into a plasmid vector prior to sequencing (Table 1). The product amplified from one strain (M74) was found to consist of two distinct fragments of identical size, which were designated enn74.1 and enn74.2. Multiple enn sequences were not identified among the cloned products from other strains, but since only six individual recombinants were examined in each case the possible existence of more than one distinct enn gene in other strains cannot be completely ruled out. The 5' ends of all of the enn sequences are almost identical to the conserved signal-peptide-encoding regions of the previously described enn genes, facilitating sequence alignments and the prediction of potential open reading frames (ORFs).

Phylogenetic relationships: restricted enn sequence variation and recombination between enn genes

For analysis of relationships, all sequences were reduced to a core of 180 nucleotides, corresponding to amino acid


[Figure 2: dendrogram of 5' enn sequences with bootstrap values at nodes; major clusters are labelled OF-POS or OF-NEG, with enn15 marked OF-NEG and enn65 and enn80 marked OF-POS as exceptions within their clusters.]

Fig. 2. Dendrogram estimating phylogenetic relationships inferred from 180 bp 5' enn sequences. Aligned sequences were analysed for phylogenetic relationships using parsimony, with the computer program PAUP 3.1 (Swofford, 1991), and the significance of the branching order was tested with the 'bootstrap' method. The numbers adjacent to nodes represent the proportion (%) of 2000 'bootstrap' trees that contain the sequences to the right as a monophyletic group. Numbers are not specified at nodes where the 'bootstrap' values were less than 50%. Branches having a maximum length of zero (representing different taxa with identical enn sequences) were collapsed to yield polytomies. Clusters are labelled A to I at major branch points and designations OF-NEG and OF-POS listed below major branch points indicate the opacity-factor phenotypes of either all (clusters A, B, D, E, H and I) or the majority (clusters C, F and G) of strains within the corresponding cluster. In the latter case, aberrant OF phenotypes of a minority of strains are noted adjacent to the M type. The OF phenotypes of the strains included in this study were determined previously (Whatmore et al., 1994). It is emphasised that while the figure provides a visual aid to observing the genetic distances between sequences, the occurrence of horizontal gene transfer (see the Results and Discussion) means that the exact evolutionary pathways remain uncertain.

residues −12 to +48 from the predicted signal-peptidase cleavage sites in the deduced amino acid sequences. In addition to the sequences determined in this study, the analysis included the corresponding regions of the previously described enn49 (ennX; Haanes and Cleary, 1989), enn1 (prtH; Gomi et al., 1990) and enn5 (enn5.8193; Whatmore and Kehoe, 1994) genes. Figure 2 presents phylogenetic relationships determined by parsimony analysis (Swofford and Olsen, 1990) and the numbers adjacent to nodes indicate the significance of clustering evaluated by the 'bootstrap' procedure (Felsenstein, 1988).
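The bootstrap idea used to evaluate clustering can be illustrated with a small sketch. It resamples alignment columns with replacement, as the bootstrap does, but it is not the parsimony analysis performed with PAUP: as a simplifying assumption, support is scored as the percentage of replicates in which two chosen sequences are strictly closer to each other (by number of differing sites) than either reference sequence is to any other. All function names and the toy alignment are hypothetical.

```python
import random


def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def resample_columns(alignment, rng):
    """Build one bootstrap replicate by drawing alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def nearest_neighbour_support(alignment, a, b, replicates=2000, seed=0):
    """Percentage of replicates in which b is strictly the closest sequence to a.

    This is a simplified stand-in for monophyly in a parsimony tree.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        d_ab = hamming(rep[a], rep[b])
        others = [hamming(rep[a], rep[o]) for o in rep if o not in (a, b)]
        if all(d_ab < d for d in others):
            hits += 1
    return 100.0 * hits / replicates


if __name__ == "__main__":
    # Toy alignment, not data from the study.
    toy = {
        "seqA": "ACGTACGTACGTACGTACGT",
        "seqB": "ACGTACGTACGTACGTACGA",
        "seqC": "TTGTACCTACGAACGTTCGT",
    }
    print(nearest_neighbour_support(toy, "seqA", "seqB", replicates=200))
```

A value near 100 means the pairing survives almost every resampling of sites, which is the intuition behind reporting only nodes with bootstrap values of 50% or more.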

It is clear from Fig. 2 that there is a strong correlation between 5' enn nucleotide relationships and the OF phenotypes of strains. The vast majority of enn sequences from OF+ strains are assigned to either cluster A, B or C, and the only sequence from an OF− strain in any of these three clusters is enn15 in cluster C (Fig. 2). Interestingly, studies on the phylogeny of emm sequences from these strains also assigned emm15 to an OF+ emm sequence cluster (Whatmore et al., 1994). It is likely that the observed OF− phenotype of M15 strains is due simply to failure to express detectable quantities of opacity factor, though these data could also be explained by the horizontal transfer of emm-like genes between OF+ and OF− strains, or loss of a gene for OF production. The only OF+ 5' enn sequences that cluster with sequences from OF− strains are in clusters F (enn65 and enn80) and G (enn60), and it is interesting to note that two of these (enn80 and enn60) are identical to OF− enn sequences (enn74.1 and enn47) (Fig. 2). This congruency between enn relationships and OF phenotypes further strengthens previous suggestions that there are two groups of vir regulons in S. pyogenes (Haanes and Cleary, 1989; Haanes et al., 1992; Whatmore et al., 1994). A recent paper by Podbielski et al. (1994b) included a table listing enn2-type, enn49-type, and two 'unique' enn genes from OF+ strains, that were distinguished by >50% difference in a 70 bp 5' segment. Although the sequences were not reported and phylogenetic relationships were not analysed (Podbielski et al., 1994b), it is interesting to note that there are parallels between these enn2-type and enn49-type lists and some of the M types corresponding to distinct clusters A and C, respectively, in Fig. 2, and that the two 'unique' enn genes are from M types corresponding to two other distinct clusters (clusters B and G; Fig. 2). However, confirmation of these apparent parallels must await publication of sequences for the enn genes listed by Podbielski et al. (1994b).

A particularly interesting feature of the inferred phylogeny depicted in Fig. 2 is that nucleotide variation in the enn subdivision of emm-like genes is approximately an order of magnitude less than that described for the 5' variable regions of emm genes. For example, the average number of substitutions per site in the 5' variable regions of enn and emm genes in the nine strains corresponding to cluster C (Fig. 2) is 0.06 and 0.86, respectively. The restricted diversity in enn compared to emm genes is also clearly apparent in each of the other clusters (A, B and F; Fig. 2) that are represented by three or more isolates (data not shown). Although there are clearly differences in the extent of sequence divergence in the 5' variable regions of the enn and emm genes, in both cases variation is predominantly the result of a combination of point mutations and short deletions and insertions (data not shown; Whatmore et al., 1994). Further, the ratio of synonymous (dS) to nonsynonymous (dN) substitutions per site for the enn and emm gene sequences among the nine isolates in cluster C (Fig. 2) is c. 1.07 and 0.86, respectively, and in both cases there is a contrast between the level of variation observed in the proximal signal-peptide-encoding regions and in regions corresponding to the N-terminal ends of the predicted mature proteins (data not shown). Thus, despite clear differences in the extent of variation, diversity at both enn and emm loci appears to have arisen by similar mechanisms, probably in response to positive pressures for variation imposed by recognition of cell-surface proteins by the host immune system.
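As a rough illustration of what an "average number of substitutions per site" measures, the sketch below averages pairwise distances over a set of aligned sequences. The text does not state which estimator was used, so the Jukes-Cantor correction here is an assumption chosen only for illustration; the function names and toy sequences are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor estimate of substitutions per site from a proportion p < 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_pairwise_distance(seqs, correct=True):
    """Average distance over all pairs of aligned sequences."""
    dists = []
    for a, b in combinations(seqs, 2):
        p = p_distance(a, b)
        dists.append(jukes_cantor(p) if correct else p)
    return sum(dists) / len(dists)


if __name__ == "__main__":
    # Toy aligned fragments, not sequences from the study.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACG-AC"]
    print(mean_pairwise_distance(toy, correct=False))
    print(mean_pairwise_distance(toy, correct=True))
```

Comparing the value obtained for one gene with that for another over the same set of strains gives the kind of contrast reported for enn and emm in cluster C.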

Horizontal transfer of enn sequences

We have previously reported estimates of overall genomic divergence among the strains studied here, and described the relationships among the 5' regions of their emm genes (Whatmore et al., 1994). A comparison of these data with the 5' enn sequences determined during this study reveals many cases of non-congruency that provide strong evidence of horizontal gene transfer. A number of examples of this are described in Fig. 3. It should be noted that previous studies distinguished 66 multilocus electrophoretic types (ET) among 79 M types of S. pyogenes, where the sample consisted of one strain of each M type, and the estimated genetic distances between strains ranged from 0.0 up to 0.5 (Whatmore et al., 1994). Panels A and B in Fig. 3 describe examples where there is a very clear contrast between the existence of identical enn sequences in highly divergent genetic backgrounds and the existence of divergent enn sequences in closely related genetic backgrounds. This contrast is particularly striking since in each of these examples a single strain (M44 in panel A, and M60 in panel B) is compared both to a very divergent strain containing an identical enn sequence and to an essentially identical strain containing a divergent enn sequence (Fig. 3). It is likely that enn genes have been transferred horizontally among these strains. Horizontal transfer also accounts for the existence of two distinct enn genes in the M74 strain (Fig. 3, panel C). The M74 strain is separated from M64 and M80 by genetic distances of >0.2 and >0.3, respectively, and M64 is separated from M80 by a genetic distance of >0.3 (Whatmore et al., 1994). Consistent with this, the emm genes in these three strains are highly divergent, but the 5' region of enn74.1 differs from enn80 only by a single silent base substitution and enn74.2 is identical to enn64 (Fig. 3). In addition to the examples described in Fig. 3, many other strains examined in this study display non-congruency between relationships among enn genes and other loci, suggesting that the horizontal transfer of enn genes between distinct strains is a common event in the evolution of vir regulons.
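The non-congruency argument above can be expressed as a simple screen over pairs of strains: flag pairs that share an identical enn sequence despite a large chromosomal genetic distance, and pairs with a near-zero genetic distance that carry different enn sequences. The sketch below is only an illustration of that logic; the thresholds, strain labels, sequences and distances are hypothetical placeholders, not values from the study.

```python
from itertools import combinations


def find_noncongruent(enn, distance, far=0.3, near=0.05):
    """Return (identical_but_far, divergent_but_near) lists of strain pairs.

    enn      -- dict mapping strain name to its enn sequence
    distance -- dict mapping frozenset({strain1, strain2}) to genetic distance
    far/near -- hypothetical cut-offs for 'divergent' and 'closely related' backgrounds
    """
    identical_but_far, divergent_but_near = [], []
    for a, b in combinations(sorted(enn), 2):
        d = distance[frozenset((a, b))]
        same = enn[a] == enn[b]
        if same and d >= far:
            identical_but_far.append((a, b))
        elif not same and d <= near:
            divergent_but_near.append((a, b))
    return identical_but_far, divergent_but_near


if __name__ == "__main__":
    # Placeholder data for three imaginary strains.
    enn = {"S1": "ACGTTGCA", "S2": "ACGTTGCA", "S3": "TTGACCGA"}
    distance = {
        frozenset(("S1", "S2")): 0.33,
        frozenset(("S1", "S3")): 0.0,
        frozenset(("S2", "S3")): 0.44,
    }
    print(find_noncongruent(enn, distance))
```

A strain that appears in both lists, as S1 does in the placeholder data, mirrors the pattern described for panels A and B of Fig. 3.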

vir regulon mosaics

In some strains both the emm and the enn genes appear to have been acquired by horizontal transfer from the same divergent donor strain. For example, it has been estimated that the M44 and M61 strains studied here have diverged to a considerable extent in overall genomic relationships (genetic distance >0.3), but the 5' regions of both their emm and their enn genes are identical (Fig. 3, panel A). It is possible that in some cases co-inheritance of an emm and enn gene from a single donor might be followed by divergence in one of these genes in response to selective pressures while the second remains essentially unchanged. However, it is unlikely that this accounts for observed levels of variation in the majority of cases, for example those described in Fig. 3 panels B and C, where the levels of divergence in one gene and of conservation in the second are extensive in comparison with the overall patterns of variation among their alleles in other S. pyogenes strains. Moreover, the emm and enn sequences in many strains are closely related to corresponding sequences in different divergent strains. One example of


[Figure 3: alignment panels A to D, each listing the M-like protein, the genetic distance between host strains, and the deduced N-terminal amino acid sequence.]

Fig. 3. Examples of noncongruency between variation in 5' enn sequences, 5' emm sequences, and estimates of overall chromosomal relationships of host strains. Deduced amino acid sequences corresponding to the proximal regions of the 5' emm (Whatmore et al., 1994) or enn sequences (this study) are shown, with the arrow indicating the position of the predicted signal-peptidase cleavage sites. The dashes represent gaps introduced to maximize sequence alignments and the dots indicate identity with the upper sequence in each panel. Underlined dots indicate that the corresponding codon contains a synonymous substitution compared to the reference sequence. The genetic distances specified refer to estimates of overall chromosomal relationships calculated from multilocus enzyme electrophoresis analysis described previously (Whatmore et al., 1994). The enn5757 sequence included for comparison in panel B was excluded from the analysis depicted in Fig. 2 because it was shorter than 180 bp.

this is shown in Fig. 3 panel D. The M58, M2841 and M2233 strains included in this study have undergone considerable divergence in overall genomic relationships, with each strain separated from the other two by genetic distances of between 0.38 and 0.44 (Whatmore et al., 1994). The enn sequences in M58 and M2233 are identical to each other and very different from enn2841, but the opposite pattern is observed in the case of the emm genes, with emm58 being very distinct from emm2233, but closely related to emm2841. The contrast between these patterns is emphasized by the fact that all three of these strains are OF+ and the overall pattern of enn nucleotide variation among OF+ strains is restricted in comparison to extensive variation in the 5' regions of emm genes. It is likely that the emm and enn sequences in the M58 strain have either been combined following horizontal transfer from distinct donor strains or separated following horizontal transfer to distinct recipient strains. A comparison of the results described here with data reported previously (Whatmore et al., 1994) suggests that the vir regulons in many strains possess a mosaic structure, resulting from the acquisition of different emm-like genes from different donor strains by horizontal gene transfer.

Recombination between variable regions of enn sequences

Based on amino acid sequence homologies, M-like proteins have been divided into discrete segments, namely a conserved signal-peptide segment, a variable N-terminal segment, and conserved central A or C tandem repeat, and C-terminal wall-associating segments (O'Toole et al., 1992; Bessen and Fischetti, 1992; Podbielski et al., 1994a). Recombination events generating mosaic emm-like genes have been identified at or adjacent to sites corresponding to the junctions of these segments (Whatmore and Kehoe, 1994; Podbielski et al., 1994a), but clear evidence of recombination at sites corresponding to internal regions of these segments was not available and it was not clear if recombination following horizontal transfer contributes to diversity within the


[Figure 4: nucleotide alignment of the variable 5' regions of enn61, enn22 and enn2.]

Fig. 4. Recombination between variable regions of enn sequences. The variable 5' regions of enn61 and enn2 are compared to the corresponding region of the hybrid enn22 sequence. The arrow indicates the position of the predicted signal-peptidase cleavage sites. The dashes represent gaps introduced to maximize sequence alignments and dots indicate identity with the enn22 sequence. Non-synonymous base substitutions are indicated by upper-case letters and synonymous base substitutions by lower-case letters. It is interesting to note that the M22 strain included in this study is separated from the M2 and M61 strains by a genetic distance of 0.44, and the M2 and M61 strains are separated by a genetic distance of 0.38 (Whatmore et al., 1994).

variable N-terminal segments of M-like proteins. Variation in this region of the 79 previously characterized emm sequences proved too extensive to permit a robust statistical analysis for evidence of recombination (Whatmore et al., 1994). However, the more restricted variation in the 5' regions of the enn sequences allowed these to be analysed by the statistical method of Sawyer (1989), which examines the distribution of polymorphic sites along segments of multiple alleles and calculates the probability of non-uniform distributions being due to chance or recombination between alleles. This analysis provided strong evidence (P = 0.000; Sawyer, 1989) that the 5' variable region of enn22 is a hybrid, resulting from intragenic recombination between genes that are identical or closely related to enn2 and enn61 (Fig. 4). These data provide the first clear evidence that recombination between emm-like genes is not confined to 'hotspots' corresponding to the junctions of discrete segments in M-like proteins, but can also contribute to generating diversity within the variable N-terminal regions of these proteins.
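The logic of a clustering test of this kind can be illustrated with a short permutation sketch in Python. This is not the STAN or VTDIST code used in this study, and it simplifies the procedure of Sawyer (1989): it considers all polymorphic sites rather than only silent ones, and it uses a single statistic (the sum of squared lengths of runs of consecutive polymorphic sites at which two sequences agree). The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def polymorphic_columns(seqs):
    """Return the alignment columns (tuples of bases) at which the sequences differ."""
    return [col for col in zip(*seqs) if len(set(col)) > 1]


def sum_squared_runs(columns):
    """Sum, over all sequence pairs, of the squared lengths of runs of
    consecutive polymorphic sites at which the two sequences agree."""
    if not columns:
        return 0
    n_seqs = len(columns[0])
    total = 0
    for i, j in combinations(range(n_seqs), 2):
        run = 0
        for col in columns:
            if col[i] == col[j]:
                run += 1
            else:
                total += run * run
                run = 0
        total += run * run  # close the final run
    return total


def permutation_p_value(seqs, n_perm=1000, seed=0):
    """Compare the observed statistic with values obtained after shuffling
    the order of the polymorphic sites (assumed null: no clustering)."""
    rng = random.Random(seed)
    cols = polymorphic_columns(seqs)
    observed = sum_squared_runs(cols)
    hits = 0
    for _ in range(n_perm):
        shuffled = cols[:]
        rng.shuffle(shuffled)
        if sum_squared_runs(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy alignment: the third sequence matches the first over its 5' half
# and the second over its 3' half, mimicking a hybrid allele.
toy = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "AAAAAACCCCCC"]
observed, p = permutation_p_value(toy)
```

In the toy alignment the hybrid shares one uninterrupted block of sites with each putative parent, so few random orderings of the sites give a statistic as large as the observed one. Real data would need the silent-site restriction and the additional statistics described by Sawyer (1989).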

The results described here indicate that enn, as well as emm genes, can be transferred horizontally between divergent S. pyogenes strains. It seems likely that this process would greatly increase the rate of generating new combinations of emm-like genes in vir regulons, compared to that which would result from the horizontal transfer of emm genes alone. Furthermore, there is now evidence of horizontal transfer involving either emm or enn genes, or both, in many of the strains in our study population. Thus, taken together, our previous studies on emm genes (Whatmore et al., 1994; Whatmore and Kehoe, 1994), studies in other laboratories (Simpson et al., 1992; Heden and Lindahl, 1993; Podbielski et al., 1994a; Bessen and Hollingshead, 1994), and the data in this paper provide strong evidence that horizontal gene transfer has played a major role in generating diversity in S. pyogenes vir regulons.

A particularly interesting finding of the present study is the considerable difference in the pattern of variation among emm and enn genes, despite the fact that variation in both cases involves similar mechanisms. Since immunity to group A streptococcal infections in humans is due predominantly to opsonic antibodies against the protruding N-terminal regions of cell-surface M or M-like proteins, the difference in the pattern of variation among emm and enn genes has significant implications for our understanding of the roles of the corresponding surface proteins in pathogenesis and immunity. There are at least three plausible explanations for this, none of which are mutually exclusive. One possibility is that enn genes have evolved more recently than emm genes in group A streptococci and have yet to accumulate the levels of variation observed among emm genes. If this is the case, the considerable divergence of individual enn sequence clusters in Fig. 2, particularly among OF strains, might be explained by these clusters arising independently from recent recombination events linking variable segments from distinct emm genes to conserved enn gene sequences. This possibility is supported by recent studies on two emm-like gene mosaics (enn5 and enn64/14), where conserved segments characteristic of enn genes flank variable sequences that appear to be closely related to the variable segments of emm genes in distinct strains (Whatmore and Kehoe, 1994; Podbielski et al., 1994a). A second possibility is that there are differences in the extent to which Emm and Enn proteins have been


exposed to selective pressures driving variation. This hypothesis is consistent with the c. 30-fold difference in expression of emm and enn genes in a well characterized M2 strain (Bessen and Fischetti, 1992), the absence of efficient translation initiation signals at the 5' end of enn4 (Jeppson et al., 1992), and the very low or undetectable expression of the enn gene in an M5 strain (Whatmore and Kehoe, 1994). However, further studies would be required to determine if expression of the majority of enn genes in vivo occurs at similarly low levels to those observed in the case of M2, M4 and M5 strains grown in vitro. The third plausible explanation for restricted variation among the majority of enn sequences is that horizontal transfer and recombination between enn genes occurs at a significantly higher rate than between emm genes, resulting in a homogenization of variation among enn sequences. If this were the case, transfer of enn sequences would appear to occur preferentially between particular pairs of donor and recipient strains, corresponding to the individual sequence clusters depicted in Fig. 2. At present, there are insufficient data to assess whether there is any difference in the rate of horizontal transfer of emm and enn genes among S. pyogenes strains. The studies reported here suggest that emm and enn genes can be co-transferred from donor to recipient strains, but it is not clear if this is representative of the majority of horizontal transfer events, where the apparent independent acquisition of emm and enn genes could be explained either by recombination events subsequent to co-transfer or by independent transfer of emm and enn genes between strains. As discussed previously (Whatmore et al., 1994), transformation appears to play an important role in horizontal gene transfer in species such as Streptococcus pneumoniae and Neisseria gonorrhoeae (Seifert et al., 1988; Maynard-Smith et al., 1991). However, S. pyogenes is not naturally competent and the report by Totolian (1979) that M-type specificity can be transduced between strains suggests that the horizontal transfer of emm-like genes might involve different mechanisms. Since these events have considerable implications for our understanding of pathogenesis, further studies on the mechanisms involved are required to understand the diversity of this highly versatile group of human pathogens.

Experimental procedures

Bacterial strains, plasmids and growth conditions

The group A streptococcal strains used in this study are described in Table 1. The E. coli strain XL-1 (recA1 endA1 gyrA96 thi-1 hsdR17 supE44 relA1 Δ(lac-proAB)/F'(proAB lacIq lacZΔM15 Tn10)) (Bullock et al., 1987) and the plasmid pBluescript (Stratagene) were used as the host and vector for cloning PCR-amplified products. E. coli and S. pyogenes strains were grown as described previously (Whatmore and Kehoe, 1994).

PCR, DNA cloning, sequencing and analysis of sequence relationships

The enn-specific forward and reverse PCR primers 3(F) and 4(R), respectively, have been described previously (Whatmore and Kehoe, 1994). PCR reactions with extracts of group A streptococcal strains, cloning and sequencing were performed as described previously (Whatmore and Kehoe, 1994). All sequences are available from the GenBank Nucleotide Sequence Data Library, with the accession numbers listed in Table 1. Preliminary sequence analysis employed the PC GENE package of programs (IntelliGenetics Inc). The relative locations of the 5' and 3' ends of determined sequences varied and for detailed analysis all sequences were reduced to a common core of 180 bp, beginning with a very highly conserved 36 bp 5' 'anchor' corresponding to the C-terminal ends of the signal peptide-encoding regions of enn genes. In analysing sequence relationships, gaps were permitted to account for deletion/insertion events. The aligned sequences were examined for phylogenetic relationships using parsimony with the computer program PAUP 3.1 (Swofford, 1991). Additional phylogenetic analysis of the sequences was performed by the neighbour-joining method with the computer programs NJOIN and NJBOOT written by T. S. Whittam (Institute for Molecular Evolutionary Genetics, Pennsylvania State University, University Park, Pennsylvania, USA). Rates of synonymous (S) and nonsynonymous (N) substitutions per site were determined as described by Nei and Gojobori (1986). Sawyer's analysis for clustering of polymorphic sites was performed with the computer programs STAN (written by A. G. Clark and T. S. Whittam, Institute of Molecular Evolutionary Genetics, Pennsylvania State University) and VTDIST (written by S. A. Sawyer, Department of Mathematics, University of Washington, St. Louis, Missouri, USA).
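Two of the steps above, trimming each sequence to a common core that starts at a conserved anchor and distinguishing synonymous from non-synonymous codon differences, can be sketched in Python as follows. This is an illustrative sketch only and is not the software used in this study. The anchor string, the function names and the toy sequences are hypothetical, the core is assumed to begin in the correct reading frame, and the second function simply counts differing codons; it does not compute the per-site rates of Nei and Gojobori (1986).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def trim_to_core(seq, anchor, core_len=180):
    """Return the core of `seq` that starts at `anchor` and spans `core_len`
    bases, or None if the anchor is absent or the sequence is too short."""
    start = seq.find(anchor)
    if start < 0 or len(seq) - start < core_len:
        return None
    return seq[start:start + core_len]


def count_codon_changes(ref, alt):
    """Count differing codons between two aligned, in-frame sequences.

    Returns (synonymous, non_synonymous). Codons containing gaps ('-')
    or unrecognised characters are skipped.
    """
    synonymous = non_synonymous = 0
    for pos in range(0, min(len(ref), len(alt)) - 2, 3):
        codon_ref, codon_alt = ref[pos:pos + 3], alt[pos:pos + 3]
        if codon_ref == codon_alt:
            continue
        if codon_ref not in CODON_TABLE or codon_alt not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_ref] == CODON_TABLE[codon_alt]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Toy usage with a hypothetical 3-base anchor and 6-base core.
core = trim_to_core("GGGATGAAACCC", "ATG", core_len=6)
changes = count_codon_changes("AAAGAT", "AAGGCT")
```

In the toy example, AAA to AAG leaves lysine unchanged and is counted as synonymous, whereas GAT to GCT replaces aspartate with alanine and is counted as non-synonymous.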

Acknowledgements

This work was supported by Grant G8914199CA from the UK Medical Research Council (M.K.), laboratory refurbishment grant 03477/Z/91 from The Wellcome Trust (M.K.), and NIH Grant AI-33119 and American Heart Association Grant 92-006640 (J.M.M.). J.M.M. is an Established Investigator of the American Heart Association.

References

Bessen, D.E., and Fischetti, V.A. (1992) Nucleotide sequences of two adjacent M or M-like protein genes of group A streptococci: different RNA transcript levels and identification of a unique immunoglobulin A binding protein. Infect Immun 60: 124-135.

Bessen, D.E., and Hollingshead, S.K. (1994) Allelic polymorphism of emm loci provides evidence for horizontal gene spread in group A streptococci. Proc Natl Acad Sci USA 91: 3280-3284.

Bullock, W.O., Fernandez, J.M., and Short, J.M. (1987) XL-1 Blue: a high efficiency plasmid transforming recA Escherichia coli strain with β-galactosidase selection. BioTechniques 5: 376-379.

Felsenstein, J. (1988) Phylogenies from molecular sequences: inference and reliability. Annu Rev Genet 22: 521-565.


Fischetti, V.A. (1989) Streptococcal M protein: molecular design and biological behaviour. Clin Microbiol Rev 2: 285-314.

Frithz, E., Heden, L.-O., and Lindahl, G. (1989) Extensive sequence homology between IgA receptor and M proteins in Streptococcus pyogenes. Mol Microbiol 3: 1111-1119.

Gomi, H., Hozumi, T., Hattori, S., Tagawa, C., Kishimoto, F., and Bjorck, L. (1990) The gene sequence and some properties of protein H. A novel IgG-binding protein. J Immunol 144: 4046-4052.

Haanes, E., and Cleary, P. (1989) Identification of a divergent M protein gene and an M protein related gene family in Streptococcus pyogenes serotype 49. J Bacteriol 171: 6397-6408.

Haanes, E.J., Heath, D.G., and Cleary, P.P. (1992) Architecture of the vir regulons of group A streptococci parallels opacity factor phenotype and M protein class. J Bacteriol 174: 4967-4976.

Heath, D.G., and Cleary, P.P. (1989) Fc-receptor and M protein genes are products of gene duplication. Proc Natl Acad Sci USA 86: 4741-4745.

Heden, L.-O., and Lindahl, G. (1993) Conserved and variable regions in protein Arp, the IgA receptor of Streptococcus pyogenes. J Gen Microbiol 139: 2067-2074.

Hollingshead, S.K., Fischetti, V.A., and Scott, J.R. (1986) Complete nucleotide sequence of type 6 M protein of the group A streptococcus: repetitive structure and membrane anchor. J Biol Chem 261: 1677-1686.

Hollingshead, S.K., Readdy, T.L., Yung, D.L., and Bessen, D.E. (1993) Structural heterogeneity of the emm gene cluster in group A streptococci. Mol Microbiol 8: 707-717.

Hollingshead, S.K., Arnold, J., Readdy, T.L., and Bessen, D.E. (1994) Molecular evolution of a multigene family in group A streptococci. Mol Biol Evol 11: 208-219.

Jeppson, H., Frithz, E., and Heden, L.-O. (1992) Duplication of a DNA sequence homologous to genes for immunoglobulin receptors and M proteins in Streptococcus pyogenes. FEMS Microbiol Lett 92: 139-146.

Johnson, D.R., and Kaplan, E.L. (1993) A review of the correlation of T-agglutination patterns and M-protein typing and opacity factor production in the identification of group A streptococci. J Med Microbiol 38: 311-315.

Kehoe, M.A. (1994) Cell-wall-associated proteins in Gram-positive bacteria. In Bacterial Cell Wall: New Compr Biochem, Vol. 27. Ghuysen, J.-M., and Hakenbeck, R. (eds). Amsterdam: Elsevier Science B.V., pp. 217-261.

Lancefield, R.C. (1962) Current knowledge of the type specific M antigens of group A streptococci. J Immunol 89: 307-313.

Lindahl, G., and Akerstrom, B. (1989) Receptor for IgA in group A streptococci: cloning of the gene and characterization of the protein expressed in Escherichia coli. Mol Microbiol 3: 239-247.

Manjula, B.N., Khandke, K.M., Fairwell, T., Relf, W.A., and Sriprakash, K.S. (1991) Heptad motifs within the distal subdomain of the coiled-coil rod region of M protein from rheumatic fever and nephritis associated serotypes of group A streptococci are distinct from each other: nucleotide sequence of the M57 gene and relation of the deduced amino acid sequence to other M proteins. J Prot Chem 10: 369-384.

Maynard-Smith, J.M., Dowson, C., and Spratt, B.G. (1991) Localized sex in bacteria. Nature 349: 29-31.

Millar, L., Gray, L., Beachey, E., and Kehoe, M. (1988) Antigenic variation among group A streptococcal M proteins: nucleotide sequence of the serotype 5 M protein gene and its relationship with genes encoding types 6 and 24 M proteins. J Biol Chem 263: 5668-5673.

Mouw, A., Beachey, E.H., and Burdett, V. (1988) Molecular evolution of streptococcal M protein: cloning and nucleotide sequence of the type 24 M protein gene and relation to other genes of Streptococcus pyogenes. J Bacteriol 170: 676-684.

Nei, M., and Gojobori, T. (1986) Simple methods for estimating the numbers of synonymous and non-synonymous substitutions. Mol Biol Evol 3: 418-426.

O'Toole, P., Stenberg, L., Rissler, M., and Lindahl, G. (1992) Two major classes in the M protein family in group A streptococci. Proc Natl Acad Sci USA 89: 8661-8665.

Podbielski, A. (1993) Three different types of organisation of the vir regulon in group A streptococci. Mol Gen Genet 237: 287-300.

Podbielski, A., Melzer, B., and Lutticken, R. (1991) Application of the polymerase chain reaction to study the M protein(-like) gene family in beta-hemolytic streptococci. Med Microbiol Immunol 180: 213-227.

Podbielski, A., Kaufhold, A., and Cleary, P.P. (1993a) PCR-mediated amplification of group A streptococcal genes encoding immunoglobulin-binding proteins. Immunomethods 2: 55-64.

Podbielski, A., Weber-Heynemann, J., and Cleary, P.P. (1993b) Immunoglobulin-binding FcrA and ENN proteins and M proteins of group A streptococci evolved from a common ancestral protein. Med Microbiol Immunol 182: 1-10.

Podbielski, A., Hawlitzky, J., Pack, T.D., Flosdorff, A., and Boyle, M.D.P. (1994a) A group A streptococcal Enn protein potentially resulting from intergenomic recombination exhibits atypical immunoglobulin-binding characteristics. Mol Microbiol 12: 725-736.

Podbielski, A., Krebs, B., and Kaufhold, A. (1994b) Genetic variability of the emm-related genes of the large vir regulon of group A streptococci: potential intra- and intergenomic recombination events. Mol Gen Genet 243: 691-698.

Poirier, T.P., Kehoe, M.A., Whitnack, E., Dockter, M.E., and Beachey, E.H. (1989) Fibrinogen binding and resistance to phagocytosis of Streptococcus sanguis expressing cloned M protein of Streptococcus pyogenes. Infect Immun 57: 29-35.

Retnoningrum, D.S., Podbielski, A., and Cleary, P.P. (1993) Type M12 protein from Streptococcus pyogenes is a receptor for IgG3. J Immunol 150: 2332-2340.

Robbins, J.C., Spanier, J.G., Jones, S.J., Simpson, W.J., and Cleary, P.P. (1987) Streptococcus pyogenes type 12 M protein gene regulation by upstream sequences. J Bacteriol 169: 5633-5640.

Robinson, J.H., and Kehoe, M.A. (1992) Group A streptococcal M proteins: virulence factors and protective antigens. Immunol Today 13: 362-367.

Sawyer, S. (1989) Statistical tests for detecting gene conversion. Mol Biol Evol 6: 526-538.

Scott, J.R., Guenthner, P.C., Malone, L.M., and Fischetti, V.A. (1986) Conversion of an M- group A streptococcus to M+ by transfer of a plasmid containing an M5 gene. J Exp Med 164: 1641-1651.

Seifert, H.S., Ajioka, R.D., Marchal, C., Sparling, P.F., and So, M. (1988) DNA transformation leads to pilin antigenic variation in Neisseria gonorrhoeae. Nature 336: 392-395.

Simpson, W.J., LaPenta, D., Chen, C., and Cleary, P.P. (1990) Coregulation of type 12 M protein and streptococcal C5a peptidase genes in group A streptococci: evidence for a virulence regulon controlled by the virR locus. J Bacteriol 172: 696-700.

Simpson, W.J., Musser, J.M., and Cleary, P.P. (1992) Evidence consistent with horizontal transfer of the gene (emm12) encoding serotype M12 protein between group A and group G pathogenic streptococci. Infect Immun 60: 1890-1893.

Swofford, D.L. (1991) PAUP: phylogenetic analysis using parsimony, version 3.1. Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.

Swofford, D.L., and Olsen, G.J. (1990) Phylogeny reconstruction. In Molecular Systematics. Hillis, D.M., and Moritz, C. (eds). Sunderland, Massachusetts: Sinauer Associates, pp. 411-501.

Totolian, A.A. (1979) Transduction of M-protein and serum opacity-factor in group A streptococci. In Pathogenic Streptococci. Parker, M.T. (ed.). Surrey: Reedbooks, pp. 38-39.

Whatmore, A.M., and Kehoe, M.A. (1994) Horizontal gene transfer in the evolution of group A streptococcal emm-like genes: gene mosaics and variation in Vir regulons. Mol Microbiol 11: 363-374.

Whatmore, A.M., Kapur, V., Sullivan, D.J., Musser, J.M., and Kehoe, M.A. (1994) Noncongruent relationships between variation in emm gene sequences and the population genetic structure of group A streptococci. Mol Microbiol 14: 619-631.