Genome sequence of Streptococcus agalactiae, a pathogen causing invasive neonatal disease

Molecular Microbiology (2002) 45 (6), 1499–1513. © 2002 Blackwell Science Ltd

Philippe Glaser,1 Christophe Rusniok,1 Carmen Buchrieser,1 Fabien Chevalier,1 Lionel Frangeul,1 Tarek Msadek,2 Mohamed Zouine,1 Elisabeth Couvé,1 Lila Lalioui,3 Claire Poyart,3 Patrick Trieu-Cuot3 and Frank Kunst1*

1 Laboratoire de Génomique des Microorganismes Pathogènes, and 2 Unité de Biochimie Microbienne, Institut Pasteur, 28 Rue du Dr Roux, 75724 Paris Cedex 15, France. 3 Laboratoire Mixte Pasteur-Necker de Recherche sur les Streptocoques et Streptococcies, Faculté de Médecine Necker, 156 Rue Vaugirard, 75015 Paris, France.

Accepted 24 June, 2002. *For correspondence. E-mail fkunst@pasteur.fr; Tel. (+33) 1 45 68 89 96; Fax (+33) 1 45 68 87 86.

Summary

Streptococcus agalactiae is a commensal bacterium colonizing the intestinal tract of a significant proportion of the human population. However, it is also a pathogen which is the leading cause of invasive infections in neonates and causes septicaemia, meningitis and pneumonia. We sequenced the genome of the serogroup III strain NEM316, responsible for a fatal case of septicaemia. The genome is 2 211 485 base pairs long and contains 2118 protein coding genes. Fifty-five per cent of the predicted genes have an ortholog in the Streptococcus pyogenes genome, representing a conserved backbone between these two streptococci. Among the genes in S. agalactiae that lack an ortholog in S. pyogenes, 50% are clustered within 14 islands. These islands contain known and putative virulence genes, mostly encoding surface proteins, as well as a number of genes related to mobile elements. Some of these islands could therefore be considered as pathogenicity islands. Compared with other pathogenic streptococci, S. agalactiae shows the unique feature that pathogenicity islands may have an important role in virulence acquisition and in genetic diversity.

Introduction

Lancefield's group B streptococci (GBS) (Lancefield and Hare, 1935), also referred to as S. agalactiae, is well adapted to asymptomatic colonization of adult humans. It is commonly found in the gastrointestinal and the genitourinary tracts, but it is also the predominant cause of invasive bacterial disease in the neonate. Streptococcus agalactiae is the leading cause of septicaemia, meningitis and pneumonia in neonates, responsible for two to three cases per 1000 live births. It is also a serious cause of mortality or morbidity in non-pregnant adults, particularly in elderly persons and those with underlying diseases (Schuchat, 1998; Nizet and Rubens, 2000; Farley, 2001). In North America, this bacterium is considered one of the major causes of bovine intramammary infections (Keefe, 1997).

Group B streptococci are subclassified into serotypes according to the immunologic reactivity of the polysaccharide capsule. Of the nine serotypes described so far, the types Ia, Ib, II, III, and V are responsible for the majority of invasive human GBS diseases. Serotype III GBS is particularly important because it causes a significant percentage of early onset disease (i.e. infection occurring within the first week of life) and the majority of late-onset disease (i.e. infection occurring after the first week of life). Overall, the capsular serotype III is responsible for most cases (80%) of neonatal GBS meningitis (Schuchat, 1998; Nizet and Rubens, 2000). Colonization of the rectum and vagina of pregnant women with GBS, which causes infection of the amniotic cavity, is correlated with GBS sepsis in newborn infants with early onset disease. In this case, newborns are colonized intrapartum by aspiration of contaminated amniotic fluid.
The lung is a probable portal of entry for GBS into the bloodstream, as these bacteria can adhere to and invade alveolar epithelial (Rubens et al., 1992) and endothelial cells (Gibson et al., 1993). Pneumonia results from local infections, whereas sepsis and meningitis may be due to the spread of bacteria followed by systemic infection.

We have determined the complete genome sequence of the serotype III strain NEM316 isolated from a case of fatal septicaemia. Complete genome sequences of Streptococcus pyogenes strains M1 (Ferretti et al., 2001) and M18 (Smoot et al., 2002), and of two strains of Streptococcus pneumoniae, the virulent strain TIGR4 (Tettelin et al., 2001) and the non-capsulated strain R6 (Hoskins et al., 2001), have been published. Comparison of the NEM316 genome sequence was made with those of these related pathogenic streptococci and with that of the non-pathogenic species Lactococcus lactis (Bolotin et al., 2001), providing clues to the evolution of S. agalactiae and the acquisition of virulence. Besides a better knowledge of the molecular mechanisms responsible for virulence, this work should contribute to the finding of new targets for antimicrobial compounds and for the development of a GBS vaccine.

Results and discussion

General features of the genome

The genome of S. agalactiae strain NEM316 consists of a circular chromosome of 2 211 485 base pairs (bp) (Fig. 1) (EMBL accession number AL732656). Its G + C content of 35.6% is significantly lower than that of the genomes of S. pyogenes (38.5%) (Ferretti et al., 2001) and of S. pneumoniae (39.7%) (Tettelin et al., 2001), whereas L. lactis, a more distantly related species, has an almost identical G + C content of 35.4% (Bolotin et al., 2001). Seven sets of 23S, 5S and 16S ribosomal RNA operons were identified, all of which are organized within a 450 kb region located on the right replichore of the chromosome next to the origin of replication (Fig. 1). For S. pneumoniae, only four rRNA operons were reported, which are all located within a 400 kb region, three on the left replichore and one on the right one (Tettelin et al., 2001; Hoskins et al., 2001). In contrast, S. pyogenes and L. lactis each contain six rRNA operons distributed on both replichores within 750 and 920 kb regions, respectively (Bolotin et al., 2001; Ferretti et al., 2001; Smoot et al., 2002). We identified 80 tRNA genes in NEM316, a number significantly higher than the 58 tRNA genes present in S. pneumoniae, the 60 in S. pyogenes and the 62 in L. lactis. The genes coding tRNAs recognize only 31 out of 61 possible sense codons. The redundancy of tRNA genes as compared to genome size is an interesting feature of the S. agalactiae genome, the biological consequence of which is not yet known.
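As an illustration of how G + C content values such as those compared above can be obtained from a sequence, the following Python sketch counts G and C over the unambiguous bases, both overall and in fixed windows. The function names, the window parameters and the toy sequence are illustrative assumptions; they are not taken from the annotation pipeline used for NEM316.

```python
def gc_content(seq: str) -> float:
    """Return the G + C fraction (0..1) over the A, C, G and T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / (gc + at)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start coordinate, G + C fraction) for each full window."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "ATGCGCATTAATTTAAGGCC"  # toy sequence, not NEM316 data
    print(f"overall G + C: {100 * gc_content(toy):.1f}%")
    for start, fraction in windowed_gc(toy, window=10, step=5):
        print(start, f"{100 * fraction:.1f}%")
```

Ambiguous symbols such as N are ignored rather than counted as A/T, which is one possible convention; other tools may treat them differently.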

In the genome of NEM316, we identified 2082 protein coding genes and 36 pseudogenes. Forty-nine genes are triplicated, corresponding to a region present in three copies in the genome. Putative functions could be assigned to 62% of the predicted proteins; 25% are similar to unknown proteins, 4% are similar to unknown proteins from other streptococci or lactococci, and 9% are unique to S. agalactiae. Protein coding genes were classified according to the functional categories described for Bacillus subtilis and Listeria monocytogenes. A strong bias in gene orientation was observed, as 81% of the coding sequences (CDS) are transcribed in the same orientation as the movement of the replication fork. This type of gene orientation bias appears to be a common feature of low G + C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue, encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).

Fig. 1. Circular genome map of S. agalactiae NEM316 showing the position and orientation of genes. From the outside: circle 1, protein coding genes on the + and - strands, genes in yellow belonging to the 14 islands numbered from I to XIV; circle 2, GC bias (G + C/G - C); circle 3, G + C content, with < 28.8% G + C in yellow, between 28.8% and 42.5% G + C in orange and with > 42.5% G + C in red; circle 4, stable RNA coding genes. The scale in kb is indicated on the outside of the genome, with the predicted origin of replication being at position 0.
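The GC skew analysis mentioned above can be illustrated with a minimal Python sketch. It assumes the conventional definition of skew, (G - C)/(G + C), computed in non-overlapping windows on a sequence whose coordinate 0 is the origin of replication, and it assumes that the leading strand is enriched in G, so that the cumulative skew peaks at a candidate terminus. Function names, the window size and the toy sequence are assumptions; the prediction reported for NEM316 also relied on CDS orientation and on the position of the ripX homologue, which this sketch ignores.

```python
def gc_skew(window: str) -> float:
    """Return (G - C) / (G + C) for one window, or 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def cumulative_skew(seq: str, window: int) -> list[tuple[int, float]]:
    """Return (window end coordinate, running sum of skew) per full window."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        total += gc_skew(seq[start:start + window])
        curve.append((start + window, total))
    return curve


def predict_terminus(seq: str, window: int) -> int:
    """Return the coordinate at which the cumulative skew is maximal."""
    curve = cumulative_skew(seq, window)
    if not curve:
        raise ValueError("sequence is shorter than one window")
    return max(curve, key=lambda point: point[1])[0]


if __name__ == "__main__":
    # Toy chromosome linearised at the origin: a G-rich half, then a C-rich half.
    toy = "GGGA" * 25 + "CCCA" * 25
    print("candidate terminus at coordinate", predict_terminus(toy, window=20))
```

On a real chromosome the curve is noisy, so the window size matters and the maximum is best read as a region rather than a single coordinate.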

Bacterial–host interactions and virulence factors

S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis, previously identified (Chaffin et al., 2000), was present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.

Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21; IPXTG = 4; LPXTS = 2; LPXTN = 2; FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% of similarity, respectively). In particular, these enzymes possess at the expected positions the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.
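The motif classes listed above (LPXTG, IPXTG, LPXTS, LPXTN and FPXTG) lend themselves to a simple pattern search. The sketch below, in Python, scans the C-terminal end of a protein for these five classes only. It is a deliberately reduced illustration: the 50-residue tail, the function name and the toy sequences are assumptions, and a genuine sorting signal also requires a hydrophobic segment and a charged tail after the motif, which are not checked here.

```python
import re

# Lookahead with a capture group so that overlapping candidates are all reported.
# Only the five motif classes named in the text are accepted.
SORTING_MOTIF = re.compile(r"(?=(LP.T[GSN]|[IF]P.TG))")


def find_sorting_motifs(protein: str, tail: int = 50) -> list[tuple[int, str, str]]:
    """Return (1-based position, motif, motif class) for hits in the last `tail` residues."""
    protein = protein.upper()
    offset = max(0, len(protein) - tail)
    hits = []
    for match in SORTING_MOTIF.finditer(protein[offset:]):
        motif = match.group(1)
        motif_class = motif[:2] + "X" + motif[3:]  # e.g. LPKTG -> LPXTG
        hits.append((offset + match.start() + 1, motif, motif_class))
    return hits


if __name__ == "__main__":
    # Hypothetical toy protein, not a NEM316 sequence.
    toy = "M" + "A" * 60 + "LPKTG" + "A" * 20
    for position, motif, motif_class in find_sorting_motifs(toy):
        print(position, motif, motif_class)
```

Restricting the search to the C-terminal region reduces chance matches elsewhere in long proteins; the cut-off would need tuning against known substrates.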

Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2, but does not encode Rib nor β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 of NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
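Counting contiguous repeats such as the STSA and SMSA units described above is straightforward to sketch in Python. The example below finds the longest uninterrupted run built from a set of equal-length motifs and tallies each motif within it. The function name and the toy sequence are assumptions made for illustration, and the sketch does not handle degenerate or interrupted repeats.

```python
import re


def longest_repeat_run(protein: str, motifs: tuple[str, ...]) -> dict[str, int]:
    """Count each motif inside the longest uninterrupted run made of the motifs."""
    size = len(motifs[0])
    if any(len(motif) != size for motif in motifs):
        raise ValueError("this sketch assumes motifs of equal length")
    pattern = re.compile("(?:" + "|".join(map(re.escape, motifs)) + ")+")
    runs = (match.group(0) for match in pattern.finditer(protein.upper()))
    best = max(runs, key=len, default="")
    counts = {motif: 0 for motif in motifs}
    # Every unit in the run has the same length, so fixed-size chunks are motifs.
    for i in range(0, len(best), size):
        counts[best[i:i + size]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy protein, not the gbs1529 sequence.
    toy = "MK" + "STSA" * 3 + "SMSA" + "STSA" + "GG" + "STSA"
    print(longest_repeat_run(toy, ("STSA", "SMSA")))
```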

The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), like Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase, as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases encoded by two pairs of adjacent genes whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.

Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.

| Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length^a | Putative function^b |
|---|---|---|---|---|---|
| gbs0391^c | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein |
| gbs0392^c | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown |
| gbs0393^c | 933 | LPXTG | SpaA (S. sobrinus); Pas (S. intermedius) | 37 (52)/405; 36 (52)/399 | Unknown; Unknown |
| gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown |
| gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae); R28 (S. pyogenes) | 74 (77)/798; 69 (75)/1103 | Unknown |
| gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown |
| gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin |
| gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown |
| gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown |
| gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown |
| gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein |
| gbs1288 | 1252 | LPXTG | PulA (S. pyogenes); PulA (S. pneumoniae) | 65 (79)/1095; 49 (65)/1305 | Alkaline amylopullulanase |
| gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii); Pas (S. intermedius) | 30 (43)/1385; 31 (45)/1285 | Sialic acid binding protein |
| gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans); CbpD (S. pneumoniae) | 50 (62)/183; 30 (60)/220 | Unknown; Choline binding protein |
| gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown |
| gbs1529 | 1310 | LPXTG | Hsa (S. gordonii); SrpA (S. cristatus) | 50 (60)/1314; 43 (53)/1248 | Sialic acid binding protein |
| gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown |
| gbs1540 | 680 | LPXTG | AmiC (S. pyogenes); YbgE (L. lactis) | 36 (54)/478; 35 (54)/492 | Amidase |
| gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae); YfkN (B. subtilis) | 57 (70)/694; 47 (66)/630 | Cyclo-nucleotide phosphodiesterase |
| gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase |
| gbs2018 | 643 | LPXTG | M-like protein (S. equi); PspC (S. pneumoniae) | 31 (46)/302; 23 (38)/795 | Unknown; Adhesin |
| gbs1478 | 901 | IPXTG | PFBP (S. pyogenes); Cell surface protein (S. pneumoniae) | 32 (46)/176; 40 (55)/901 | Fibronectin-binding protein; Unknown |
| gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii); Cell surface protein (S. pneumoniae) | 23 (35)/554; 27 (42)/512 | Unknown |
| gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown |
| gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown |
| gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease |
| gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes); BspA (Bacteroides forsythus) | 72 (81)/1050; 24 (41)/566 | Unknown; Unknown |
| gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase |
| gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5'-nucleotidase |
| gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown |

a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refers.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.

Another strategy for retention of proteins to bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence which is successively the substrate for two key enzymes, the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
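A lipoprotein lipid attachment site of the kind mentioned above can be approximated by a short pattern search near the N-terminus. The Python sketch below uses a simplified lipobox consensus, [LVI][ASTVI][GAS] followed by the invariant cysteine, and reports the position of that cysteine. The consensus as written, the 40-residue search limit, the function name and the toy sequences are all assumptions for illustration; dedicated predictors also evaluate the charged and hydrophobic regions of the signal peptide, and nothing here reflects the method actually used for NEM316.

```python
import re
from typing import Optional

# Simplified lipobox consensus: three permissive positions, then the lipidated Cys.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox_cysteine(protein: str, n_term: int = 40) -> Optional[int]:
    """Return the 1-based position of the lipobox Cys in the first `n_term` residues."""
    match = LIPOBOX.search(protein.upper()[:n_term])
    # match.end() is the 0-based index just past the Cys, i.e. its 1-based position.
    return None if match is None else match.end()


if __name__ == "__main__":
    # Hypothetical toy precursor, not a NEM316 sequence.
    toy = "MKKK" + "L" * 10 + "LAGC" + "S" * 30
    print("lipobox cysteine at position", find_lipobox_cysteine(toy))
```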

Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis, cylXDG, acpC, cylZABEFHIJK, including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin or CAMP factor that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae that is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% of similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).

Lifestyle of S. agalactiae as revealed by metabolism

Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for a NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.

Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization, not in bypassing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.

Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).

The NEM316 genome does not only code for a high number of transporters related to amino acid import, but also has in general a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, like two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called ‘oxidative burst’ (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.

The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS, lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified, in an S. agalactiae type Ia strain, a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified and, among these, only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.

Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae, respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). On the contrary, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
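One common way to count breakpoints of synteny, sketched here in Python, is to walk along the orthologs in the order in which they occur in one genome and to count every neighbouring pair that is not also neighbouring in the other genome. The text does not state how the 36 breakpoints were computed, so this definition, the function name `count_breakpoints` and the toy input are illustrative assumptions; the sketch also treats both chromosomes as linear and ignores gene orientation.

```python
from typing import Sequence


def count_breakpoints(rank_in_b: Sequence[int]) -> int:
    """Count synteny breakpoints between two gene orders.

    rank_in_b[i] is the rank (0, 1, 2, ...) in genome B of the i-th
    ortholog encountered along genome A. Two orthologs that are adjacent
    in A form a breakpoint when their ranks in B differ by more than one.
    Assumption: linear chromosomes, unsigned genes.
    """
    return sum(
        1
        for left, right in zip(rank_in_b, rank_in_b[1:])
        if abs(right - left) != 1
    )


# Toy example (not real data): one inverted block of four genes.
toy_order = [0, 1, 5, 4, 3, 2, 6]
print(count_breakpoints(toy_order))  # a single inversion creates 2 breakpoints
```

With this definition a perfectly collinear pair of genomes gives zero, and each simple inversion or translocation adds a small, fixed number of breakpoints, which is why a low count is read as conserved chromosomal architecture.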

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1–77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
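The grouping of genes without an S. pyogenes ortholog into regions can be pictured as collecting runs of consecutive genes along the chromosome. The Python sketch below is only an illustration of that idea: the text does not say whether regions were defined as strictly uninterrupted runs, so that rule, the function name `regions_without_ortholog` and the gene labels are assumptions, and the circular nature of the chromosome is ignored.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def regions_without_ortholog(
    genes: Sequence[Tuple[str, bool]]
) -> List[List[str]]:
    """Group consecutive genes that lack an ortholog into regions.

    genes is a list of (gene_id, has_ortholog) pairs in chromosomal order.
    Assumption: a region is an uninterrupted run of genes without an
    ortholog; the first and last genes are not joined across the origin.
    """
    regions = []
    for has_ortholog, run in groupby(genes, key=lambda gene: gene[1]):
        if not has_ortholog:
            regions.append([gene_id for gene_id, _ in run])
    return regions


# Hypothetical gene labels, not real NEM316 annotations.
toy_genes = [
    ("g1", True), ("g2", False), ("g3", False),
    ("g4", True), ("g5", False),
]
for region in regions_without_ortholog(toy_genes):
    print(len(region), region)
```

Sorting the resulting regions by length would then separate the few large clusters from the many small ones, although in the text the distinction between islands and islets also rests on the presence of genes related to mobile elements.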

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824 (III); 711824–758891 (VII); 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase (2) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses.
b. Only relevant genes are indicated.

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best
hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon, as well as the scpB and lmb genes, have a higher G + C content than the chromosomal average (42, 41 and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein, as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
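The G + C argument above rests on a simple calculation: the percentage of G and C bases in a region compared with the same percentage for the whole chromosome. A minimal Python sketch of that comparison follows; the function names `gc_content` and `gc_excess` and the toy sequences are illustrative assumptions, not the tools or data used for the NEM316 analysis.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so N and other
    ambiguity codes do not dilute the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


def gc_excess(region: str, chromosome: str) -> float:
    """Difference, in percentage points, between a region and the chromosome."""
    return gc_content(region) - gc_content(chromosome)


# Toy sequences, not real NEM316 data.
toy_chromosome = "ATATATGCAT" * 10
toy_region = "GCGCATATAT"
print(gc_content(toy_chromosome))
print(gc_content(toy_region))
print(gc_excess(toy_region, toy_chromosome))
```

A clearly positive excess for a gene cluster is the kind of compositional signal that the text interprets as support for an exogenous origin, alongside the similarity and phylogeny arguments.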

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
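The first two length-based steps of this CDS selection can be written as a small decision rule. The Python sketch below is a hedged illustration: the text does not give the numerical cut-off for a "high coding probability", so the default `min_probability` of 0.9 is a hypothetical placeholder, as is the function name `retain_cds`; the final BLASTX search of intergenic regions is not modelled.

```python
def retain_cds(
    length_codons: int,
    coding_probability: float,
    min_probability: float = 0.9,  # hypothetical cut-off, not from the text
) -> bool:
    """Decide whether a predicted CDS is kept, following the two length tiers.

    - CDSs longer than 80 codons are retained.
    - CDSs of 40 to 80 codons are retained only when their coding
      probability is high (threshold is an assumption).
    - Shorter CDSs are left to the separate BLASTX step, so they are
      rejected here.
    """
    if length_codons > 80:
        return True
    if 40 <= length_codons <= 80:
        return coding_probability >= min_probability
    return False


# Hypothetical candidates: (length in codons, coding probability).
candidates = [(120, 0.40), (60, 0.95), (60, 0.50), (30, 0.99)]
kept = [c for c in candidates if retain_cds(*c)]
print(kept)
```

Keeping the probability threshold as a parameter makes it easy to see how sensitive the number of short genes is to that single, unstated choice.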

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v.2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v.2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v.2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons of the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
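The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the pipeline actually used: the `Hit` record, the choice of bit score to rank hits and the decision to apply the thresholds before picking the best hit are all assumptions, and the protein names are made up.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One pairwise protein hit (hypothetical record, e.g. parsed from BLASTP)."""
    query: str
    subject: str
    identity: float    # percent identity
    score: float       # bit score, used here to rank hits (assumption)
    query_len: int
    subject_len: int


def passes_thresholds(hit: Hit) -> bool:
    """At least 50% identity and a length ratio of 0.8 to 1.25."""
    ratio = hit.query_len / hit.subject_len
    return hit.identity >= 50.0 and 0.8 <= ratio <= 1.25


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for each query, the highest-scoring hit that passes the thresholds."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if not passes_thresholds(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best


def reciprocal_best_hits(
    forward: Iterable[Hit], reverse: Iterable[Hit]
) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd, rev = best_hits(forward), best_hits(reverse)
    pairs = []
    for query, hit in fwd.items():
        back = rev.get(hit.subject)
        if back is not None and back.subject == query:
            pairs.append((query, hit.subject))
    return sorted(pairs)
```

Applied once per comparison genome, with `forward` holding S. agalactiae proteins searched against the other proteome and `reverse` the opposite search, the returned pairs would correspond to the ortholog sets discussed in the comparative genomics section.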

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington, DC: American Society for Microbiology Press.

Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


genome sequence was made with those of these related pathogenic streptococci and with that of the non-pathogenic species Lactococcus lactis (Bolotin et al., 2001), providing clues to the evolution of S. agalactiae and the acquisition of virulence. Besides a better knowledge of the molecular mechanisms responsible for virulence, this work should contribute to the finding of new targets for antimicrobial compounds and for the development of a GBS vaccine.

Results and discussion

General features of the genome

The genome of S. agalactiae strain NEM316 consists of a circular chromosome of 2 211 485 base pairs (bp) (Fig. 1) (EMBL accession number AL732656). Its G + C content of 35.6% is significantly lower than that of the genomes of S. pyogenes (38.5%) (Ferretti et al., 2001) and of S. pneumoniae (39.7%) (Tettelin et al., 2001), whereas L. lactis, a more distantly related species, has an almost identical G + C content of 35.4% (Bolotin et al., 2001). Seven sets of 23S, 5S and 16S ribosomal RNA operons were identified, all of which are organized within a 450 kb region located on the right replichore of the chromosome next to the origin of replication (Fig. 1). For S. pneumoniae, only four rRNA operons were reported, which are all located within a 400 kb region, three on the left replichore and one on the right one (Tettelin et al., 2001; Hoskins et al., 2001). In contrast, S. pyogenes and L. lactis each contain six rRNA operons distributed on both replichores within 750 and 920 kb regions, respectively (Bolotin et al., 2001; Ferretti et al., 2001; Smoot et al., 2002). We identified 80 tRNA genes in NEM316, a number significantly higher than the 58 tRNA genes present in S. pneumoniae, the 60 in S. pyogenes and the 62 in L. lactis. The genes coding tRNAs recognize only 31 out of 61 possible sense codons. The redundancy of tRNA genes as compared to genome size is an interesting feature of the S. agalactiae genome, the biological consequence of which is not yet known.

Fig. 1. Circular genome map of S. agalactiae NEM316 showing the position and orientation of genes. From the outside: circle 1, protein coding genes on the + and − strands, genes in yellow belonging to the 14 islands numbered from I to XIV; circle 2, GC bias (G + C/G − C); circle 3, G + C content, with <28.8% G + C in yellow, between 28.8% and 42.5% G + C in orange and with >42.5% G + C in red; circle 4, stable RNA coding genes. The scale in kb is indicated on the outside of the genome, with the predicted origin of replication being at position 0.

In the genome of NEM316 we identified 2082 protein coding genes and 36 pseudogenes. Forty-nine genes are triplicated, corresponding to a region present in three copies in the genome. Putative functions could be assigned to 62% of the predicted proteins; 25% are similar to unknown proteins, 4% are similar to unknown proteins from other streptococci or lactococci, and 9% are unique to S. agalactiae. Protein coding genes were classified according to the functional categories described for Bacillus subtilis and Listeria monocytogenes. A strong bias in gene orientation was observed, as 81% of the coding sequences (CDS) are transcribed in the same orientation as the movement of the replication fork. This type of gene orientation bias appears to be a common feature of low G + C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue, encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).
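The GC-skew analysis mentioned above can be illustrated with a short sketch. It assumes the usual definition of skew, (G − C)/(G + C), computed in non-overlapping windows, and the common convention that the cumulative skew tends to reach its minimum near the origin and its maximum near the terminus. The function names and window handling are hypothetical, and this is not necessarily the procedure used for NEM316, where gene orientation and the ripX homologue were also taken into account.

```python
def gc_skew_windows(seq: str, window: int) -> list[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_extrema(skews: list[float]) -> tuple[int, int]:
    """Return the window indices of the cumulative-skew minimum and maximum."""
    total = 0.0
    cumulative = []
    for value in skews:
        total += value
        cumulative.append(total)
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return i_min, i_max


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGA" * 5 + "CCCA" * 5
    print(cumulative_extrema(gc_skew_windows(toy, window=4)))
```

On real data the extrema only delimit candidate regions; independent evidence, such as the positions of dnaA and of genes involved in chromosome dimer resolution, is normally needed to confirm them.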

Bacterial–host interactions and virulence factors

S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis, previously identified (Chaffin et al., 2000), was present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.

Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21; IPXTG = 4; LPXTS = 2; LPXTN = 2; FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% similarity, respectively). In particular, these enzymes possess at the expected positions the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.
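The five sorting-signal variants listed above lend themselves to a simple pattern search. The sketch below is only an illustration of the idea: it looks for one of those motifs near the C-terminus of a protein sequence. The 50-residue search window, the function name and the example sequences are assumptions, and a real prediction would also check for the hydrophobic stretch and charged tail that follow the motif.

```python
import re

# The five motif variants reported in the text (X = any residue).
SORTING_MOTIF = re.compile(r"LP.T[GSN]|IP.TG|FP.TG")


def find_sorting_motif(protein: str, tail: int = 50):
    """Return (position, motif) of the motif closest to the C-terminus, or None.

    Only the last `tail` residues are searched; this window size is a
    hypothetical choice for the sketch, not a value taken from the paper.
    """
    region = protein[-tail:]
    offset = len(protein) - len(region)
    hits = [(offset + m.start(), m.group()) for m in SORTING_MOTIF.finditer(region)]
    return hits[-1] if hits else None


if __name__ == "__main__":
    # Invented sequence with an LPKTG motif followed by a short tail.
    toy = "M" + "A" * 100 + "LPKTG" + "E" * 5 + "V" * 20 + "KRRK"
    print(find_sorting_motif(toy))
```

Restricting the search to the C-terminal region reflects the fact that the sorting signal is a C-terminal feature; a motif occurring by chance elsewhere in a long protein would otherwise be reported as a hit.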

Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2 but encodes neither Rib nor β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 of NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
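Repeat counts such as those given for gbs1529 can be obtained by scanning a protein sequence for the longest uninterrupted run of a set of equal-length motifs. The following is a minimal sketch of that idea; the function name and the toy sequence are hypothetical, and the real gbs1529 sequence is not reproduced here.

```python
from collections import Counter


def longest_tandem_run(protein: str, motifs):
    """Return (start, Counter) for the longest contiguous run of the given motifs.

    All motifs are assumed to have the same length. The run may mix motifs,
    as in the STSA/SMSA repeats described in the text.
    """
    motifs = set(motifs)
    if not motifs:
        raise ValueError("at least one motif is required")
    unit = len(next(iter(motifs)))
    best_start, best_units = 0, []
    for start in range(len(protein)):
        pos, units = start, []
        # Extend the run one repeat unit at a time.
        while protein[pos:pos + unit] in motifs:
            units.append(protein[pos:pos + unit])
            pos += unit
        if len(units) > len(best_units):
            best_start, best_units = start, units
    return best_start, Counter(best_units)


if __name__ == "__main__":
    # Invented sequence: six contiguous 4-residue repeats after a short leader.
    toy = "MKT" + "STSA" * 3 + "SMSA" + "STSA" * 2 + "GG"
    print(longest_tandem_run(toy, {"STSA", "SMSA"}))
```

As a consistency check on the figures in the text, 150 repeats of a four-residue motif span 600 residues, which fits within the C-terminal half of a 1310-residue protein, and 16 copies of the 17-residue motif of gbs1087 span 272 of its 410 residues.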

The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), as does Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases encoded by two pairs of adjacent genes whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.

Another strategy for retention of proteins to bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteyl-containing lipobox sequence which is successively the substrate for two key enzymes, the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.

Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.

| Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b) |
|---|---|---|---|---|---|
| gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein |
| gbs0392 (c) | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown |
| gbs0393 (c) | 933 | LPXTG | SpaA (S. sobrinus); Pas (S. intermedius) | 37 (52)/405; 36 (52)/399 | Unknown; Unknown |
| gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown |
| gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae); R28 (S. pyogenes) | 74 (77)/798; 69 (75)/1103 | Unknown |
| gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown |
| gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin |
| gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown |
| gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown |
| gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown |
| gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein |
| gbs1288 | 1252 | LPXTG | PulA (S. pyogenes); PulA (S. pneumoniae) | 65 (79)/1095; 49 (65)/1305 | Alkaline amylopullulanase |
| gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii); Pas (S. intermedius) | 30 (43)/1385; 31 (45)/1285 | Sialic acid binding protein |
| gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans); CbpD (S. pneumoniae) | 50 (62)/183; 30 (60)/220 | Unknown; Choline binding protein |
| gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown |
| gbs1529 | 1310 | LPXTG | Hsa (S. gordonii); SrpA (S. cristatus) | 50 (60)/1314; 43 (53)/1248 | Sialic acid binding protein |
| gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown |
| gbs1540 | 680 | LPXTG | AmiC (S. pyogenes); YbgE (L. lactis) | 36 (54)/478; 35 (54)/492 | Amidase |
| gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae); YfkN (B. subtilis) | 57 (70)/694; 47 (66)/630 | Cyclo-nucleotide phosphodiesterase |
| gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase |
| gbs2018 | 643 | LPXTG | M-like protein (S. equi); PspC (S. pneumoniae) | 31 (46)/302; 23 (38)/795 | Unknown; Adhesin |
| gbs1478 | 901 | IPXTG | PFBP (S. pyogenes); Cell surface protein (S. pneumoniae) | 32 (46)/176; 40 (55)/901 | Fibronectin-binding protein; Unknown |
| gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii); Cell surface protein (S. pneumoniae) | 23 (35)/554; 27 (42)/512 | Unknown |
| gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown |
| gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown |
| gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease |
| gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes); BspA (Bacteroides forsythus) | 72 (81)/1050; 24 (41)/566 | Unknown; Unknown |
| gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase |
| gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5'-nucleotidase |
| gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown |

a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refers.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.

Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis, cylXDG, acpC, cylZABEFHIJK, including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).

Lifestyle of S. agalactiae as revealed by metabolism

Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for a NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.

Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization, not in bypassing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.

Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
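The amino acid bookkeeping in the passage above amounts to a set difference: the 20 standard amino acids minus those for which a biosynthetic pathway was found. The sketch below makes this explicit; the variable and function names are hypothetical, and the result is a genome-based prediction only, which, as the text notes for proline, need not match the observed growth requirements.

```python
STANDARD_AMINO_ACIDS = {
    "alanine", "arginine", "asparagine", "aspartate", "cysteine",
    "glutamate", "glutamine", "glycine", "histidine", "isoleucine",
    "leucine", "lysine", "methionine", "phenylalanine", "proline",
    "serine", "threonine", "tryptophan", "tyrosine", "valine",
}

# Pathways reported as present in the NEM316 genome (from the text).
PATHWAYS_PRESENT = {
    "alanine", "serine", "glycine", "glutamine",
    "aspartate", "asparagine", "threonine",
}


def predicted_auxotrophies(pathways_present):
    """Return amino acids with no predicted biosynthetic pathway, sorted by name."""
    return sorted(STANDARD_AMINO_ACIDS - set(pathways_present))


if __name__ == "__main__":
    missing = predicted_auxotrophies(PATHWAYS_PRESENT)
    print(len(missing), missing)
```

The list includes proline, which illustrates the limit of this kind of inference: the genome predicts a requirement that the growth experiments cited in the text do not confirm.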

The NEM316 genome not only codes for a high number of transporters related to amino acid import, but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called ‘oxidative burst’ (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.


The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to the pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 16 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740; pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
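The notion of imperfect inverted repeats flanking an element can be made concrete with a small sketch. It is illustrative only: the function names and the toy sequences are hypothetical, and nothing here reproduces the actual att-L and att-R sequences of pNEM316-1 or the method used to find them. The idea is simply that one terminus should match the reverse complement of the other, up to a few mismatches.

```python
# Minimal sketch (hypothetical helper names, toy sequences).
# Two termini form an inverted repeat when one reads as the reverse
# complement of the other; "imperfect" repeats tolerate some mismatches.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(left: str, right: str) -> int:
    """Count mismatches between the left terminus and the reverse
    complement of the right terminus (0 means a perfect inverted repeat)."""
    if len(left) != len(right):
        raise ValueError("termini must have equal length")
    rc_right = reverse_complement(right).upper()
    return sum(a != b for a, b in zip(left.upper(), rc_right))


if __name__ == "__main__":
    print(inverted_repeat_mismatches("GATTACAG", "CTGTAATC"))
    print(inverted_repeat_mismatches("GATTACAG", "CTGTAATG"))
```

A mismatch count that is small relative to the repeat length would correspond to the "imperfect" repeats mentioned in the Fig. 2 legend; the acceptable number of mismatches is a choice left to the analyst.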

Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 355 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated in its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). On the contrary, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
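How a breakpoint of synteny might be counted can be sketched as follows. The text does not state the exact definition used, so this is an assumption: orthologs are ordered along the first genome, ranked along the second, and a breakpoint is counted wherever two orthologs adjacent in the first genome are not adjacent in the second. The function name is hypothetical, and chromosome circularity is ignored for simplicity.

```python
# Minimal sketch of breakpoint counting between two gene orders.
# Assumed definition (not taken from the paper): neighbours in genome A
# whose ranks in genome B do not differ by exactly 1 mark a breakpoint.

def count_breakpoints(ortholog_pairs):
    """ortholog_pairs: iterable of (position_in_a, position_in_b) tuples,
    one tuple per ortholog pair; positions may be coordinates or ranks."""
    ordered = sorted(ortholog_pairs)                  # order along genome A
    b_sorted = sorted(b for _, b in ordered)          # order along genome B
    b_rank = {b: i for i, b in enumerate(b_sorted)}   # coordinate -> rank in B
    ranks = [b_rank[b] for _, b in ordered]
    # An inversion keeps internal adjacencies (rank difference of -1),
    # so only its two ends are counted.
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(y - x) != 1)


if __name__ == "__main__":
    collinear = [(10, 5), (20, 100), (30, 200), (40, 300)]
    one_inversion = [(10, 5), (20, 300), (30, 200), (40, 100), (50, 400)]
    print(count_breakpoints(collinear))
    print(count_breakpoints(one_inversion))
```

Under this definition a single inversion contributes two breakpoints, one at each end, which is why a small number of rearrangements can account for a few dozen breakpoints.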

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
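The grouping of genes lacking an ortholog into regions can be illustrated with a short sketch. It assumes, for illustration only, that a region is a strictly uninterrupted run of such genes in chromosomal order; the paper does not spell out its rule, and the function name is hypothetical. Deciding whether a region is an island or an islet additionally depends on its mobile-element content, which this sketch does not attempt.

```python
# Minimal sketch: collapse a chromosome-ordered list of genes into runs of
# consecutive genes that lack an ortholog (assumed region definition).
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """has_ortholog: list of booleans in chromosomal gene order.
    Returns a list of (start_index, number_of_genes) for every run of
    consecutive genes without an ortholog."""
    regions = []
    index = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            regions.append((index, length))
        index += length
    return regions


if __name__ == "__main__":
    toy = [True, False, False, True, False, True, True, False, False, False]
    for start, size in regions_without_ortholog(toy):
        print(f"region starting at gene {start}: {size} gene(s)")
```

Applied to a real annotation, the sizes returned by such a function would give the distribution of region lengths from which large islands and small islets could then be separated by inspection.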

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
|---|---|---|---|---|---|---|
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824 (III), 711824–758891 (VII), 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated; pseudogenes …

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
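The G + C comparison used above as evidence of exogenous origin rests on a simple calculation, sketched here. The function name and the toy sequences are hypothetical; ambiguous bases such as N are excluded from the denominator, which is one common convention and not necessarily the one used by the authors.

```python
# Minimal sketch: G + C percentage of a DNA sequence.
# Ambiguous bases (anything other than A, C, G, T) are ignored.

def gc_percent(seq: str) -> float:
    """Return the G + C content of seq as a percentage of its A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    print(gc_percent("GGCCAATT"))
    print(gc_percent("ATGCNN"))
```

In practice, a value computed for a gene or operon would be compared with the same quantity computed over the whole chromosome; a marked deviation is suggestive, though not proof, of horizontal acquisition.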

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes) which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
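As a rough check on the coverage figure quoted above, sequence coverage can be estimated as the number of reads multiplied by the mean read length and divided by the genome length. The read count (34 431) and the genome length (2 211 485 bp) come from this article; the mean read length is not given, so the 550 bp used below is purely an assumption for illustration, as is the function name.

```python
# Minimal sketch: expected fold coverage of a shotgun project.
# coverage = (number of reads * mean read length) / genome length

def fold_coverage(n_reads: int, mean_read_length_bp: float, genome_length_bp: int) -> float:
    """Return the average number of times each base is expected to be sequenced."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return n_reads * mean_read_length_bp / genome_length_bp


if __name__ == "__main__":
    # 550 bp is a hypothetical mean read length, not a value from the paper.
    coverage = fold_coverage(34_431, 550, 2_211_485)
    print(f"{coverage:.2f}-fold")
```

The estimate scales linearly with the assumed read length, so it should be read as an order-of-magnitude consistency check and not as a reconstruction of the authors' calculation.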

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1), or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
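The two-tier length rule described above can be written down as a small decision function. This is a sketch under stated assumptions: the text does not say what counts as a "high" coding probability, so the 0.9 cut-off is hypothetical, as are the function and parameter names. The final BLASTX search of intergenic regions is not modelled.

```python
# Minimal sketch of the CDS retention rule:
#   - CDSs longer than 80 codons are retained;
#   - CDSs of 40 to 80 codons are retained only with a high coding probability;
#   - anything shorter is dropped at this stage.
# The probability threshold is an assumption, not a value from the paper.

def retain_cds(length_codons: int,
               coding_probability: float,
               long_cutoff: int = 80,
               short_cutoff: int = 40,
               high_probability: float = 0.9) -> bool:
    """Decide whether a predicted CDS passes the length/probability filter."""
    if length_codons > long_cutoff:
        return True
    if short_cutoff <= length_codons <= long_cutoff:
        return coding_probability >= high_probability
    return False


if __name__ == "__main__":
    for length, prob in [(120, 0.10), (60, 0.95), (60, 0.50), (30, 0.99)]:
        print(length, prob, retain_cds(length, prob))
```

Keeping the thresholds as parameters makes it easy to see how sensitive the final gene count would be to the chosen cut-offs.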

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d’Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
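The ortholog definition given above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. The input layout, the function name and the gene identifiers are hypothetical; the best hits themselves would come from BLASTP runs in both directions, which are not reproduced here.

```python
# Minimal sketch of ortholog assignment by bi-directional best hits.
# Inputs (all hypothetical in layout):
#   best_a_to_b: best hit in genome B for each protein of genome A
#   best_b_to_a: best hit in genome A for each protein of genome B
#   identity:    percent identity for each (a, b) pair
#   length_a, length_b: protein lengths in amino acids

def find_orthologs(best_a_to_b, best_b_to_a, identity, length_a, length_b,
                   min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return sorted (a, b) pairs that are reciprocal best hits and pass
    the identity and length-ratio thresholds."""
    pairs = []
    for a, b in best_a_to_b.items():
        if best_b_to_a.get(b) != a:
            continue  # not reciprocal
        ratio = length_a[a] / length_b[b]
        if identity[(a, b)] >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    best_a_to_b = {"agal_1": "pyog_1", "agal_2": "pyog_2",
                   "agal_3": "pyog_2", "agal_4": "pyog_4"}
    best_b_to_a = {"pyog_1": "agal_1", "pyog_2": "agal_2", "pyog_4": "agal_4"}
    identity = {("agal_1", "pyog_1"): 72.0,
                ("agal_2", "pyog_2"): 41.0,
                ("agal_4", "pyog_4"): 65.0}
    length_a = {"agal_1": 300, "agal_2": 200, "agal_3": 210, "agal_4": 500}
    length_b = {"pyog_1": 310, "pyog_2": 205, "pyog_4": 300}
    print(find_orthologs(best_a_to_b, best_b_to_a, identity, length_a, length_b))
```

Because 1.25 is the reciprocal of 0.8, the length-ratio window is symmetric, so it does not matter which of the two proteins is taken as the numerator.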

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul SF Madden TL Schaffer AA Zhang JZhang Z Miller W and Lipman DJ (1997) GappedBLAST and PSI-BLAST a new generation of protein data-base search programs Nucleic Acids Res 25 3389ndash3402

Argos P Landy A Abremski K Egan JB Haggard-Ljungquist E Hoess RH et al (1986) The integrasefamily of site-specific recombinases regional similaritiesand global diversity EMBO J 5 433ndash440

drsquoAubenton Carafa Y Brody E and Thermes C (1990)Prediction of rho-independent Escherichia coli transcrip-tion terminators A statistical analysis of their RNA stem-loop structures J Mol Biol 216 835ndash858

Genome sequence of Streptococcus agalactiae 1511

copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513

Becker ID Robinson OM Bazan TS Lopez-Osuna Mand Kretschmer RR (1981) Bactericidal capacity of new-born phagocytes againts group B beta-hemolytic strepto-cocci Infect Immun 34 535ndash539

Beckert S Kreikemeyer B and Podbielski A (2001)Group A streptococcal rofA gene is involved in the controlof several virulence genes and eukaryotic cell attachmentand internalization Infect Immun 69 534ndash537

Bierne H Mazmanian SK Trost M Pucciarelli MG LiuG Dehoux P et al (2002) Inactivation of the srtA genein Listeria monocytogenes inhibits anchoring of surfaceproteins and affects virulence Mol Microbiol 43 869ndash881

Blumberg HM Stephens DS Modansky M Erwin MElliot J Facklam RR et al (1996) Invasive group Bstreptococcal disease the emergence of serotype V JInfect Dis 173 365ndash373

Bolken TC Franke CA Jones KF Zeller GO JonesCH Dutton EK and Hruby DE (2001) Inactivation ofthe srtA gene in Streptococcus gordonii inhibits cell wallanchoring of surface proteins and decreases in vitro andin vivo adhesion Infect Immun 69 75ndash80

Bolotin A Wincker P Mauger S Jaillon O Malarme KWeissenbach J Ehrlich SD and Sorokin A (2001) Thecomplete genome sequence of the lactic acid bacteriumLactococcus lactis ssp lactis IL1403 Genome Res 11731ndash753

Brunskill EW and Bayles KW (1996) Identification andmolecular characterization of a putative regulatory locusthat affects autolysis in Staphylococcus aureus J Bacteriol178 611ndash618

Chaffin DO Beres SB Yim HH and Rubens CE(2000) The serotype of type Ia and III group B streptococciis determined by the polymerase gene within the polycis-tronic capsule operon J Bacteriol 182 4466ndash4477

Chaussee MS Sylva GL Sturdevant DE Smoot LMGraham MR Watson RO and Musser JM (2002)Rgg Influences the Expression of Multiple Regulatory LociTo Coregulate Virulence Factor Expression in Streptococ-cus pyogenes Infect Immun 70 762ndash770

Chhatwal GS (2002) Anchorless adhesins and invasins ofGram-positive bacteria a new class of virulence factorsTrends Microbiol 10 205ndash208

Chmouryguina I Suvorov A Ferrieri P and Cleary PP(1996) Conservation of the C5a peptidase genes in groupA and B streptococci Infect Immun 64 2387ndash2390

Claros MG and von Heijne G (1994) TopPred II animproved software for membrane protein structure predic-tions Comput Appl Biosci 10 685ndash686

DrsquoMello R Hill S and Poole RK (1996) The cytochromebd quinol oxidase in Escherichia coli has an extremely highoxygen affinity and two oxygen-binding haems implica-tions for regulation of activity in vivo by oxygen inhibitionMicrobiology 142 755ndash763

Edwards MS and Baker CJ (1995) Streptococcus aga-lactiae (Group B Streptococcus) In Mandell Douglas andBennettrsquos Principles and Practice of Infectious Diseases4th edn Mandell GL Bennett JE and Dolin R (eds)New York Churchill Livingstone pp 1835ndash1845

Ewing B and Green P (1998) Base-calling of automatedsequencer traces using phred II Error probabilitiesGenome Res 8 186ndash194

Farley MM (2001) Group B streptococcal disease in non-pregnant adults Clin Infect Dis 33 556ndash561

Ferretti JJ McShan WM Ajdic D Savic DJ Savic GLyon K et al (2001) Complete genome sequence of anM1 strain of Streptococcus pyogenes Proc Natl Acad SciUSA 98 4658ndash4663

Fleischmann RD Adams MD White O Clayton RAKirkness EF Kerlavage JM et al (1995) Whole-genome random sequencing and assembly of Haemophi-lus influenzae Rd Science 269 496ndash512

Frangeul L Nelson KE Buchrieser C Danchin AGlaser P and K (1999) Cloning and assembly strategiesin microbial genome projects Microbiology 145 2625ndash2634

Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and Lmb. Mol Microbiol 41: 925–935.

Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington, DC: American Society for Microbiology Press.

Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.

Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones. J Bacteriol 174: 1081–1085.

Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


orientation bias appears to be a common feature of low G + C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue, encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).
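To make the GC-skew argument concrete, the following is a minimal sketch of how a windowed skew profile could be computed and used to suggest approximate positions for the origin and terminus. It is not the procedure used for NEM316: the function names and the window size are hypothetical, and the sketch assumes the common convention that the leading strand carries an excess of G over C, so that the cumulative skew is lowest near the origin and highest near the terminus.

```python
from itertools import accumulate


def gc_skew(seq, window):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C contribute a neutral skew of 0.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def predict_origin_and_terminus(seq, window):
    """Locate the extremes of the cumulative GC skew.

    Returns approximate (origin, terminus) coordinates in bases, taken as
    the end of the window where the cumulative skew is lowest (origin)
    and highest (terminus). Assumes a G-rich leading strand.
    """
    cumulative = list(accumulate(gc_skew(seq, window)))
    origin_idx = min(range(len(cumulative)), key=cumulative.__getitem__)
    terminus_idx = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (origin_idx + 1) * window, (terminus_idx + 1) * window


# Toy sequence: C-rich, then G-rich, then C-rich again.
toy = "C" * 20 + "G" * 40 + "C" * 20
print(predict_origin_and_terminus(toy, window=10))
```

On the toy sequence the call returns (20, 60), the two points where the skew changes sign. On a real chromosome such a prediction would only be a starting point, to be combined, as in the text, with CDS orientation and marker genes such as dnaA and the ripX homologue.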

Bacterial–host interactions and virulence factors

S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis previously identified (Chaffin et al., 2000) was present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.

Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21, IPXTG = 4, LPXTS = 2, LPXTN = 2, FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% similarity respectively). In particular, these enzymes possess, at the expected positions, the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.

Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2, but does not encode Rib nor β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 of NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
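The tally of cell wall sorting signal variants given above (LPXTG, IPXTG, LPXTS, LPXTN, FPXTG) can be illustrated with a short sketch. It is not the prediction method used in this study: a genuine sorting signal also comprises a hydrophobic domain and a charged tail downstream of the motif, which this sketch ignores. The function names and the 50-residue C-terminal search window are hypothetical choices made only for illustration.

```python
import re
from collections import Counter

# Motif variants listed in the text; X stands for any residue.
KNOWN_VARIANTS = {"LPXTG", "IPXTG", "LPXTS", "LPXTN", "FPXTG"}

# Lookahead so that overlapping candidate pentapeptides are all reported.
CANDIDATE = re.compile(r"(?=([LIF]P.T[GSN]))")


def sorting_motif(protein, tail_length=50):
    """Return the motif variant closest to the C-terminus, or None.

    Only the last `tail_length` residues are searched (an assumed
    cut-off), because the sorting signal is C-terminal.
    """
    tail = protein.upper()[-tail_length:]
    hits = [match.group(1) for match in CANDIDATE.finditer(tail)]
    for hit in reversed(hits):
        variant = hit[:2] + "X" + hit[3:]
        if variant in KNOWN_VARIANTS:
            return variant
    return None


def tally_variants(proteins):
    """Count motif variants over an iterable of protein sequences."""
    found = (sorting_motif(p) for p in proteins)
    return Counter(v for v in found if v is not None)


# Invented toy sequences, not NEM316 proteins.
toy_proteins = [
    "M" + "A" * 100 + "LPKTG" + "A" * 30,  # motif inside the C-terminal tail
    "M" + "A" * 60 + "FPKTG" + "A" * 20,   # FPXTG variant
    "LPKTG" + "A" * 100,                   # motif too far from the C-terminus
]
print(tally_variants(toy_proteins))
```

Restricting the search to the C-terminal tail avoids counting internal pentapeptides that merely resemble the motif; a fuller predictor would additionally test for the hydrophobic segment and charged tail.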

The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), as does Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase, as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases, encoded by two pairs of adjacent genes, whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.

Another strategy for retention of proteins to bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively substrate for two key enzymes: the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.

Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.

Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b)
gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein
gbs0392 (c) | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown
gbs0393 (c) | 933 | LPXTG | SpaA (S. sobrinus) | 37 (52)/405 | Unknown
 | | | Pas (S. intermedius) | 36 (52)/399 | Unknown
gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown
gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae) | 74 (77)/798 | Unknown
 | | | R28 (S. pyogenes) | 69 (75)/1103 |
gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown
gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin
gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown
gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown
gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown
gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein
gbs1288 | 1252 | LPXTG | PulA (S. pyogenes) | 65 (79)/1095 | Alkaline amylopullulanase
 | | | PulA (S. pneumoniae) | 49 (65)/1305 |
gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii) | 30 (43)/1385 | Sialic acid binding protein
 | | | Pas (S. intermedius) | 31 (45)/1285 |
gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans) | 50 (62)/183 | Unknown
 | | | CbpD (S. pneumoniae) | 30 (60)/220 | Choline binding protein
gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown
gbs1529 | 1310 | LPXTG | Hsa (S. gordonii) | 50 (60)/1314 | Sialic acid binding protein
 | | | SrpA (S. cristatus) | 43 (53)/1248 |
gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown
gbs1540 | 680 | LPXTG | AmiC (S. pyogenes) | 36 (54)/478 | Amidase
 | | | YbgE (L. lactis) | 35 (54)/492 |
gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae) | 57 (70)/694 | Cyclo-nucleotide phosphodiesterase
 | | | YfkN (B. subtilis) | 47 (66)/630 |
gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase
gbs2018 | 643 | LPXTG | M-like protein (S. equi) | 31 (46)/302 | Unknown
 | | | PspC (S. pneumoniae) | 23 (38)/795 | Adhesin
gbs1478 | 901 | IPXTG | PFBP (S. pyogenes) | 32 (46)/176 | Fibronectin-binding protein
 | | | Cell surface protein (S. pneumoniae) | 40 (55)/901 | Unknown
gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii) | 23 (35)/554 | Unknown
 | | | Cell surface protein (S. pneumoniae) | 27 (42)/512 |
gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown
gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown
gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease
gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes) | 72 (81)/1050 | Unknown
 | | | BspA (Bacteroides forsythus) | 24 (41)/566 | Unknown
gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase
gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5′-nucleotidase
gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown

a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refer.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.

Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).

Lifestyle of S. agalactiae as revealed by metabolism

Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for a NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products, such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.

Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is only involved in pentose and gluconate utilization, not in by-passing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.

Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).

The NEM316 genome not only codes for a high number of transporters related to amino acid import, but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF-subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.

The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in a S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified and, among these, only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
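The inverted repeats flanking an element of this kind can be checked computationally: one extremity should match the reverse complement of the other, up to a few mismatches when the repeats are imperfect. The following is a minimal sketch in Python. The function names and the toy sequences are hypothetical; they are not the att-L and att-R sequences of pNEM316-1, which are given only in Fig. 2.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(left_end: str, right_end: str) -> int:
    """Count mismatches between the left end of an element and the
    reverse complement of its right end. Zero means a perfect inverted
    repeat; small values indicate an imperfect one."""
    if len(left_end) != len(right_end):
        raise ValueError("both ends must have the same length")
    rc_right = reverse_complement(right_end)
    return sum(1 for a, b in zip(left_end.upper(), rc_right) if a != b)


# Toy example (illustrative sequences only).
perfect = inverted_repeat_mismatches("AAGCT", "AGCTT")
imperfect = inverted_repeat_mismatches("AAGCT", "AGCAT")
print(perfect, imperfect)
```

Ungapped comparison is enough for short terminal repeats; repeats containing insertions or deletions would need a proper alignment instead.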

Two orthologs of the putative intpNEM316-1 gene (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated in its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae, respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). On the contrary, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
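One simple way to count breakpoints of synteny, assuming each ortholog pair is reduced to its position along the two chromosomes, is to walk the orthologs in the order of the first genome and count every step at which the two genes are not neighbours in the second genome. The Python sketch below illustrates this definition. It is not necessarily the procedure used to obtain the 36 breakpoints reported here, it ignores chromosome circularity, and the function name and toy data are hypothetical.

```python
def count_synteny_breakpoints(pairs):
    """Count breakpoints of synteny between two genomes.

    pairs: list of (pos_a, pos_b) tuples, one per ortholog pair, giving
    the position of the gene in genome A and in genome B.
    A breakpoint is counted whenever two orthologs that are consecutive
    in genome A are not consecutive (in either orientation) in genome B.
    """
    indices = range(len(pairs))
    # Rank of each ortholog along genome B.
    order_b = sorted(indices, key=lambda i: pairs[i][1])
    rank_b = {idx: rank for rank, idx in enumerate(order_b)}
    # Orthologs in genome A order.
    order_a = sorted(indices, key=lambda i: pairs[i][0])

    breakpoints = 0
    for prev, cur in zip(order_a, order_a[1:]):
        if abs(rank_b[cur] - rank_b[prev]) != 1:
            breakpoints += 1
    return breakpoints


# Toy data: six orthologs, with the middle block of three inverted in genome B.
colinear = [(1, 10), (2, 20), (3, 30), (4, 40)]
inverted_block = [(1, 1), (2, 2), (3, 5), (4, 4), (5, 3), (6, 6)]
print(count_synteny_breakpoints(colinear))
print(count_synteny_breakpoints(inverted_block))
```

Under this definition a single inversion contributes two breakpoints, one at each end of the inverted block, while gene order inside the block is still counted as conserved.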

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements, except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
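As a consistency check, the size of an island can be recomputed from its chromosomal co-ordinates. The sketch below does this in Python for four of the islands; the co-ordinates and sizes are as read from Table 2, and the helper name is hypothetical.

```python
# Island: (start, end, size in kb), as read from Table 2.
ISLANDS = {
    "I": (232896, 251357, 18.5),
    "IV": (489525, 508303, 18.8),
    "VI": (639392, 697130, 57.7),
    "XI": (1253733, 1261229, 7.5),
}


def island_size_kb(start: int, end: int) -> float:
    """Size in kb of a region given inclusive start and end co-ordinates."""
    return (end - start + 1) / 1000


for name, (start, end, reported_kb) in ISLANDS.items():
    computed = round(island_size_kb(start, end), 1)
    print(f"island {name}: computed {computed} kb, reported {reported_kb} kb")
```

For these four islands the recomputed value agrees with the tabulated size to one decimal place.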

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
|---|---|---|---|---|---|---|
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events and, therefore, it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein, as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
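The G + C comparison used above as an indicator of exogenous origin rests on a simple base count. A minimal sketch in Python, assuming the region of interest is available as a plain string of bases; the function name and the toy sequences are illustrative and are not data from NEM316.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence.

    Ambiguous symbols such as N are ignored, so that poorly resolved
    positions do not bias the estimate.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequences (illustrative only).
print(gc_percent("ATGC"))
print(round(gc_percent("GGCCAT"), 1))
```

In practice the value for a candidate region would be compared with the value computed over the whole chromosome, and a marked deviation taken as one line of evidence, among others, for horizontal acquisition.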

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae, ComA, ComB and ComCDE, we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
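The figures in this paragraph can be related through the usual definition of sequence coverage, coverage = (number of reads × mean read length) / genome size. The Python sketch below rearranges this to estimate the mean usable read length implied by the assembly. It takes the reported coverage as 8.5-fold and the genome size of 2 211 485 bp from the Summary; the mean read length itself is not stated in the text, so the result is only an inferred order of magnitude, and the function name is hypothetical.

```python
GENOME_SIZE_BP = 2_211_485   # genome length given in the Summary
N_READS = 34_431             # sequences entering the initial assembly
COVERAGE = 8.5               # reported sequence coverage (fold)


def implied_mean_read_length(genome_size: int, n_reads: int, coverage: float) -> float:
    """Mean read length (bp) implied by coverage = n_reads * length / genome_size."""
    return coverage * genome_size / n_reads


mean_len = implied_mean_read_length(GENOME_SIZE_BP, N_READS, COVERAGE)
print(round(mean_len))  # about 546 bp per read
```

A value of a few hundred bases per read is in the range expected for capillary sequencing of that period, which suggests the three reported numbers are mutually consistent.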

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
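The first two steps of the length-based CDS selection described above can be written as a small filter. The sketch below is a minimal illustration in Python, not the CAAT-Box implementation. The text specifies the 80 and 40 codon limits but gives no numerical cut-off for a 'high coding probability', so the `prob_cutoff` value of 0.9 is a placeholder assumption, as are the class and function names. The final BLASTX search of intergenic regions is not modelled.

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    name: str
    codons: int                 # length of the candidate CDS in codons
    coding_probability: float   # gene-finder coding probability, 0 to 1


def retain_cds(cds: CandidateCDS,
               long_limit: int = 80,
               short_limit: int = 40,
               prob_cutoff: float = 0.9) -> bool:
    """Apply the two length-based retention rules.

    Step 1: keep every CDS longer than `long_limit` codons.
    Step 2: keep a CDS of `short_limit` to `long_limit` codons only if
            its coding probability reaches `prob_cutoff` (assumed value).
    Anything shorter is left to the later similarity-based search.
    """
    if cds.codons > long_limit:
        return True
    if short_limit <= cds.codons <= long_limit:
        return cds.coding_probability >= prob_cutoff
    return False


# Toy candidates (hypothetical names and values).
candidates = [
    CandidateCDS("orfA", 350, 0.40),
    CandidateCDS("orfB", 60, 0.95),
    CandidateCDS("orfC", 60, 0.30),
    CandidateCDS("orfD", 25, 0.99),
]
kept = [c.name for c in candidates if retain_cds(c)]
print(kept)
```

Separating the thresholds from the rule makes it easy to see how the predicted gene count would shift if a different probability cut-off were chosen for short CDSs.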

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0), but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best-hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
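The ortholog definition above, bi-directional best hits with a minimum of 50% identity and a protein length ratio between 0.8 and 1.25, can be sketched as follows. This is an illustration in Python under stated assumptions, not the pipeline actually used: the best hit is taken here as the one with the highest bit score, the thresholds are applied to the forward hit, the length ratio is computed as query length over subject length, and the `Hit` record and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float    # percent identity of the alignment
    query_len: int     # protein length of the query (aa)
    subject_len: int   # protein length of the subject (aa)
    bitscore: float


def best_hits(hits):
    """Keep, for each query, the hit with the highest bit score (assumed criterion)."""
    best = {}
    for h in hits:
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


def passes_thresholds(h, min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Apply the identity and protein length ratio thresholds to one hit."""
    ratio = h.query_len / h.subject_len
    return h.identity >= min_identity and min_ratio <= ratio <= max_ratio


def orthologs(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are reciprocal best hits
    and whose forward hit satisfies the thresholds."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for gene_a, hit in best_ab.items():
        back = best_ba.get(hit.subject)
        if back is not None and back.subject == gene_a and passes_thresholds(hit):
            pairs.append((gene_a, hit.subject))
    return sorted(pairs)


# Toy search results (hypothetical gene names and values).
a_vs_b = [
    Hit("a1", "b1", 80.0, 300, 310, 500.0),
    Hit("a1", "b2", 60.0, 300, 300, 200.0),
    Hit("a2", "b2", 40.0, 200, 200, 100.0),   # reciprocal, but identity too low
    Hit("a3", "b3", 90.0, 100, 200, 150.0),   # reciprocal, but length ratio 0.5
]
b_vs_a = [
    Hit("b1", "a1", 80.0, 310, 300, 500.0),
    Hit("b2", "a2", 40.0, 200, 200, 100.0),
    Hit("b3", "a3", 90.0, 200, 100, 150.0),
]
print(orthologs(a_vs_b, b_vs_a))
```

The length ratio criterion is what excludes truncated genes and domain-level matches, such as the a3/b3 pair in the toy data, which would otherwise pass as reciprocal best hits on identity alone.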

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg Influences the Expression of Multiple Regulatory Loci To Coregulate Virulence Factor Expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The Sortase SrtA of Listeria monocytogenes Is Involved in Processing of Internalin and in Virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), like Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases encoded by two pairs of adjacent genes whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.
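As an illustration of how candidate sortase substrates could be flagged, the following minimal Python sketch scans the C-terminal part of a protein sequence for the cleavage motifs listed in Table 1 (LPXTG, IPXTG, LPXTS, LPXTN and FPKTG, where X is any residue). It is not the procedure used for the annotation reported here: the function names, the 60-residue window and the toy sequences are assumptions made for the example, and a complete cell wall sorting signal also comprises a hydrophobic segment and a charged tail that this sketch does not check.

```python
import re

# Motif variants taken from the "Cleavage motif" column of Table 1.
# "X" stands for any amino acid.
SORTING_MOTIFS = ["LPXTG", "IPXTG", "LPXTS", "LPXTN", "FPKTG"]


def motif_to_regex(motif: str) -> str:
    """Turn a motif such as 'LPXTG' into a regex in which X matches any residue."""
    return "".join("[A-Z]" if residue == "X" else residue for residue in motif)


def find_sorting_motifs(protein: str, tail_length: int = 60):
    """Return (motif, position) pairs found in the C-terminal tail.

    tail_length is an illustrative assumption, not a value from the paper.
    Positions are 0-based and refer to the full-length sequence.
    """
    offset = max(0, len(protein) - tail_length)
    tail = protein[offset:]
    hits = []
    for motif in SORTING_MOTIFS:
        for match in re.finditer(motif_to_regex(motif), tail):
            hits.append((motif, offset + match.start()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Invented toy sequence, not a real gbs protein.
    toy_protein = "M" + "A" * 100 + "LPKTGE" + "A" * 20
    print(find_sorting_motifs(toy_protein))
```

Restricting the search to the C-terminal tail reflects the fact that sorting signals lie near the carboxy terminus; widening the window trades specificity for sensitivity.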

Another strategy for retention of proteins to bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively the substrate for two key enzymes, the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears

Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.

| Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b) |
|---|---|---|---|---|---|
| gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein |
| gbs0392 (c) | 240 | | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown |
| gbs0393 (c) | 933 | | SpaA (S. sobrinus) | 37 (52)/405 | Unknown |
| | | | Pas (S. intermedius) | 36 (52)/399 | Unknown |
| gbs0428 | 521 | | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown |
| gbs0470 | 1126 | | Alp2 (S. agalactiae) | 74 (77)/798 | Unknown |
| | | | R28 (S. pyogenes) | 69 (75)/1103 | |
| gbs0479 | 253 | | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown |
| gbs0791 | 512 | | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin |
| gbs1087 | 410 | | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown |
| gbs1143 | 932 | | SpaA (S. sobrinus) | 38 (52)/406 | Unknown |
| gbs1144 | 236 | | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown |
| gbs1145 | 743 | | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein |
| gbs1288 | 1252 | | PulA (S. pyogenes) | 65 (79)/1095 | Alkaline amylopullulanase |
| | | | PulA (S. pneumoniae) | 49 (65)/1305 | |
| gbs1356 | 1634 | | Ssp-5 (S. gordonii) | 30 (43)/1385 | Sialic acid binding protein |
| | | | Pas (S. intermedius) | 31 (45)/1285 | |
| gbs1420 | 543 | | Cell surface protein (S. mutans) | 50 (62)/183 | Unknown |
| | | | CbpD (S. pneumoniae) | 30 (60)/220 | Choline binding protein |
| gbs1474 | 308 | | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown |
| gbs1529 | 1310 | | Hsa (S. gordonii) | 50 (60)/1314 | Sialic acid binding protein |
| | | | SrpA (S. cristatus) | 43 (53)/1248 | |
| gbs1539 | 192 | | No homology in public databases | | Unknown |
| gbs1540 | 680 | | AmiC (S. pyogenes) | 36 (54)/478 | Amidase |
| | | | YbgE (L. lactis) | 35 (54)/492 | |
| gbs1929 | 800 | | CpdB (S. dysgalactiae) | 57 (70)/694 | Cyclo-nucleotide phosphodiesterase |
| | | | YfkN (B. subtilis) | 47 (66)/630 | |
| gbs2008 | 1570 | | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase |
| gbs2018 | 643 | | M-like protein (S. equi) | 31 (46)/302 | Unknown |
| | | | PspC (S. pneumoniae) | 23 (38)/795 | Adhesin |
| gbs1478 | 901 | IPXTG | PFBP (S. pyogenes) | 32 (46)/176 | Fibronectin-binding protein |
| | | | Cell surface protein (S. pneumoniae) | 40 (55)/901 | Unknown |
| gbs0628 | 554 | | Hypothetical protein (Lactobacillus leichmannii) | 23 (35)/554 | Unknown |
| | | | Cell surface protein (S. pneumoniae) | 27 (42)/512 | |
| gbs0629 | 307 | | No homology in public databases | | Unknown |
| gbs1477 | 674 | | No homology in public databases | | Unknown |
| gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease |
| gbs0456 | 1055 | | SPy0843 (S. pyogenes) | 72 (81)/1050 | Unknown |
| | | | BspA (Bacteroides forsythus) | 24 (41)/566 | Unknown |
| gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase |
| gbs1403 | 690 | | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5′-nucleotidase |
| gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown |

a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refers.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.


dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglycerol transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin-binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.

Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, which enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000), is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).

Lifestyle of S. agalactiae as revealed by metabolism

Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for an NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more closely related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.

Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization and does not serve to by-pass glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.

Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).

The NEM316 genome not only codes for a high number of transporters related to amino acid import, but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits, ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.


The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin-binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxy-terminal extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
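The 9 bp or 10 bp duplication at each insertion site means that the same short sequence is found immediately on both sides of the integrated element. A minimal Python sketch of how such a target-site duplication could be looked for is given below. The function name and the toy flanking sequences are invented for illustration, and the sketch only tests exact matches of 9 or 10 bp; it is not the analysis carried out in this study.

```python
def find_target_site_duplication(left_flank, right_flank, min_len=9, max_len=10):
    """Return the sequence duplicated on both sides of an inserted element.

    left_flank:  chromosomal sequence ending immediately before the element.
    right_flank: chromosomal sequence starting immediately after the element.
    The longest exact duplication between min_len and max_len bp is returned,
    or None if no such duplication is found.
    """
    for length in range(max_len, min_len - 1, -1):
        if len(left_flank) < length or len(right_flank) < length:
            continue
        if left_flank[-length:] == right_flank[:length]:
            return left_flank[-length:]
    return None


if __name__ == "__main__":
    # Invented toy flanks carrying a 9 bp duplication (ATCGATCGA).
    left = "GGGG" + "ATCGATCGA"
    right = "ATCGATCGA" + "TTTT"
    print(find_target_site_duplication(left, right))
```

Searching from the longest length downwards ensures that a 10 bp duplication is reported in full rather than as its 9 bp suffix.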

Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between the two genomes (Fig. 3). In contrast, low conservation of gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
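The ortholog counts and the notion of a synteny breakpoint can be made concrete with a short Python sketch. The first function simply converts the counts given above into percentages. The second counts breakpoints under one simple definition, assumed here for illustration: two neighbouring orthologous genes are syntenic when their partners in the other genome are also neighbours, in either orientation. The exact criterion used for Fig. 3 is not stated in this section, the function names are hypothetical, and the circularity of the chromosome is ignored.

```python
def ortholog_fraction(n_orthologs: int, n_genes: int) -> float:
    """Percentage of genes that have an ortholog in the compared genome."""
    return 100.0 * n_orthologs / n_genes


def count_synteny_breakpoints(partner_ranks):
    """Count breakpoints of synteny along the first genome.

    partner_ranks[i] is the rank, among orthologous genes of the second
    genome, of the partner of the i-th orthologous gene of the first genome.
    A breakpoint is counted whenever two neighbours in the first genome do
    not have neighbouring partners (rank difference other than 1).
    """
    breakpoints = 0
    for left, right in zip(partner_ranks, partner_ranks[1:]):
        if abs(right - left) != 1:
            breakpoints += 1
    return breakpoints


if __name__ == "__main__":
    # Counts taken from the text: 1173 and 1139 orthologs out of 2118 genes.
    print(round(ortholog_fraction(1173, 2118), 1))  # S. pyogenes
    print(round(ortholog_fraction(1139, 2118), 1))  # S. pneumoniae
    # Invented toy order with one inverted block (ranks 3 to 5).
    print(count_synteny_breakpoints([0, 1, 2, 5, 4, 3, 6]))
```

With this definition an inversion contributes two breakpoints, one at each end of the inverted block, whatever the number of genes it contains.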

Analysis of the functions and locations of the 945 geneswithout an ortholog in S pyogenes reveals a particularfeature of the S agalactiae genome These genes areclustered in 200 regions dispersed around the chromo-some ranging in size from 1ndash77 genes which can beclearly divided into two groups The first group contains471 genes which are clustered in 14 large islands (includ-ing the three copies of the integrative plasmid) containing11ndash77 genes (Fig 1 and Table 2) These 14 islands con-tain all genes related to mobile elements except the onlyintact IS element Six of these islands (I II IV XI and XIII)

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.


are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and between S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.


Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 |
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 |
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 |
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 |
XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 |
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.


hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seemed to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin-binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
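The G + C argument above rests on a simple base-composition calculation, which can be sketched as follows. This is an illustrative helper only; the function name is hypothetical and the toy sequences are not taken from the NEM316 genome. Ambiguous bases such as N are excluded from the denominator, which is an assumption of this sketch.

```python
def gc_percent(seq):
    """Return the G + C content of a DNA sequence, in per cent."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")  # ignore N and gaps
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy comparison of a candidate region with a reference sequence.
region = "GGCCAATT"
reference = "ATATATGC"
print(gc_percent(region), gc_percent(reference))
```

A region whose value departs clearly from that of the whole chromosome is a candidate for exogenous origin, although base composition alone is not conclusive.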

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
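The grouping of genes lacking an ortholog into regions (islands and islets) can be illustrated by the following minimal sketch. It assumes that genes are identified by their rank along the chromosome and that a region is a run of strictly consecutive genes without an ortholog. Both the input format and the strict-adjacency rule are assumptions of the example. The distinction between islands and islets made in the text also relies on the presence of genes related to mobile elements, which this sketch does not model.

```python
def group_into_regions(gene_indices):
    """Group gene indices into runs of consecutive genes.

    gene_indices: ranks (0, 1, 2, ...) along the chromosome of the
    genes lacking an ortholog (hypothetical input format).
    Returns a list of (first_index, last_index) tuples.
    """
    regions = []
    for i in sorted(set(gene_indices)):
        if regions and i == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], i)   # extend the current run
        else:
            regions.append((i, i))              # start a new run
    return regions


def region_sizes(regions):
    """Number of genes in each region."""
    return [last - first + 1 for first, last in regions]


# Toy example: six genes without an ortholog forming three regions.
regions = group_into_regions([5, 1, 2, 3, 9, 10])
print(regions, region_sizes(regions))
```

The size distribution returned by `region_sizes` is the kind of summary from which ranges such as those quoted above can be read.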

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae, ComA, ComB and ComCDE, we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
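The length-dependent retention of predicted CDSs described above can be summarized by a small decision function. This is a sketch under stated assumptions, not the annotation pipeline itself. The numerical cut-off for a "high" coding probability is not given in the text, so `min_probability` is a hypothetical parameter, and the visual inspection and BLASTX steps are not modelled.

```python
def retain_cds(length_codons, coding_probability, min_probability=0.9):
    """Decide whether a predicted CDS is kept.

    length_codons: CDS length in codons.
    coding_probability: gene-finder score between 0 and 1.
    min_probability: hypothetical cut-off for a "high" coding
    probability; the source gives no numerical value.
    """
    if length_codons > 80:
        return True                       # first pass: long CDSs are kept
    if 40 <= length_codons <= 80:
        return coding_probability >= min_probability
    return False                          # shorter ORFs are left to the BLASTX step


# Toy candidates: (length in codons, coding probability).
for candidate in [(120, 0.10), (60, 0.95), (60, 0.50), (30, 0.99)]:
    print(candidate, retain_cds(*candidate))
```

Keeping the threshold as an explicit parameter makes it easy to test how sensitive the number of retained short CDSs is to that choice.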

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
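The ortholog definition given above (bi-directional best hits, filtered on sequence identity and on the ratio of protein lengths) can be sketched as follows. The code is illustrative only: the tuple layout of the hit lists, the use of the alignment score to rank hits and all function names are assumptions of this example, and the default thresholds are read here as 50% identity and a length ratio between 0.8 and 1.25.

```python
def best_hits(hits):
    """Map each query to its best-scoring subject.

    hits: iterable of (query, subject, identity_percent, score).
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best


def orthologs(hits_ab, hits_ba, lengths, min_identity=50.0,
              min_ratio=0.8, max_ratio=1.25):
    """Return (gene_a, gene_b) pairs that are bidirectional best hits.

    hits_ab: hits of genome A proteins against genome B.
    hits_ba: hits of genome B proteins against genome A.
    lengths: dict mapping every protein name to its length.
    """
    best_ab = best_hits(hits_ab)
    best_ba = best_hits(hits_ba)
    pairs = []
    for a, (b, identity, _score) in best_ab.items():
        if b not in best_ba or best_ba[b][0] != a:
            continue                                   # not reciprocal
        ratio = lengths[a] / lengths[b]
        if identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, b))
    return sorted(pairs)


# Toy data: only a1/b1 is reciprocal, similar enough and of comparable length.
hits_ab = [("a1", "b1", 80.0, 200), ("a1", "b2", 60.0, 100),
           ("a2", "b2", 40.0, 90), ("a3", "b3", 90.0, 300)]
hits_ba = [("b1", "a1", 80.0, 200), ("b2", "a2", 40.0, 90),
           ("b3", "a3", 90.0, 300)]
lengths = {"a1": 100, "b1": 110, "a2": 100, "b2": 100, "a3": 100, "b3": 300}
print(orthologs(hits_ab, hits_ba, lengths))
```

In the toy data, the pair a2/b2 is rejected on identity and a3/b3 on the length ratio, which shows how the two filters act independently of the reciprocity test.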

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.


Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and Lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.

dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglycerol transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.

Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis, cylXDG, acpC, cylZABEFHIJK, including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).

Lifestyle of S. agalactiae as revealed by metabolism

Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for an NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products, such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more closely related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.

Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is only involved in pentose and gluconate utilization, and does not serve to by-pass glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.

Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).

The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called ‘oxidative burst’ (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.

The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS, lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they only display 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to the pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.

Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
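The notion of a breakpoint of synteny can be illustrated with a minimal sketch. It assumes that each orthologous pair is given as two ranks (position of the gene among shared genes in the first genome, position of its ortholog in the second), and it counts a breakpoint whenever two genes that are neighbours in the first genome are not neighbours in the second. The function name, the input format and this adjacency rule are illustrative assumptions; they are not the procedure used to produce Fig. 3.

```python
def count_breakpoints(pairs):
    """Count synteny breakpoints between two gene orders.

    pairs: list of (rank_a, rank_b) tuples for orthologous genes.
    Hypothetical input format, chosen for illustration only.
    """
    ordered = sorted(pairs)                # walk along the first genome
    ranks_b = [b for _, b in ordered]
    breaks = 0
    for prev, curr in zip(ranks_b, ranks_b[1:]):
        # Neighbours in genome A that are not neighbours in genome B
        # (an inverted but contiguous block only breaks at its two ends).
        if abs(curr - prev) != 1:
            breaks += 1
    return breaks


# Toy example: the block formed by genes 3 to 5 is inverted in genome B.
toy = [(0, 0), (1, 1), (2, 2), (3, 5), (4, 4), (5, 3), (6, 6)]
print(count_breakpoints(toy))
```

This linear treatment ignores the circularity of the chromosome; a fuller version would also compare the last gene with the first.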

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1–77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
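Grouping genes that lack an ortholog into contiguous regions, as described above, can be sketched as follows. The sketch assumes a list of booleans in chromosomal order, one per gene, stating whether an ortholog was found; the function name and input format are hypothetical. It only delimits the regions: deciding whether a region is an island or an islet relied, in the text, on the presence of mobile-element genes, which this sketch does not model.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return (start_index, n_genes) for each run of consecutive genes
    lacking an ortholog.

    has_ortholog: list of booleans in chromosomal order (hypothetical input).
    The chromosome is treated as linear, so a run spanning the origin
    would be reported as two regions.
    """
    regions = []
    index = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            regions.append((index, length))
        index += length
    return regions


# Toy chromosome of seven genes; genes 1-2 and gene 5 lack an ortholog.
toy_flags = [True, False, False, True, True, False, True]
for start, size in regions_without_ortholog(toy_flags):
    print(f"region starting at gene {start}: {size} gene(s)")
```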

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824 (III); 711824–758891 (VII); 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 |
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 |
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 |
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 |
XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 |
XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage: 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
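The two-pass length filter described above can be sketched as follows. This is only an illustration of the retention rule: the numeric cut-off for a 'high' coding probability is not given in the text, so `HIGH_CODING_PROB` is a hypothetical value, and the visual inspection and BLAST steps are not modelled.

```python
from dataclasses import dataclass

MIN_LONG_CODONS = 80      # CDSs longer than this are retained in the first pass
MIN_SHORT_CODONS = 40     # lower bound of the second-pass size window
HIGH_CODING_PROB = 0.9    # hypothetical cut-off; the text gives no number


@dataclass
class CandidateCDS:
    name: str
    codons: int
    coding_prob: float    # GeneMark-style coding probability in [0, 1]


def retain(cds: CandidateCDS, high_prob: float = HIGH_CODING_PROB) -> bool:
    """Apply the length rule, then the probability rule for short CDSs."""
    if cds.codons > MIN_LONG_CODONS:
        return True
    if MIN_SHORT_CODONS <= cds.codons <= MIN_LONG_CODONS:
        return cds.coding_prob >= high_prob
    return False          # shorter candidates are left to the BLASTX pass


candidates = [
    CandidateCDS("long_orf", 120, 0.20),
    CandidateCDS("short_confident", 55, 0.95),
    CandidateCDS("short_doubtful", 55, 0.40),
    CandidateCDS("tiny", 25, 0.99),
]
kept = [c.name for c in candidates if retain(c)]
print(kept)
```

Candidates below 40 codons are rejected here because the text recovers such genes separately, through BLASTX searches of the intergenic regions.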

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
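The ortholog criterion stated above (bi-directional best hit, at least 50% identity, length ratio between 0.8 and 1.25) can be expressed as a short sketch. The `Hit` record, the use of a bit score to rank hits and the function names are assumptions made for illustration; the text does not say how best hits were ranked or which tool post-processed the BLASTP output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float    # percent identity of the alignment
    score: float       # assumed ranking criterion (e.g. bit score)
    query_len: int     # protein lengths in amino acids
    subject_len: int


def best_hits(hits):
    """Keep the highest-scoring hit for each query protein."""
    best = {}
    for h in hits:
        if h.query not in best or h.score > best[h.query].score:
            best[h.query] = h
    return best


def orthologs(a_vs_b, b_vs_a, min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return (a, b) pairs that are reciprocal best hits and pass both thresholds."""
    fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
    pairs = []
    for query, hit in fwd.items():
        back = rev.get(hit.subject)
        if back is None or back.subject != query:
            continue                      # not a bi-directional best hit
        ratio = hit.query_len / hit.subject_len
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((query, hit.subject))
    return sorted(pairs)


# Toy data with made-up protein names.
a_vs_b = [
    Hit("gbsA", "spyA", 72.0, 410.0, 300, 310),
    Hit("gbsA", "spyB", 40.0, 120.0, 300, 500),
    Hit("gbsB", "spyC", 45.0, 200.0, 250, 255),   # identity too low
    Hit("gbsC", "spyD", 80.0, 500.0, 200, 400),   # length ratio 0.5
]
b_vs_a = [
    Hit("spyA", "gbsA", 72.0, 410.0, 310, 300),
    Hit("spyC", "gbsB", 45.0, 200.0, 255, 250),
    Hit("spyD", "gbsC", 80.0, 500.0, 400, 200),
]
print(orthologs(a_vs_b, b_vs_a))
```

Applying the identity and length filters after the reciprocity test, rather than before, is a design choice of this sketch; the two orders can give different results when a filtered-out hit would otherwise be the best one.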

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.


Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and K (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucl Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–829.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).

The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, the ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.

Stress adaptation

The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).

Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate they will also prove to be heat shock genes.

The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.

Physiological adaptation and transcriptional regulation

We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.


The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they only display 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.

Two orthologs of the putative intpNEM316-1 gene (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated in its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
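One simple way to count breakpoints of synteny between two gene orders is sketched below. The text does not describe how the 36 breakpoints were counted, so this is an assumed definition: orthologs are sorted by position in the first genome, ranked by position in the second, and a breakpoint is scored wherever two neighbours in the first genome are not neighbours in the second. Chromosome circularity and strand are ignored, and the function names are hypothetical.

```python
def ranks_in_b(pairs):
    """pairs: (position_in_a, position_in_b) per ortholog pair, positions unique.

    Returns the rank in genome B of each ortholog, listed in genome A order.
    """
    rank_of = {pos_b: i for i, pos_b in enumerate(sorted(b for _, b in pairs))}
    return [rank_of[b] for _, b in sorted(pairs)]


def count_breakpoints(order_in_b):
    """Count adjacent pairs in genome A that are not adjacent in genome B."""
    return sum(
        1
        for prev, cur in zip(order_in_b, order_in_b[1:])
        if abs(cur - prev) != 1
    )


# Toy example: seven orthologs, with genes 4-6 inverted in genome B.
pairs = [(1, 10), (2, 20), (3, 30), (4, 60), (5, 50), (6, 40), (7, 70)]
order = ranks_in_b(pairs)
print(order, count_breakpoints(order))
```

Under this definition a single inverted segment contributes two breakpoints, one at each end, while genes inside the inversion remain adjacent and are not counted.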

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
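The grouping of genes without an S. pyogenes ortholog into chromosomal regions can be sketched as follows. This is only an illustration of the idea, assuming that genes are listed in chromosomal order with a boolean flag indicating whether an ortholog was found; it is not the authors' implementation, and the function name and toy flags are hypothetical. The further division of regions into islands and islets, which in the text depends on their relation to mobile elements, is not modelled here.

```python
from itertools import groupby


def cluster_regions(has_ortholog):
    """Return (start_index, n_genes) for each run of consecutive genes
    lacking an ortholog, given flags in chromosomal order."""
    regions = []
    index = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            # A maximal stretch of genes without ortholog forms one region.
            regions.append((index, length))
        index += length
    return regions


# Toy chromosome of ten genes (True = ortholog found in the other genome).
flags = [True, False, False, True, False, True, True, False, False, False]
regions = cluster_regions(flags)
print(regions)                                   # [(1, 2), (4, 1), (7, 3)]
print(sum(length for _, length in regions))      # genes without ortholog: 6
```

Applied to real data, the number of regions and the range of their sizes would correspond to the figures quoted in the text (200 regions of 1 to 77 genes).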

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and between S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.


Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
|---|---|---|---|---|---|---|
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyltransferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyltransferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyltransferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase, MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.


hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
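The G + C comparison used above as an indication of exogenous origin rests on a simple base count. The sketch below shows one way to compute it in Python; the function name is hypothetical, ambiguous bases are ignored by assumption, and the example strings are placeholders rather than NEM316 sequences.

```python
def gc_percent(sequence):
    """Return the G + C content of a DNA sequence as a percentage,
    counting only unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Placeholder sequences, for illustration only.
print(gc_percent("atgcNN"))            # 50.0
print(round(gc_percent("GGGCAT"), 1))  # 66.7
```

A gene or operon whose value lies clearly above the chromosomal average would, as argued in the text, be a candidate for horizontal acquisition.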

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
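The relationship between the number of reads, read length and sequence coverage can be checked with a few lines of arithmetic. The sketch below assumes the usual definition, coverage = number of reads × mean read length / genome size, and reads the coverage figure quoted above as 8.5-fold; the mean read length is not stated in the text, so the value derived here is only what those two figures would imply. Function and constant names are hypothetical.

```python
GENOME_SIZE_BP = 2_211_485   # genome length given in the Summary
N_READS = 34_431             # sequences used in the initial assembly


def coverage(n_reads, mean_read_length_bp, genome_size_bp):
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_length_bp / genome_size_bp


def implied_read_length(n_reads, fold_coverage, genome_size_bp):
    """Mean read length implied by a given fold coverage."""
    return fold_coverage * genome_size_bp / n_reads


# Assumed coverage of 8.5-fold (see lead-in).
print(round(implied_read_length(N_READS, 8.5, GENOME_SIZE_BP)))  # prints 546
```

A mean usable read length of roughly 550 bp is plausible for capillary sequencing of that period, which supports reading the coverage figure as 8.5-fold.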

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
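The ortholog definition given above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched in Python as follows. This is an illustration under stated assumptions, not the CAAT-Box implementation: hits are assumed to be already parsed from BLASTP output into simple records, the best hit is taken to be the one with the highest score, and all gene names, scores and lengths in the example are invented.

```python
from collections import namedtuple

# One parsed BLASTP hit; the field names are hypothetical.
Hit = namedtuple("Hit", "query subject score identity query_len subject_len")


def best_hits(hits):
    """Keep, for each query, the hit with the highest score."""
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best


def is_acceptable(hit, min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Apply the identity and length-ratio thresholds quoted in the text."""
    ratio = hit.query_len / hit.subject_len
    return hit.identity >= min_identity and min_ratio <= ratio <= max_ratio


def orthologs(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are reciprocal best hits
    and pass the thresholds."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    pairs = []
    for gene_a, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is not None and back.subject == gene_a and is_acceptable(hit):
            pairs.append((gene_a, hit.subject))
    return sorted(pairs)


# Invented example data.
a_vs_b = [
    Hit("gbsA", "spy1", 500, 72.0, 300, 310),
    Hit("gbsA", "spy2", 100, 30.0, 300, 200),
    Hit("gbsB", "spy2", 200, 35.0, 250, 240),   # reciprocal, but identity < 50%
    Hit("gbsC", "spy1", 300, 55.0, 280, 310),   # not reciprocal
]
b_vs_a = [
    Hit("spy1", "gbsA", 500, 72.0, 310, 300),
    Hit("spy1", "gbsC", 300, 55.0, 310, 280),
    Hit("spy2", "gbsB", 200, 35.0, 240, 250),
]
print(orthologs(a_vs_b, b_vs_a))  # [('gbsA', 'spy1')]
```

Because the length-ratio window is symmetric (1/0.8 = 1.25), it does not matter which of the two proteins is used as the numerator.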

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.


Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.

The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.

We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.

Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.

Mobile genetic elements

Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified and, among these, only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to the pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxy-terminal extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.

An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.

Two orthologs of the putative pNEM316-1 integrase gene (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
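The notion of a synteny breakpoint used above can be made concrete with a minimal sketch. It assumes that each ortholog pair has been reduced to a pair of ranks (position among orthologs in genome A, position among orthologs in genome B) and counts a breakpoint whenever two genes that are neighbours in A are not neighbours in B, in either orientation. The function name, the rank-based input and this adjacency criterion are illustrative assumptions, not the procedure used to produce Fig. 3, and chromosome circularity is ignored.

```python
def count_synteny_breakpoints(pairs):
    """Count adjacencies in genome A that are not preserved in genome B.

    pairs: iterable of (rank_a, rank_b) tuples, one per ortholog pair, where
    each rank is the gene's index among the orthologs of that genome.
    """
    ordered = sorted(pairs)  # walk along genome A
    breakpoints = 0
    for (_, b_prev), (_, b_next) in zip(ordered, ordered[1:]):
        # Neighbours in A should also be neighbours in B (either orientation).
        if abs(b_next - b_prev) != 1:
            breakpoints += 1
    return breakpoints


# Toy example (hypothetical data): genes 0-2 collinear, genes 3-5 inverted.
toy_pairs = [(0, 0), (1, 1), (2, 2), (3, 5), (4, 4), (5, 3)]
n_breakpoints = count_synteny_breakpoints(toy_pairs)
```

With this definition a single inverted block at the end of a linearized chromosome contributes one breakpoint, because only the junction between the collinear and the inverted segment breaks adjacency.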

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1–77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
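The grouping of genes lacking an S. pyogenes ortholog into contiguous regions can be illustrated with a short sketch. It assumes a list of booleans in chromosomal gene order (True when the gene has an ortholog) and splits runs of ortholog-less genes by size alone. The function name and the size cut-off of 11 genes are illustrative assumptions; the classification described in the text also takes the presence of mobile-element genes into account, so this is a simplification rather than the authors' procedure.

```python
from itertools import groupby


def cluster_regions(has_ortholog, island_min_genes=11):
    """Group consecutive genes lacking an ortholog into regions.

    has_ortholog: list of booleans in chromosomal gene order.
    Returns (islands, islets), each a list of (start_index, n_genes).
    Size is the only criterion here (an assumption made for illustration).
    """
    islands, islets = [], []
    index = 0
    for flag, run in groupby(has_ortholog):
        n_genes = len(list(run))
        if not flag:  # a run of genes without an ortholog
            target = islands if n_genes >= island_min_genes else islets
            target.append((index, n_genes))
        index += n_genes
    return islands, islets


# Toy example (hypothetical data): one 12-gene region and one 2-gene region.
flags = [True] * 3 + [False] * 12 + [True] * 2 + [False] * 2 + [True]
islands, islets = cluster_regions(flags)
```

The sketch treats the chromosome as linear, so a region spanning the origin would be reported as two runs.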

Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size kb) | Position | Genes related to mobile elements^a | Known genes and genes similar to known genes^a,b | Pseudogenes | tRNA |
|---|---|---|---|---|---|---|
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
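The G + C comparison above rests on a simple calculation, sketched here under stated assumptions. The function name is hypothetical, ambiguous bases such as N are excluded from the denominator (a design choice, not something the text specifies), and the input sequence is a toy string rather than any NEM316 sequence.

```python
def gc_content(sequence):
    """Return the G + C percentage of a DNA sequence.

    Only A, C, G and T are counted; other symbols (e.g. N) are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


# Toy example (hypothetical sequence): 4 of 8 unambiguous bases are G or C.
toy_gc = gc_content("GGCCAATT")
```

Applied to a gene or operon and to the whole chromosome, the same function gives the two values whose difference is taken as a hint of exogenous origin.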

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
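As a rough consistency check on the assembly figures, fold coverage can be related to read count and genome size by coverage = (number of reads × mean read length) / genome length. The sketch below assumes this simple definition, takes the coverage as 8.5-fold, and uses the genome length of 2 211 485 bp given in the Summary. The mean read length is not stated in the text, so the value derived here is only what those numbers imply, and the function names are illustrative.

```python
def fold_coverage(n_reads, mean_read_length, genome_length):
    """Total sequenced bases divided by genome length."""
    return n_reads * mean_read_length / genome_length


def implied_read_length(n_reads, coverage, genome_length):
    """Mean read length implied by a given fold coverage."""
    return coverage * genome_length / n_reads


N_READS = 34_431          # sequences in the initial assembly
GENOME_LENGTH = 2_211_485  # bp, from the Summary
COVERAGE = 8.5             # assumed fold coverage

mean_length = implied_read_length(N_READS, COVERAGE, GENOME_LENGTH)
print(f"implied mean read length: {mean_length:.0f} bp")
```

A value of a few hundred base pairs per read is in the range expected for capillary sequencing of that period, which is why the coverage is read here as 8.5-fold.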

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
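The two-tier length filter described above can be summarized in a few lines. This is a minimal sketch, not the CAAT-Box implementation: the function name is hypothetical, and the numerical cut-off for a "high coding probability" (0.9 here) is an assumption, since the text gives no value. The final BLASTX search of intergenic regions is not modelled.

```python
def retain_cds(length_codons, coding_probability, high_probability=0.9):
    """Apply the two-tier CDS length filter.

    CDSs longer than 80 codons are always kept; CDSs of 40 to 80 codons are
    kept only when their coding probability is high. The 0.9 cut-off is an
    assumed placeholder, not a value taken from the text.
    """
    if length_codons > 80:
        return True
    if 40 <= length_codons <= 80:
        return coding_probability >= high_probability
    return False


# Toy candidates (hypothetical): (length in codons, coding probability).
candidates = [(250, 0.40), (60, 0.95), (60, 0.50), (30, 0.99)]
kept = [c for c in candidates if retain_cds(*c)]
```

Separating the length rule from the probability rule keeps the assumed cut-off in one place, so it can be changed without touching the rest of the filter.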

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best-hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
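The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. The sketch assumes that the best BLASTP hit of every protein has already been extracted into dictionaries, and it interprets the length criterion as the ratio of the two protein lengths. The function name, the input layout and the protein identifiers are all hypothetical.

```python
def reciprocal_best_hits(best_ab, best_ba, len_a, len_b,
                         min_identity=50.0, ratio_range=(0.8, 1.25)):
    """Return ortholog pairs defined by bi-directional best hits.

    best_ab: dict, protein in A -> (best hit in B, percent identity)
    best_ba: dict, protein in B -> (best hit in A, percent identity)
    len_a, len_b: dicts of protein lengths (amino acids) for A and B.
    """
    low, high = ratio_range
    orthologs = []
    for a, (b, identity) in best_ab.items():
        back = best_ba.get(b)
        if back is None or back[0] != a:
            continue  # not a bi-directional best hit
        if identity < min_identity:
            continue  # below the identity threshold
        if low <= len_a[a] / len_b[b] <= high:
            orthologs.append((a, b))
    return sorted(orthologs)


# Toy example (hypothetical identifiers and values).
best_ab = {"a1": ("b1", 82.0), "a2": ("b2", 45.0), "a3": ("b1", 60.0)}
best_ba = {"b1": ("a1", 82.0), "b2": ("a2", 45.0)}
len_a = {"a1": 300, "a2": 200, "a3": 150}
len_b = {"b1": 310, "b2": 205}
pairs = reciprocal_best_hits(best_ab, best_ba, len_a, len_b)
```

In the toy data only a1/b1 passes: a2/b2 is reciprocal but below 50% identity, and a3 hits b1 whose own best hit is a1.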

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker ID Robinson OM Bazan TS Lopez-Osuna Mand Kretschmer RR (1981) Bactericidal capacity of new-born phagocytes againts group B beta-hemolytic strepto-cocci Infect Immun 34 535ndash539

Beckert S Kreikemeyer B and Podbielski A (2001)Group A streptococcal rofA gene is involved in the controlof several virulence genes and eukaryotic cell attachmentand internalization Infect Immun 69 534ndash537

Bierne H Mazmanian SK Trost M Pucciarelli MG LiuG Dehoux P et al (2002) Inactivation of the srtA genein Listeria monocytogenes inhibits anchoring of surfaceproteins and affects virulence Mol Microbiol 43 869ndash881

Blumberg HM Stephens DS Modansky M Erwin MElliot J Facklam RR et al (1996) Invasive group Bstreptococcal disease the emergence of serotype V JInfect Dis 173 365ndash373

Bolken TC Franke CA Jones KF Zeller GO JonesCH Dutton EK and Hruby DE (2001) Inactivation ofthe srtA gene in Streptococcus gordonii inhibits cell wallanchoring of surface proteins and decreases in vitro andin vivo adhesion Infect Immun 69 75ndash80

Bolotin A Wincker P Mauger S Jaillon O Malarme KWeissenbach J Ehrlich SD and Sorokin A (2001) Thecomplete genome sequence of the lactic acid bacteriumLactococcus lactis ssp lactis IL1403 Genome Res 11731ndash753

Brunskill EW and Bayles KW (1996) Identification andmolecular characterization of a putative regulatory locusthat affects autolysis in Staphylococcus aureus J Bacteriol178 611ndash618

Chaffin DO Beres SB Yim HH and Rubens CE(2000) The serotype of type Ia and III group B streptococciis determined by the polymerase gene within the polycis-tronic capsule operon J Bacteriol 182 4466ndash4477

Chaussee MS Sylva GL Sturdevant DE Smoot LMGraham MR Watson RO and Musser JM (2002)Rgg Influences the Expression of Multiple Regulatory LociTo Coregulate Virulence Factor Expression in Streptococ-cus pyogenes Infect Immun 70 762ndash770

Chhatwal GS (2002) Anchorless adhesins and invasins ofGram-positive bacteria a new class of virulence factorsTrends Microbiol 10 205ndash208

Chmouryguina I Suvorov A Ferrieri P and Cleary PP(1996) Conservation of the C5a peptidase genes in groupA and B streptococci Infect Immun 64 2387ndash2390

Claros MG and von Heijne G (1994) TopPred II animproved software for membrane protein structure predic-tions Comput Appl Biosci 10 685ndash686

DrsquoMello R Hill S and Poole RK (1996) The cytochromebd quinol oxidase in Escherichia coli has an extremely highoxygen affinity and two oxygen-binding haems implica-tions for regulation of activity in vivo by oxygen inhibitionMicrobiology 142 755ndash763

Edwards MS and Baker CJ (1995) Streptococcus aga-lactiae (Group B Streptococcus) In Mandell Douglas andBennettrsquos Principles and Practice of Infectious Diseases4th edn Mandell GL Bennett JE and Dolin R (eds)New York Churchill Livingstone pp 1835ndash1845

Ewing B and Green P (1998) Base-calling of automatedsequencer traces using phred II Error probabilitiesGenome Res 8 186ndash194

Farley MM (2001) Group B streptococcal disease in non-pregnant adults Clin Infect Dis 33 556ndash561

Ferretti JJ McShan WM Ajdic D Savic DJ Savic GLyon K et al (2001) Complete genome sequence of anM1 strain of Streptococcus pyogenes Proc Natl Acad SciUSA 98 4658ndash4663

Fleischmann RD Adams MD White O Clayton RAKirkness EF Kerlavage JM et al (1995) Whole-genome random sequencing and assembly of Haemophi-lus influenzae Rd Science 269 496ndash512

Frangeul L Nelson KE Buchrieser C Danchin AGlaser P and K (1999) Cloning and assembly strategiesin microbial genome projects Microbiology 145 2625ndash2634

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


truncated in its C-terminal end and has lost the critical tyrosyl residue.

Comparative genomics defines a composite organization of the S. agalactiae chromosome

For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between the two genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
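The notion of a synteny breakpoint used above can be illustrated with a minimal sketch. The text does not say how the 36 breakpoints were counted, so the definition below is an assumption: orthologs are ordered along one genome, ranked along the other, and every adjacent pair whose ranks are not consecutive counts as one breakpoint. The function name and the toy coordinates are hypothetical, and the circularity of the chromosome is ignored.

```python
def count_breakpoints(pairs):
    """Count synteny breakpoints between two genomes.

    pairs: list of (position_in_genome_a, position_in_genome_b), one tuple
    per orthologous gene pair; positions must be unique within each genome.
    """
    # Order the orthologs along genome A.
    by_a = sorted(pairs)
    # Replace genome B positions by their rank among the orthologs.
    rank_b = {b: r for r, b in enumerate(sorted(b for _, b in by_a))}
    ranks = [rank_b[b] for _, b in by_a]
    # Neighbours in A that are not neighbours in B mark a breakpoint.
    # An inverted but otherwise intact block differs by -1 and is not counted.
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(x - y) != 1)


if __name__ == "__main__":
    colinear = [(1, 10), (2, 20), (3, 30), (4, 40)]
    inversion = [(1, 10), (2, 30), (3, 20), (4, 40)]
    print(count_breakpoints(colinear), count_breakpoints(inversion))
```

Under this definition a single internal inversion contributes two breakpoints, one at each end of the inverted segment.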

Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
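The grouping of genes lacking an S. pyogenes ortholog into contiguous regions can be sketched as follows. The procedure actually used to delimit the 200 regions is not described in the text, so this is only one plausible reading: walk along the chromosome in gene order and collect maximal runs of consecutive genes without an ortholog. The function name and the toy input are hypothetical, and the chromosome is treated as linear.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return (start_index, gene_count) for each run of genes lacking an ortholog.

    has_ortholog: list of booleans in chromosomal gene order, True when the
    gene has an ortholog in the reference genome.
    """
    regions = []
    index = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            # A maximal stretch of consecutive genes without an ortholog.
            regions.append((index, length))
        index += length
    return regions


if __name__ == "__main__":
    flags = [True, False, False, True, False, True, True, False, False, False]
    for start, size in regions_without_ortholog(flags):
        print(f"region starting at gene {start}: {size} gene(s)")
```

Region size alone would not separate the two groups described in the text, since their size ranges overlap; the distinction also rests on whether a region carries genes related to mobile elements, which this sketch does not model.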

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824 (III); 711824–758891 (VII); 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated, pseudogenes

Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seemed to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
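The G + C comparison used above as evidence for an exogenous origin amounts to computing the percentage of G and C bases in a region and comparing it with the chromosomal average. A minimal sketch follows; the function name is hypothetical, the handling of ambiguous bases (they are ignored) is an assumption, and the sequences are toy strings rather than data from the genome.

```python
def gc_percent(seq):
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; N and other IUPAC
    ambiguity codes are ignored. This choice is an assumption.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    region = "GGCCAATT"              # toy sequence, not a real gene
    background = "ATATATGCATATATAT"  # toy stand-in for the rest of a chromosome
    print(gc_percent(region), gc_percent(background))
```

A region whose value lies well above or below that of the surrounding chromosome is a candidate for horizontal acquisition, although composition alone is not conclusive.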

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
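The relation between the number of reads and the sequence coverage reported above can be made explicit with a short calculation: coverage is the total number of sequenced bases divided by the genome length. In the sketch below the read count (34 431) and the genome length (2 211 485 bp, from the Summary) come from the paper, whereas the mean read length of 550 bp is an assumed placeholder, since the text does not state it. The function name is hypothetical.

```python
def fold_coverage(n_reads, mean_read_length, genome_length):
    """Return average sequence coverage: total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return n_reads * mean_read_length / genome_length


if __name__ == "__main__":
    N_READS = 34431            # reads assembled in the initial step (from the text)
    GENOME_LENGTH = 2211485    # bp, genome size given in the Summary
    MEAN_READ_LENGTH = 550     # bp, ASSUMED: not stated in the paper
    coverage = fold_coverage(N_READS, MEAN_READ_LENGTH, GENOME_LENGTH)
    print(f"estimated coverage: {coverage:.1f}-fold")
```

The same formula can be inverted to estimate the mean usable read length implied by a reported coverage.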

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
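The two length-based retention steps described above can be written as a simple decision rule. This is only a sketch under stated assumptions: the text says that CDSs of 40 to 80 codons were kept when their coding probability was "high" but gives no cut-off, so the value 0.9 is a hypothetical placeholder, as are the function and parameter names. The final BLASTX search of intergenic regions and the visual inspection are not modelled.

```python
def retain_cds(length_codons, coding_probability, min_probability=0.9):
    """Decide whether a predicted CDS is retained.

    length_codons: CDS length in codons.
    coding_probability: coding probability assigned by the gene finder (0 to 1).
    min_probability: ASSUMED cut-off for a "high" coding probability; the
    paper does not give a numerical value.
    """
    if length_codons > 80:
        # First step: predicted CDSs longer than 80 codons are retained.
        return True
    if 40 <= length_codons <= 80:
        # Second step: short CDSs need a high coding probability.
        return coding_probability >= min_probability
    # Anything shorter is left to the similarity-based search.
    return False


if __name__ == "__main__":
    candidates = [(300, 0.50), (60, 0.95), (60, 0.40), (25, 0.99)]
    for length, prob in candidates:
        print(length, prob, retain_cds(length, prob))
```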

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
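The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. This is not the authors' implementation: the input layout, the use of a single score to rank hits, and all function and gene names (including the "sag_" and "spy_" identifiers) are hypothetical. Because 1/0.8 = 1.25, the length criterion gives the same result whichever protein is placed in the numerator.

```python
def best_hits(hits):
    """Map each query to its best-scoring subject.

    hits: iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def orthologs(hits_ab, hits_ba, identity, len_a, len_b,
              min_identity=50.0, ratio_range=(0.8, 1.25)):
    """Return sorted (gene_a, gene_b) pairs that are bi-directional best hits
    and pass the identity and length-ratio thresholds."""
    best_ab = best_hits(hits_ab)
    best_ba = best_hits(hits_ba)
    pairs = []
    for a, b in best_ab.items():
        if best_ba.get(b) != a:
            continue  # not reciprocal
        if identity[(a, b)] < min_identity:
            continue  # below the identity threshold
        ratio = len_a[a] / len_b[b]
        if ratio_range[0] <= ratio <= ratio_range[1]:
            pairs.append((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    hits_ab = [("sag_1", "spy_1", 200), ("sag_1", "spy_2", 150),
               ("sag_2", "spy_2", 90), ("sag_3", "spy_3", 300)]
    hits_ba = [("spy_1", "sag_1", 198), ("spy_2", "sag_1", 150),
               ("spy_3", "sag_3", 300)]
    identity = {("sag_1", "spy_1"): 72.0, ("sag_3", "spy_3"): 45.0}
    len_a = {"sag_1": 300, "sag_2": 120, "sag_3": 410}
    len_b = {"spy_1": 310, "spy_2": 118, "spy_3": 400}
    print(orthologs(hits_ab, hits_ba, identity, len_a, len_b))
```

In the toy data only the first pair passes: the second gene's best hit is not reciprocal, and the third pair falls below the identity threshold.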

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken C Haase G Brandt C Weber-Heynemann JMartin S Lammler C et al (2001) Horizontal genetransfer and host specificity of beta-haemolytic strepto-cocci the role of a putative composite transposon contain-ing scpB and Lmb Mol Microbiol 41 925ndash935

Gaillot O Bregenholt S Jaubert F Di Santo JP andBerche P (2001) Stress-induced ClpP serine protease ofListeria monocytogenes is essential for induction of listeri-olysin O-dependent protective immunity Infect Immun 694938ndash4943

Garandeau C Reglier-Poupet H Dubail I Beretti JLBerche P and Charbit A (2002) The Sortase SrtA ofListeria monocytogenes Is Involved in Processing of Inter-nalin and in Virulence Infect Immun 70 1382ndash1390

Gibson RL Lee MK Soderland C Chi EY andRubens CE (1993) Group B streptococci invade endot-helial cells type III capsular polysaccharide attenuatesinvasion Infect Immun 61 478ndash485

Gordon D Abajian C and Green P (1998) Consed agraphical tool for sequence finishing Genome Res 8 195ndash202

Gravekamp C Horensky DS Michel JL and MadoffLC (1996) Variation in repeat number within the alpha Cprotein of group B streptococci alters antigenicity and pro-tective epitopes Infect Immun 64 3576ndash3583

Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucl Acids Res 30: 62–65.

Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington DC: American Society for Microbiology Press.

Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.

Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones. J Bacteriol 174: 1081–1085.

Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best

Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.


Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated; pseudogenes …


hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seemed to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carried 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
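The G + C comparison used here as evidence of exogenous origin rests on a simple per-region calculation. A minimal Python sketch follows; the function name `gc_percent` and the toy sequence are illustrative assumptions and are not taken from the NEM316 genome or from the authors' software.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not bias the result.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


# Hypothetical toy sequence: 4 of its 10 bases are G or C.
toy_region = "ATGCGATATC"
print(gc_percent(toy_region))  # 40.0
```

In practice the same function would be applied to each island or operon and to the whole chromosome, and the two values compared.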

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae, ComA, ComB and ComCDE, we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration in the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
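Sequence coverage in a shotgun project is the total number of sequenced bases divided by the genome length. The Python sketch below illustrates the arithmetic using the read count given above and the genome size given in the Summary (2 211 485 bp). The mean read length of 550 bases is purely an assumption made for illustration, since read length is not reported in the text, and `fold_coverage` is a hypothetical helper name.

```python
def fold_coverage(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average sequence coverage: total sequenced bases / genome length."""
    if genome_len <= 0:
        raise ValueError("genome length must be positive")
    return n_reads * mean_read_len / genome_len


N_READS = 34_431            # sequences assembled in the initial step (from the text)
GENOME_LEN = 2_211_485      # genome size in bp (from the Summary)
ASSUMED_READ_LEN = 550      # assumption: not stated in the paper

coverage = fold_coverage(N_READS, ASSUMED_READ_LEN, GENOME_LEN)
print(f"{coverage:.1f}-fold")
```

Under this assumed read length the result falls between 8.5- and 8.6-fold; a different mean read length would shift the estimate proportionally.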

Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
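The two-tier length filter described above can be sketched in Python as follows. This is only an illustration of the retention rule, not the CAAT-Box or Genemark implementation: the `CandidateCDS` class, the `retain_cds` function and the probability cut-off of 0.9 are assumptions, since the text does not quantify what counts as a "high coding probability".

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    name: str
    codons: int
    coding_probability: float  # hypothetical score between 0 and 1


def retain_cds(cds: CandidateCDS, prob_cutoff: float = 0.9) -> bool:
    """Apply the two-tier length rule to one candidate CDS.

    prob_cutoff is an assumed threshold; the paper gives no numeric value.
    """
    if cds.codons > 80:
        # First pass: every CDS longer than 80 codons is kept.
        return True
    if 40 <= cds.codons <= 80:
        # Second pass: short CDSs need a high coding probability.
        return cds.coding_probability >= prob_cutoff
    # Anything shorter is left to the later BLASTX search of intergenic regions.
    return False


candidates = [
    CandidateCDS("orf_long", 120, 0.10),
    CandidateCDS("orf_short_weak", 60, 0.50),
    CandidateCDS("orf_short_strong", 60, 0.95),
    CandidateCDS("orf_tiny", 30, 0.99),
]
print([c.name for c in candidates if retain_cds(c)])
# ['orf_long', 'orf_short_strong']
```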

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best-hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
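The ortholog definition (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be illustrated with a short Python sketch. It is not the authors' pipeline: the `Hit` record, the function names, the use of bit score to rank hits and the toy protein names are all assumptions introduced for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    subject: str
    identity: float  # percent identity of the alignment
    bitscore: float  # assumed ranking criterion for "best" hit


def best_hits(hits: Dict[str, List[Hit]]) -> Dict[str, Hit]:
    """Keep the top-scoring hit for every query that has at least one hit."""
    return {q: max(hs, key=lambda h: h.bitscore) for q, hs in hits.items() if hs}


def find_orthologs(
    a_vs_b: Dict[str, List[Hit]],
    b_vs_a: Dict[str, List[Hit]],
    len_a: Dict[str, int],
    len_b: Dict[str, int],
    min_identity: float = 50.0,
    ratio_range: Tuple[float, float] = (0.8, 1.25),
) -> List[Tuple[str, str]]:
    """Return (a, b) pairs that are reciprocal best hits and pass both thresholds."""
    best_ab, best_ba = best_hits(a_vs_b), best_hits(b_vs_a)
    pairs = []
    for a, hit in best_ab.items():
        b = hit.subject
        back = best_ba.get(b)
        if back is None or back.subject != a:
            continue  # not reciprocal
        if hit.identity < min_identity:
            continue  # below the 50% identity threshold
        ratio = len_a[a] / len_b[b]
        if ratio_range[0] <= ratio <= ratio_range[1]:
            pairs.append((a, b))
    return sorted(pairs)


# Toy data with hypothetical protein names and lengths (in amino acids).
a_vs_b = {
    "a1": [Hit("b1", 72.0, 300.0), Hit("b2", 40.0, 90.0)],
    "a2": [Hit("b2", 45.0, 120.0)],   # reciprocal, but identity too low
    "a3": [Hit("b3", 80.0, 400.0)],   # reciprocal, but length ratio is 2.0
}
b_vs_a = {
    "b1": [Hit("a1", 72.0, 300.0)],
    "b2": [Hit("a2", 45.0, 120.0)],
    "b3": [Hit("a3", 80.0, 400.0)],
}
len_a = {"a1": 300, "a2": 200, "a3": 500}
len_b = {"b1": 310, "b2": 210, "b3": 250}

print(find_orthologs(a_vs_b, b_vs_a, len_a, len_b))  # [('a1', 'b1')]
```

As the Results note, a pair passing these criteria may still have been acquired horizontally, so the test identifies best reciprocal matches rather than proving vertical descent.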

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa Y, Brody E and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.


Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley MM (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.

Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
| --- | --- | --- | --- | --- | --- | --- |
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF-type sigma factor, acetyltransferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyltransferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multidrug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyltransferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.


hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seemed to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands show a composite architecture, as the islands seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
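The G + C comparison used here as evidence of an exogenous origin can be sketched as follows. This is a minimal illustration rather than the procedure used for the analysis: the function names and the 3-point margin are assumptions, and any sequences passed in would have to be extracted from the genome beforehand.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage."""
    # Count only unambiguous bases so that N or gap characters do not bias the value.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def exceeds_background(region: str, genome: str, margin: float = 3.0) -> bool:
    """Flag a region whose G + C content exceeds the genome value.

    `margin` (in percentage points) is a hypothetical cut-off chosen for
    illustration; the text does not define one.
    """
    return gc_percent(region) - gc_percent(genome) > margin
```

A region flagged in this way is only a candidate for horizontal acquisition; as in the text, sequence similarity to genes of other species provides the complementary evidence.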

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.

Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
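The fold coverage of a shotgun assembly follows from the number of reads, their mean usable length and the genome size. The sketch below illustrates that relation only; the mean read length is an assumption chosen for illustration, since it is not reported here, whereas the read count and the genome size of 2 211 485 bp are taken from the text.

```python
def fold_coverage(n_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Average number of times each base is expected to be sequenced."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_length / genome_size


N_READS = 34_431            # sequences assembled in the initial step
GENOME_SIZE = 2_211_485     # NEM316 chromosome length in base pairs
ASSUMED_READ_LENGTH = 546   # hypothetical mean read length in bp (assumption)

coverage = fold_coverage(N_READS, ASSUMED_READ_LENGTH, GENOME_SIZE)
print(f"estimated coverage: {coverage:.1f}-fold")
```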

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
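The two-pass length filter described in this paragraph can be written as a simple rule. The sketch below is illustrative only: the `Candidate` record, the `retain` function and the probability cut-off of 0.9 are assumptions, since the text states only that short CDSs required a high coding probability.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                  # hypothetical identifier of the predicted CDS
    codons: int                # length of the CDS in codons
    coding_probability: float  # hypothetical score between 0 and 1


def retain(cds: Candidate, long_min: int = 80, short_min: int = 40,
           prob_cutoff: float = 0.9) -> bool:
    """Apply the length rule: long CDSs pass, short ones need a high score."""
    if cds.codons > long_min:
        # First pass: CDSs longer than 80 codons are kept outright.
        return True
    if short_min <= cds.codons <= long_min:
        # Second pass: 40-80 codon CDSs are kept only with a high coding
        # probability; the numeric cut-off is an assumed value.
        return cds.coding_probability >= prob_cutoff
    # Anything shorter is left to the final similarity-based search.
    return False
```

Keeping the thresholds as parameters makes it easy to see how many candidates depend on the assumed cut-off.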

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.
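The ortholog definition given in this paragraph (bi-directional best hits, at least 50% identity, length ratio between 0.8 and 1.25) can be sketched as below. This is not the authors' code: the `Hit` record, the use of the bit score to rank hits and the application of the thresholds after best-hit selection are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Hit:
    query: str        # protein used as BLASTP query
    subject: str      # protein matched in the other proteome
    identity: float   # percent sequence identity
    score: float      # bit score, used here to rank hits (assumption)
    query_len: int    # query protein length in residues
    subject_len: int  # subject protein length in residues


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-scoring hit for each query protein."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.score > best[hit.query].score:
            best[hit.query] = hit
    return best


def orthologs(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
              min_identity: float = 50.0, min_ratio: float = 0.8,
              max_ratio: float = 1.25) -> List[Tuple[str, str]]:
    """Return pairs that are reciprocal best hits and pass both thresholds."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    pairs = []
    for query, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != query:
            continue  # the best hit is not reciprocated
        ratio = hit.query_len / hit.subject_len
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((query, hit.subject))
    return sorted(pairs)
```

The length-ratio test is what excludes truncated genes and fusions from the ortholog set, which matters here because many island genes are pseudogenes.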

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.


Isono K McIninch JD and Borodovsky M (1994) Char-acteristic features of the nucleotide sequences of yeastmitochondrial ribosomal protein genes as analyzed bycomputer program GeneMark DNA Res 1 263ndash269

Jones AL Knoll KM and Rubens CE (2000) Identi-fication of Streptococcus agalactiae virulence genes inthe neonatal rat sepsis model using signature-taggedmutagenesis Mol Microbiol 37 1444ndash1455

Keefe GP (1997) Streptococcus agalactiae mastitis areview Can Vet J 38 429ndash437

Kong F Gowan S Martin D James G and Gilbert GL(2002) Molecular profiles of group B streptococcal surfaceprotein antigen genes relationship to molecular serotypesJ Clin Microbiol 40 620ndash626

Krem MM and Di Cera E (2001) Molecular markers ofserine protease evolution EMBO J 20 3036ndash3045

Lachenauer CS Creti R Michel JL and Madoff LC(2000) Mosaicism in the alpha-like protein genes ofgroup B streptococci Proc Natl Acad Sci USA 97 9630ndash9635

Lancefield RC and Hare R (1935) The serological differ-entiation of pathogenic and non-pathogenic strains ofhemolytic streptococci from parturient women J Exp Med61 335ndash349

Lau GW Haataja S Lonetto M Kensit SE Marra ABryant AP et al (2001) A functional genomic analysis oftype 3 Streptococcus pneumoniae virulence Mol Microbiol40 555ndash571

Levin JC and Wessels MR (1998) Identification of csrRcsrS a genetic locus that regulates hyaluronic acid capsulesynthesis in group A Streptococcus Mol Microbiol 30209ndash219

Lin B Averett WF Novak J Chatham WWHollingshead SK Coligan JE et al (1996) Characteri-zation of PepB a group B streptococcal oligopeptidaseInfect Immun 64 3401ndash3406

Lowe TM and Eddy SR (1997) tRNAscan-SE a programfor improved detection of transfer RNA genes in genomicsequence Nucleic Acids Res 25 955ndash964

Mazmanian SK Liu G Jensen ER Lenoy E andSchneewind O (2000) Staphylococcus aureus sortasemutants defective in the display of surface proteins and inthe pathogenesis of animal infections Proc Natl Acad SciUSA 97 5510ndash5515

Mei JM Nourbakhsh F Ford CW and Holden DW(1997) Identification of Staphylococcus aureus virulencegenes in a murine model of bacteraemia using signature-tagged mutagenesis Mol Microbiol 26 399ndash407

Michel JL Madoff LC Olson K Kling DE KasperDL and Ausubel FM (1992) Large identical tandemrepeating units in the C protein alpha antigen gene bca

of group B streptococci Proc Natl Acad Sci USA 8910060ndash10064

Mickelson MN (1972) Glucose degradation molar growthyields and evidence for oxidative phosphorylation in Strep-tococcus agalactiae J Bacteriol 109 96ndash105

Miller RA and Britigan BE (1997) Role of oxidants inmicrobial pathophysiology Clin Microbiol Rev 10 1ndash18

Milligan TW Doran TI Straus DC and Mattingly SJ(1978) Growth and amino acid requirements of variousstrains of group B streptococci J Clin Microbiol 7 28ndash33

Molinari G Rohde M Talay SR Chhatwal GS BeckertS and Podbielski A (2001) The role played by the groupA streptococcal negative regulator Nra on bacterial inter-actions with epithelial cells Mol Microbiol 40 99ndash114

Moszer I Jones LM Moreira S Fabry C and DanchinA (2002) SubtiList the reference database for the Bacillussubtilis genome Nucl Acids Res 30 62ndash65

Musser JM Mattingly SJ Quentin R Goudeau A andSelander RK (1989) Identification of a high-virulenceclone of type III Streptococcus agalactiae (group B Strep-tococcus) causing invasive neonatal disease Proc NatlAcad Sci USA 86 4731ndash4735

Nair S Frehel C Nguyen L Escuyer V and Berche P(1999) ClpE a novel member of the HSP100 family isinvolved in cell division and virulence of Listeria monocy-togenes Mol Microbiol 31 185ndash196

Nair S Milohanic E and Berche P (2000) ClpC ATPaseis required for cell adhesion and invasion of Listeria mono-cytogenes Infect Immun 68 7061ndash7068

Navarre WW and Schneewind O (1999) Surface proteinsof gram-positive bacteria and mechanisms of their target-ing to the cell wall envelope Microbiol Mol Biol Rev 63174ndash229

Nielsen H Brunak S and von Heijne G (1999) Machinelearning approaches for the prediction of signal peptidesand other protein sorting signals Protein Eng 12 3ndash9

Nizet V and Rubens CE (2000) Pathogenic mechanismsand virulence factors of Group B streptococci In Gram-positive pathogens Fischetti VA Novick RP FerrettiJJ Portnoy DA and Rood JI (eds) Washington DCAmerican Society for Microbiology Press

Noel GJ Katz SL and Edelson PJ (1991) The role ofC3 in mediating binding and ingestion of group B Strepto-coccus serotype III by murine macrophages Pediatr Res30 118ndash123

Novak R Charpentier E Braun JS and Tuomanen E(2000) Signal transduction by a death signal peptideuncovering the mechanism of bacterial killing by penicillinMol Cell 5 49ndash57

Osaki M Takamatsu D Shimoji Y and Sekizaki T(2002) Characterization of Streptococcus suis genesencoding proteins homologous to sortase of Gram-positivebacteria J Bacteriol 184 971ndash829

Pallen MJ Lam AC Antonio M and Dunbar K (2001)An embarrassment of sortases ndash a richness of substratesTrends Microbiol 9 97ndash102

Perna NT Plunkett G 3rd Burland V Mau B GlasnerJD Rose DJ et al (2001) Genome sequence of entero-haemorrhagic Escherichia coli O157 H7 Nature 409529ndash533

Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.


hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seemed to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
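The G + C argument above rests on a simple comparison between the base composition of a region and that of the whole chromosome. A minimal sketch of that comparison is given below; the function names and the toy sequences are illustrative assumptions and are not NEM316 data.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage."""
    # Count only unambiguous bases, so N and other ambiguity codes are ignored.
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def gc_excess(region: str, genome: str) -> float:
    """Difference, in percentage points, between a region and the genome average."""
    return gc_percent(region) - gc_percent(genome)


# Hypothetical toy sequences, used only to exercise the functions.
toy_genome = "ATATATGCAT" * 10
toy_region = "GCGCATGCAT"
print(gc_percent(toy_genome), gc_percent(toy_region), gc_excess(toy_region, toy_genome))
```

A positive excess of several percentage points, as reported for the lac operon, scpB and lmb, is the kind of signal taken here as support for an exogenous origin; on real data the comparison would usually be made over sliding windows rather than a single region.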

The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY that are possibly required for the secretion of this highly repetitive protein.

What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae, ComA, ComB and ComCDE, we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.

Conclusion

The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, which harbours an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).

The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration in the development of a universal vaccine.

Experimental procedures

Bacterial strains, plasmids and growth conditions

Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995), as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
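The combinatorial PCR step used to close physical gaps is not detailed in the text. One common way to organise it is to place an outward-facing primer at each contig end and to test every pair of ends that belong to different contigs. The sketch below enumerates those pairs under that assumption; the contig names and the function name are hypothetical.

```python
from itertools import combinations


def candidate_primer_pairs(contigs):
    """List every pair of contig ends that could flank an unclosed gap.

    Each contig contributes a 'left' and a 'right' end; pairs joining the
    two ends of the same contig are skipped in this simplified model.
    """
    ends = [(name, side) for name in contigs for side in ("left", "right")]
    return [(a, b) for a, b in combinations(ends, 2) if a[0] != b[0]]


# Hypothetical contig names, for illustration only.
pairs = candidate_primer_pairs(["contig1", "contig2", "contig3"])
print(len(pairs))  # 2 * n * (n - 1) pairs for n contigs
for end_a, end_b in pairs[:3]:
    print(end_a, end_b)
```

Because the number of pairs grows as 2n(n - 1), link predictions between contigs and the BAC scaffold are valuable: they reduce the set of ends that has to be tested experimentally.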

Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). Genemark was trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched for using tRNAscan-SE (Lowe and Eddy, 1997).
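The length-based retention rule described above can be summarised as a short sketch. The text does not state what counts as a high coding probability, so the 0.9 cutoff, the `CandidateCDS` record and the function name below are illustrative assumptions, as is the choice to treat the 40 to 80 codon range as inclusive.

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    name: str           # hypothetical identifier
    codons: int         # length of the candidate CDS in codons
    coding_prob: float  # coding probability reported by the gene finder (0 to 1)


def is_retained(cds: CandidateCDS, min_prob: float = 0.9) -> bool:
    """Apply the two-pass length filter: long CDSs pass, short ones need support."""
    if cds.codons > 80:
        return True                          # first pass: longer than 80 codons
    if 40 <= cds.codons <= 80:
        return cds.coding_prob >= min_prob   # second pass: high coding probability only
    return False                             # shorter candidates are left to the BLASTX step


candidates = [
    CandidateCDS("orf_a", 250, 0.40),
    CandidateCDS("orf_b", 60, 0.95),
    CandidateCDS("orf_c", 60, 0.50),
    CandidateCDS("orf_d", 30, 0.99),
]
print([c.name for c in candidates if is_retained(c)])
```

Candidates rejected here are not necessarily lost, since the final BLASTX search of intergenic regions described above can still recover short or truncated genes.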

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons against the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
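The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. This is a simplified illustration, not the CAAT-Box implementation: it assumes that hits are supplied as (query, subject, percent identity, score) tuples, that the best hit is the one with the highest score, and that the identity of the forward hit is the value tested. All gene names are hypothetical.

```python
def best_hits(hits):
    """Keep, for each query, the hit with the highest score."""
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best


def bidirectional_orthologs(a_vs_b, b_vs_a, len_a, len_b,
                            min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return sorted (gene_a, gene_b) pairs that are reciprocal best hits and pass both thresholds."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for gene_a, (gene_b, identity, _score) in best_ab.items():
        reciprocal = best_ba.get(gene_b)
        if reciprocal is None or reciprocal[0] != gene_a:
            continue  # not a bi-directional best hit
        ratio = len_a[gene_a] / len_b[gene_b]
        if identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


# Hypothetical toy data: protein lengths and BLASTP-like hits in both directions.
len_a = {"a1": 400, "a2": 200, "a3": 350}
len_b = {"b1": 410, "b2": 300}
a_vs_b = [("a1", "b1", 72.0, 300), ("a1", "b2", 40.0, 90),
          ("a2", "b2", 55.0, 150), ("a3", "b1", 65.0, 200)]
b_vs_a = [("b1", "a1", 72.0, 300), ("b2", "a2", 55.0, 150)]
print(bidirectional_orthologs(a_vs_b, b_vs_a, len_a, len_b))
```

In this toy case a2 and b2 are reciprocal best hits but fail the length ratio (200/300), and a3 is rejected because the best hit of b1 is a1. The length criterion is what keeps truncated genes, such as the pseudogenes found in the islands, from being called orthologs of full-length counterparts.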

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa Y, Brody E and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.

Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg Influences the Expression of Multiple Regulatory Loci To Coregulate Virulence Factor Expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley MM (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.

Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and Kunst F (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and Lmb. Mol Microbiol 41: 925–935.

Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The Sortase SrtA of Listeria monocytogenes Is Involved in Processing of Internalin and in Virulence. Infect Immun 70: 1382–1390.

Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucl Acids Res 30: 62–65.

Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington DC: American Society for Microbiology Press.

Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki M Takamatsu D Shimoji Y and Sekizaki T(2002) Characterization of Streptococcus suis genesencoding proteins homologous to sortase of Gram-positivebacteria J Bacteriol 184 971ndash829

Pallen MJ Lam AC Antonio M and Dunbar K (2001)An embarrassment of sortases ndash a richness of substratesTrends Microbiol 9 97ndash102

Perna NT Plunkett G 3rd Burland V Mau B GlasnerJD Rose DJ et al (2001) Genome sequence of entero-haemorrhagic Escherichia coli O157 H7 Nature 409529ndash533

Petit CM Brown JR Ingraham K Bryant AP and

Genome sequence of Streptococcus agalactiae 1513

copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513

Holmes DJ (2001) Lipid modification of prelipoproteins isdispensable for growth in vitro but essential for virulencein Streptococcus pneumoniae FEMS Microbiol Lett 200229ndash233

Polissi A Pontiggia A Feger G Altieri M Mottl HFerrari L and Simon D (1998) Large-scale identificationof virulence genes from Streptococcus pneumoniae InfectImmun 66 5620ndash5629

Poyart C Pellegrini E Gaillot O Boumaila C BaptistaM and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulenceof Streptococcus agalactiae Infect Immun 69 5098ndash5106

Pritzlaff CA Chang JC Kuo SP Tamura GSRubens CE and Nizet V (2001) Genetic basis for thebeta-haemolyticcytolytic activity of group B StreptococcusMol Microbiol 39 236ndash247

Quentin R Huet H Wang FS Geslin P Goudeau Aand Selander RK (1995) Characterization of Streptococ-cus agalactiae strains by multilocus enzyme genotype andserotype identification of multiple virulent clone familiesthat cause invasive neonatal disease J Clin Microbiol 332576ndash2581

Rubens CE Smith S Hulse M Chi EY and van BelleG (1992) Respiratory epithelial cell invasion by group Bstreptococci Infect Immun 60 5157ndash5163

Schuchat A (1998) Epidemiology of group B streptococcaldisease in the United States shifting paradigms ClinMicrobiol Rev 11 497ndash513

Smoot JC Barbian KD Van Gompel JJ Smoot LMChaussee MS Sylva GL et al (2002) Genomesequence and comparative microarray analysis of serotypeM18 group A Streptococcus strains associated with acuterheumatic fever outbreaks Proc Natl Acad Sci USA 994668ndash4673

Spellerberg B (2000) Pathogenesis of neonatal Streptococ-cus agalactiae infections Microbes Infect 2 1733ndash1742

Spellerberg B Rozdzinski E Martin S Weber-Heynemann J Schnitzler N Lutticken R and

Podbielski A (1999) Lmb a protein with similarities to theLraI adhesin family mediates attachment of Streptococcusagalactiae to human laminin Infect Immun 67 871ndash878

Spellerberg B Rozdzinski E Martin S Weber-Heynemann J and Lutticken R (2002) rgf encodes anovel two-component signal transduction system ofStreptococcus agalactiae Infect Immun 70 2434ndash2440

Squires C and Squires CL (1992) The Clp proteins pro-teolysis regulators or molecular chaperones J Bacteriol174 1081ndash1085

Takahashi S Aoyagi Y Adderson EE Okuwaki Y andBohnsack JF (1999) Capsular sialic acid limits C5a pro-duction on type III group B streptococci Infect Immun 671866ndash1870

Takahashi Y Konishi K Cisar JO and Yoshikawa M(2002) Identification and characterization of hsa the geneencoding the sialic acid-binding adhesin of Streptococcusgordonii DL1 Infect Immun 70 1209ndash1218

Tettelin H Nelson KE Paulsen IT Eisen JA ReadTD Peterson S et al (2001) Complete genomesequence of a virulent isolate of Streptococcus pneumo-niae Science 293 498ndash506

Tong HH Blue LE James MA and DeMaria TF(2000) Evaluation of the virulence of a Streptococcuspneumoniae neuraminidase- deficient mutant in nasopha-ryngeal colonization and development of otitis media in thechinchilla model Infect Immun 68 921ndash924

Wastfelt M Stalhammar-Carlemalm M Delisse AMCabezon T and Lindahl G (1996) Identification of afamily of streptococcal surface proteins with extremelyrepetitive structure J Biol Chem 271 18892ndash18897

Xu SY and Fomenkov A (1994) Construction of pSC101derivatives with Camr and Tetr for selection or LacZcent forbluewhite screening Biotechniques 17 57

Yamamoto S Miyake K Koike Y Watanabe MMachida Y Ohta M and Iijima S (1999) Molecularcharacterization of type-specific capsular polysaccharidebiosynthesis genes of Streptococcus agalactiae type Ia JBacteriol 181 5176ndash5184


septicaemia was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.

Sequencing and assembly methods

Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
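As a rough illustration of the coverage figure quoted above, fold coverage can be estimated as the number of reads multiplied by the mean read length and divided by the genome size. The sketch below uses the read count from this section and the genome length given in the Summary (2 211 485 bp). The mean read length of 500 bases is an assumption made purely for illustration and is not stated in the text.

```python
def fold_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Average number of times each base of the genome is sequenced."""
    return n_reads * mean_read_len / genome_size


def implied_read_length(coverage: float, n_reads: int, genome_size: int) -> float:
    """Mean read length that would be needed to reach a given fold coverage."""
    return coverage * genome_size / n_reads


N_READS = 34_431            # sequences assembled in the initial step (from the text)
GENOME_SIZE = 2_211_485     # genome length in base pairs (from the Summary)
ASSUMED_READ_LEN = 500      # ASSUMPTION: illustrative mean read length, not from the paper

if __name__ == "__main__":
    estimate = fold_coverage(N_READS, ASSUMED_READ_LEN, GENOME_SIZE)
    print(f"estimated coverage: {estimate:.2f}-fold")
```

Under the assumed read length the estimate is a little under eightfold; a different mean read length scales the result proportionally.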

Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
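The primer pairs listed above can be checked for basic properties such as length, GC content and an approximate melting temperature. The following sketch is not the procedure used by the authors. It applies the simple Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a coarse estimate for oligonucleotides of this length, and the dictionary keys are hypothetical labels introduced here.

```python
# Primer sequences copied from the text (5' to 3'); the keys are hypothetical labels.
PRIMERS = {
    "att_a": "CAATAACCATTCTGTAGATCCTTC",
    "att_b": "TTTAACCAGTTCAAACGAAAGT",
    "insertion1_a": "TAAAAGGTTTTCTCAGAGTATTATCA",
    "insertion1_b": "TTTTCCTCTTAAGGGAGTAAGC",
    "insertion2_a": "ATTCATGTCCATTCACGACC",
    "insertion2_b": "TCCCACTTCCATTCATAAACT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, i.e. the sequence read on the opposite strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(f"{label}\t{len(seq)} nt\tGC={gc_fraction(seq):.2f}\tTm~{wallace_tm(seq)} C")
```

Comparing the two estimates within a pair gives a quick indication of whether both primers would anneal under similar conditions.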

Annotation methods

The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
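The two length-based retention rules described above can be written as a small filter. This is a minimal sketch, not the CAAT-Box implementation: the `CandidateCDS` record, the probability scale and the 0.9 cut-off for a "high coding probability" are all assumptions, since the text gives no numerical threshold. The final BLASTX search of intergenic regions is not modelled.

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    """Hypothetical record for a predicted coding sequence."""
    name: str
    codons: int
    coding_probability: float  # assumed to be a score between 0 and 1


LONG_CDS_CODONS = 80    # CDSs longer than this are retained in the first pass (from the text)
SHORT_CDS_CODONS = 40   # lower bound of the second pass (from the text)
HIGH_PROBABILITY = 0.9  # ASSUMPTION: the text says only "high coding probability"


def keep_cds(cds: CandidateCDS, high_probability: float = HIGH_PROBABILITY) -> bool:
    """Apply the two length-based retention rules."""
    if cds.codons > LONG_CDS_CODONS:
        return True
    if SHORT_CDS_CODONS <= cds.codons <= LONG_CDS_CODONS:
        return cds.coding_probability >= high_probability
    return False


if __name__ == "__main__":
    candidates = [
        CandidateCDS("orf_long", 350, 0.40),
        CandidateCDS("orf_short_good", 60, 0.95),
        CandidateCDS("orf_short_weak", 60, 0.50),
        CandidateCDS("orf_tiny", 25, 0.99),
    ]
    print([c.name for c in candidates if keep_cds(c)])
```

In this sketch a long CDS is kept regardless of its score, which mirrors the first pass; every retained CDS was additionally inspected visually in the published workflow.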

All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best-hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
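The ortholog definition above, bi-directional best hits subject to an identity and a length-ratio threshold, can be sketched as follows. The thresholds are read as 50% identity and a protein length ratio between 0.8 and 1.25. The `Hit` record, the use of bit score to rank hits and the choice of query length over subject length as the ratio are assumptions for illustration; the text does not specify them.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical summary of one BLASTP hit."""
    query: str
    subject: str
    identity: float   # percent identity
    bitscore: float
    query_len: int
    subject_len: int


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for each query, the hit with the highest bit score."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return best


def passes_thresholds(hit: Hit, min_identity: float = 50.0,
                      min_ratio: float = 0.8, max_ratio: float = 1.25) -> bool:
    """Identity and length-ratio criteria (ratio taken as query/subject length)."""
    ratio = hit.query_len / hit.subject_len
    return hit.identity >= min_identity and min_ratio <= ratio <= max_ratio


def orthologs(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Pairs that are each other's best hit and satisfy the thresholds."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for gene_a, hit in best_ab.items():
        back = best_ba.get(hit.subject)
        if back is not None and back.subject == gene_a and passes_thresholds(hit):
            pairs.append((gene_a, hit.subject))
    return sorted(pairs)


if __name__ == "__main__":
    # Invented example data; gene names are placeholders.
    a_vs_b = [Hit("a1", "b1", 80.0, 300.0, 200, 210),
              Hit("a1", "b2", 60.0, 150.0, 200, 400),
              Hit("a2", "b2", 45.0, 100.0, 100, 100)]
    b_vs_a = [Hit("b1", "a1", 80.0, 300.0, 210, 200),
              Hit("b2", "a2", 45.0, 100.0, 100, 100)]
    print(orthologs(a_vs_b, b_vs_a))
```

In the invented data, a2 and b2 are reciprocal best hits but fall below the identity threshold, so only the a1/b1 pair is reported.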

The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.

Acknowledgements

We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.


Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.

Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.

Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.

Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.

Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.

Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.

Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.

Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.

Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg Influences the Expression of Multiple Regulatory Loci To Coregulate Virulence Factor Expression in Streptococcus pyogenes. Infect Immun 70: 762–770.

Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.

Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.

Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.

D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.

Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.

Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.

Farley, M.M. (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.

Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.

Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.

Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.

Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.

Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The Sortase SrtA of Listeria monocytogenes Is Involved in Processing of Internalin and in Virulence. Infect Immun 70: 1382–1390.

Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.

Genome sequence of Streptococcus agalactiae 1511

copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513

Becker ID Robinson OM Bazan TS Lopez-Osuna Mand Kretschmer RR (1981) Bactericidal capacity of new-born phagocytes againts group B beta-hemolytic strepto-cocci Infect Immun 34 535ndash539

Beckert S Kreikemeyer B and Podbielski A (2001)Group A streptococcal rofA gene is involved in the controlof several virulence genes and eukaryotic cell attachmentand internalization Infect Immun 69 534ndash537

Bierne H Mazmanian SK Trost M Pucciarelli MG LiuG Dehoux P et al (2002) Inactivation of the srtA genein Listeria monocytogenes inhibits anchoring of surfaceproteins and affects virulence Mol Microbiol 43 869ndash881

Blumberg HM Stephens DS Modansky M Erwin MElliot J Facklam RR et al (1996) Invasive group Bstreptococcal disease the emergence of serotype V JInfect Dis 173 365ndash373

Bolken TC Franke CA Jones KF Zeller GO JonesCH Dutton EK and Hruby DE (2001) Inactivation ofthe srtA gene in Streptococcus gordonii inhibits cell wallanchoring of surface proteins and decreases in vitro andin vivo adhesion Infect Immun 69 75ndash80

Bolotin A Wincker P Mauger S Jaillon O Malarme KWeissenbach J Ehrlich SD and Sorokin A (2001) Thecomplete genome sequence of the lactic acid bacteriumLactococcus lactis ssp lactis IL1403 Genome Res 11731ndash753

Brunskill EW and Bayles KW (1996) Identification andmolecular characterization of a putative regulatory locusthat affects autolysis in Staphylococcus aureus J Bacteriol178 611ndash618

Chaffin DO Beres SB Yim HH and Rubens CE(2000) The serotype of type Ia and III group B streptococciis determined by the polymerase gene within the polycis-tronic capsule operon J Bacteriol 182 4466ndash4477

Chaussee MS Sylva GL Sturdevant DE Smoot LMGraham MR Watson RO and Musser JM (2002)Rgg Influences the Expression of Multiple Regulatory LociTo Coregulate Virulence Factor Expression in Streptococ-cus pyogenes Infect Immun 70 762ndash770

Chhatwal GS (2002) Anchorless adhesins and invasins ofGram-positive bacteria a new class of virulence factorsTrends Microbiol 10 205ndash208

Chmouryguina I Suvorov A Ferrieri P and Cleary PP(1996) Conservation of the C5a peptidase genes in groupA and B streptococci Infect Immun 64 2387ndash2390

Claros MG and von Heijne G (1994) TopPred II animproved software for membrane protein structure predic-tions Comput Appl Biosci 10 685ndash686

DrsquoMello R Hill S and Poole RK (1996) The cytochromebd quinol oxidase in Escherichia coli has an extremely highoxygen affinity and two oxygen-binding haems implica-tions for regulation of activity in vivo by oxygen inhibitionMicrobiology 142 755ndash763

Edwards MS and Baker CJ (1995) Streptococcus aga-lactiae (Group B Streptococcus) In Mandell Douglas andBennettrsquos Principles and Practice of Infectious Diseases4th edn Mandell GL Bennett JE and Dolin R (eds)New York Churchill Livingstone pp 1835ndash1845

Ewing B and Green P (1998) Base-calling of automatedsequencer traces using phred II Error probabilitiesGenome Res 8 186ndash194

Farley MM (2001) Group B streptococcal disease in non-pregnant adults Clin Infect Dis 33 556ndash561

Ferretti JJ McShan WM Ajdic D Savic DJ Savic GLyon K et al (2001) Complete genome sequence of anM1 strain of Streptococcus pyogenes Proc Natl Acad SciUSA 98 4658ndash4663

Fleischmann RD Adams MD White O Clayton RAKirkness EF Kerlavage JM et al (1995) Whole-genome random sequencing and assembly of Haemophi-lus influenzae Rd Science 269 496ndash512

Frangeul L Nelson KE Buchrieser C Danchin AGlaser P and K (1999) Cloning and assembly strategiesin microbial genome projects Microbiology 145 2625ndash2634

Franken C Haase G Brandt C Weber-Heynemann JMartin S Lammler C et al (2001) Horizontal genetransfer and host specificity of beta-haemolytic strepto-cocci the role of a putative composite transposon contain-ing scpB and Lmb Mol Microbiol 41 925ndash935

Gaillot O Bregenholt S Jaubert F Di Santo JP andBerche P (2001) Stress-induced ClpP serine protease ofListeria monocytogenes is essential for induction of listeri-olysin O-dependent protective immunity Infect Immun 694938ndash4943

Garandeau C Reglier-Poupet H Dubail I Beretti JLBerche P and Charbit A (2002) The Sortase SrtA ofListeria monocytogenes Is Involved in Processing of Inter-nalin and in Virulence Infect Immun 70 1382ndash1390

Gibson RL Lee MK Soderland C Chi EY andRubens CE (1993) Group B streptococci invade endot-helial cells type III capsular polysaccharide attenuatesinvasion Infect Immun 61 478ndash485

Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.

Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.

Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.

Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.

Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.

Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.

Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.

Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.

Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.

Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.

Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.

Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.

Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.

Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.

Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.

Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.

Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.

Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.

Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.

Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.

Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.

Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.

Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.

Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.

Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.

Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.

Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.

Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.

Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.

Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.

Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.

Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.

Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.

Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.

Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.

Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.

Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.

Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.

Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.

Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.

Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.

Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.

Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.

Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.

Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.

Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.

Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.

Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.

Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.

Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.

Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.

Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.

Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.

Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ' for blue/white screening. Biotechniques 17: 57.

Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
