Genome sequence of Streptococcus agalactiae, a pathogen causing invasive neonatal disease
Molecular Microbiology (2002) 45 (6), 1499–1513
© 2002 Blackwell Science Ltd
Accepted 24 June, 2002. For correspondence. E-mail fkunst@pasteur.fr; Tel. (+33) 1 45 68 89 96; Fax (+33) 1 45 68 87 86.
Philippe Glaser,1 Christophe Rusniok,1 Carmen Buchrieser,1 Fabien Chevalier,1 Lionel Frangeul,1 Tarek Msadek,2 Mohamed Zouine,1 Elisabeth Couvé,1 Lila Lalioui,3 Claire Poyart,3 Patrick Trieu-Cuot3 and Frank Kunst1
1 Laboratoire de Génomique des Microorganismes Pathogènes and 2 Unité de Biochimie Microbienne, Institut Pasteur, 28 Rue du Dr Roux, 75724 Paris Cedex 15, France.
3 Laboratoire Mixte Pasteur-Necker de Recherche sur les Streptocoques et Streptococcies, Faculté de Médecine Necker, 156 Rue Vaugirard, 75015 Paris, France.
Summary
Streptococcus agalactiae is a commensal bacterium colonizing the intestinal tract of a significant proportion of the human population. However, it is also a pathogen which is the leading cause of invasive infections in neonates and causes septicaemia, meningitis and pneumonia. We sequenced the genome of the serogroup III strain NEM316, responsible for a fatal case of septicaemia. The genome is 2 211 485 base pairs long and contains 2118 protein coding genes. Fifty-five per cent of the predicted genes have an ortholog in the Streptococcus pyogenes genome, representing a conserved backbone between these two streptococci. Among the genes in S. agalactiae that lack an ortholog in S. pyogenes, 50% are clustered within 14 islands. These islands contain known and putative virulence genes, mostly encoding surface proteins, as well as a number of genes related to mobile elements. Some of these islands could therefore be considered as pathogenicity islands. Compared with other pathogenic streptococci, S. agalactiae shows the unique feature that pathogenicity islands may have an important role in virulence acquisition and in genetic diversity.
Introduction
Lancefield's group B streptococcus (GBS) (Lancefield and Hare, 1935), also referred to as S. agalactiae, is well adapted to asymptomatic colonization of adult humans. It is commonly found in the gastrointestinal and the genitourinary tracts, but it is also the predominant cause of invasive bacterial disease in the neonate. Streptococcus agalactiae is the leading cause of septicaemia, meningitis and pneumonia in neonates, responsible for two to three cases per 1000 live births. It is also a serious cause of mortality or morbidity in non-pregnant adults, particularly in elderly persons and those with underlying diseases (Schuchat, 1998; Nizet and Rubens, 2000; Farley, 2001). In North America, this bacterium is considered one of the major causes of bovine intramammary infections (Keefe, 1997).
Group B streptococci are subclassified into serotypes according to the immunologic reactivity of the polysaccharide capsule. Of the nine serotypes described so far, types Ia, Ib, II, III and V are responsible for the majority of invasive human GBS diseases. Serotype III GBS is particularly important because it causes a significant percentage of early onset disease (i.e. infection occurring within the first week of life) and the majority of late-onset disease (i.e. infection occurring after the first week of life). Overall, capsular serotype III is responsible for most cases (80%) of neonatal GBS meningitis (Schuchat, 1998; Nizet and Rubens, 2000). Colonization of the rectum and vagina of pregnant women with GBS, which causes infection of the amniotic cavity, is correlated with GBS sepsis in newborn infants with early onset disease. In this case, newborns are colonized intrapartum by aspiration of contaminated amniotic fluid. The lung is a probable portal of entry for GBS into the bloodstream, as these bacteria can adhere to and invade alveolar epithelial (Rubens et al., 1992) and endothelial cells (Gibson et al., 1993). Pneumonia results from local infections, whereas sepsis and meningitis may be due to the spread of bacteria followed by systemic infection.
We have determined the complete genome sequence of the serotype III strain NEM316, isolated from a case of fatal septicaemia. Complete genome sequences of Streptococcus pyogenes strains M1 (Ferretti et al., 2001) and M18 (Smoot et al., 2002) and of two strains of Streptococcus pneumoniae, the virulent strain TIGR4 (Tettelin et al., 2001) and the non-capsulated strain R6 (Hoskins et al., 2001), have been published. The NEM316 genome sequence was compared with those of these related pathogenic streptococci and with that of the non-pathogenic species Lactococcus lactis (Bolotin et al., 2001), providing clues to the evolution of S. agalactiae and the acquisition of virulence. Besides improving knowledge of the molecular mechanisms responsible for virulence, this work should contribute to the finding of new targets for antimicrobial compounds and to the development of a GBS vaccine.
Results and discussion
General features of the genome
The genome of S. agalactiae strain NEM316 consists of a circular chromosome of 2 211 485 base pairs (bp) (Fig. 1) (EMBL accession number AL732656). Its G + C content of 35.6% is significantly lower than that of the genomes of S. pyogenes (38.5%) (Ferretti et al., 2001) and of S. pneumoniae (39.7%) (Tettelin et al., 2001), whereas L. lactis, a more distantly related species, has an almost identical G + C content of 35.4% (Bolotin et al., 2001). Seven sets of 23S, 5S and 16S ribosomal RNA operons were identified, all of which are organized within a 450 kb region located on the right replichore of the chromosome next to the origin of replication (Fig. 1). For S. pneumoniae, only four rRNA operons were reported, which are all located within a 400 kb region, three on the left replichore and one on the right one (Tettelin et al., 2001; Hoskins et al., 2001). In contrast, S. pyogenes and L. lactis each contain six rRNA operons distributed on both replichores within 750 and 920 kb regions respectively (Bolotin et al., 2001; Ferretti et al., 2001; Smoot et al., 2002). We identified 80 tRNA genes in NEM316, a number significantly higher than the 58 tRNA genes present in S. pneumoniae, the 60 in S. pyogenes and the 62 in L. lactis. The tRNAs encoded by these genes recognize only 31 out of 61 possible sense codons. The redundancy of tRNA genes relative to genome size is an interesting feature of the S. agalactiae genome, the biological consequence of which is not yet known.
Fig. 1. Circular genome map of S. agalactiae NEM316 showing the position and orientation of genes. From the outside: circle 1, protein coding genes on the + and - strands, with genes belonging to the 14 islands (numbered from I to XIV) in yellow; circle 2, GC bias (G + C/G - C); circle 3, G + C content, with <28.8% G + C in yellow, between 28.8% and 42.5% G + C in orange, and >42.5% G + C in red; circle 4, stable RNA coding genes. The scale in kb is indicated on the outside of the genome, with the predicted origin of replication at position 0.
In the genome of NEM316, we identified 2082 protein coding genes and 36 pseudogenes. Forty-nine genes are triplicated, corresponding to a region present in three copies in the genome. Putative functions could be assigned to 62% of the predicted proteins; 25% are similar to unknown proteins, 4% are similar to unknown proteins from other streptococci or lactococci, and 9% are unique to S. agalactiae. Protein coding genes were classified according to the functional categories described for Bacillus subtilis and Listeria monocytogenes. A strong bias in gene orientation was observed, as 81% of the coding sequences (CDS) are transcribed in the same orientation as the movement of the replication fork. This type of gene orientation bias appears to be a common feature of low G + C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue, encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).
Bacterial–host interactions and virulence factors
S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis, previously identified (Chaffin et al., 2000), is present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes, which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.
Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21, IPXTG = 4, LPXTS = 2, LPXTN = 2, FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% similarity respectively). In particular, these enzymes possess at the expected positions the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.
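
The prediction of cell wall sorting signals described above can be illustrated with a simple pattern scan. This is only a sketch under stated assumptions: the text does not describe the authors' search procedure, the 50-residue C-terminal window is an arbitrary choice made here, and the protein sequences are hypothetical toy strings. Only the five motif variants and their counts come from the text.

```python
import re
from collections import Counter
from typing import Dict, Optional

# The five motif variants listed in the text; "X" (any residue) is "." here.
_SORTING_MOTIF = re.compile(r"LP.T[GSN]|[IF]P.TG")

# Counts reported in the text for NEM316 (they sum to the 30 predicted proteins).
REPORTED_COUNTS = {"LPXTG": 21, "IPXTG": 4, "LPXTS": 2, "LPXTN": 2, "FPXTG": 1}


def classify_sorting_signal(protein: str, tail: int = 50) -> Optional[str]:
    """Return the motif class (e.g. 'LPXTG') found in the C-terminal tail.

    `tail` is an assumed window length, not a value taken from the paper.
    The last match in the window is reported; None means no motif was found.
    """
    region = protein.upper()[-tail:]
    hits = _SORTING_MOTIF.findall(region)
    if not hits:
        return None
    hit = hits[-1]
    return f"{hit[:2]}X{hit[3:]}"


def tally_motifs(proteins: Dict[str, str]) -> Counter:
    """Count motif classes over a {name: sequence} mapping."""
    classes = (classify_sorting_signal(seq) for seq in proteins.values())
    return Counter(c for c in classes if c is not None)


# Hypothetical toy proteins, not NEM316 sequences.
toy_proteins = {
    "toy_a": "M" + "A" * 60 + "LPKTGE" + "V" * 20,  # motif near the C-terminus
    "toy_b": "M" + "A" * 60 + "FPKTGD" + "L" * 20,  # FPXTG variant
    "toy_c": "MLPKTG" + "A" * 100,                  # motif only at the N-terminus
}
print(tally_motifs(toy_proteins))
print(sum(REPORTED_COUNTS.values()))
```

A motif match alone is a weak predictor; genuine sorting signals also carry a hydrophobic stretch and a charged tail after the motif, which this sketch does not check.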
Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2 but does not encode Rib or β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 in NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
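
Counting contiguous repeat units of the kind reported for gbs1087 and gbs1529 can be sketched as below. The function name and the toy sequences are hypothetical; only the motif strings are taken from the text, and the real protein sequences are not reproduced here.

```python
import re
from typing import List


def longest_tandem_run(sequence: str, motifs: List[str]) -> int:
    """Return the largest number of back-to-back motif copies in `sequence`.

    Any motif in `motifs` may occupy each position of the run, which matches
    the mixed STSA/SMSA array described in the text. The count is reliable
    when no motif is a prefix of another, as is the case here.
    """
    if not motifs:
        raise ValueError("at least one motif is required")
    unit = "|".join(re.escape(m) for m in motifs)
    best = 0
    for run in re.finditer(f"(?:{unit})+", sequence):
        copies = len(re.findall(unit, run.group()))
        best = max(best, copies)
    return best


# Hypothetical toy sequences built around the motifs named in the text.
toy_gbs1087_like = "MK" + "LERRQRDAENRKSQGNV" * 16 + "AA"
toy_gbs1529_like = "STSA" * 3 + "SMSA" * 2 + "G" + "STSA"

print(longest_tandem_run(toy_gbs1087_like, ["LERRQRDAENRKSQGNV"]))
print(longest_tandem_run(toy_gbs1529_like, ["STSA", "SMSA"]))
```

As a consistency check on the reported figures, 16 copies of the 17-residue motif span 272 residues, which fits within the 410 aa length given for gbs1087, and 150 four-residue units span 600 residues, which fits within the C-terminal half of the 1310 aa gbs1529.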
The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), like Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases, encoded by two pairs of adjacent genes, whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.
Another strategy for retention of proteins at bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively the substrate for two key enzymes, the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.
Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b)
gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein
gbs0392 (c) | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown
gbs0393 (c) | 933 | LPXTG | SpaA (S. sobrinus); Pas (S. intermedius) | 37 (52)/405; 36 (52)/399 | Unknown; Unknown
gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown
gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae); R28 (S. pyogenes) | 74 (77)/798; 69 (75)/1103 | Unknown
gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown
gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin
gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown
gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown
gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown
gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein
gbs1288 | 1252 | LPXTG | PulA (S. pyogenes); PulA (S. pneumoniae) | 65 (79)/1095; 49 (65)/1305 | Alkaline amylopullulanase
gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii); Pas (S. intermedius) | 30 (43)/1385; 31 (45)/1285 | Sialic acid binding protein
gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans); CbpD (S. pneumoniae) | 50 (62)/183; 30 (60)/220 | Unknown; Choline binding protein
gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown
gbs1529 | 1310 | LPXTG | Hsa (S. gordonii); SrpA (S. cristatus) | 50 (60)/1314; 43 (53)/1248 | Sialic acid binding protein
gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown
gbs1540 | 680 | LPXTG | AmiC (S. pyogenes); YbgE (L. lactis) | 36 (54)/478; 35 (54)/492 | Amidase
gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae); YfkN (B. subtilis) | 57 (70)/694; 47 (66)/630 | Cyclo-nucleotide phosphodiesterase
gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase
gbs2018 | 643 | LPXTG | M-like protein (S. equi); PspC (S. pneumoniae) | 31 (46)/302; 23 (38)/795 | Unknown; Adhesin
gbs1478 | 901 | IPXTG | PFBP (S. pyogenes); Cell surface protein (S. pneumoniae) | 32 (46)/176; 40 (55)/901 | Fibronectin-binding protein; Unknown
gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii); Cell surface protein (S. pneumoniae) | 23 (35)/554; 27 (42)/512 | Unknown
gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown
gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown
gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease
gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes); BspA (Bacteroides forsythus) | 72 (81)/1050; 24 (41)/566 | Unknown; Unknown
gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase
gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5'-nucleotidase
gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown
a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refers.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.
Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, which enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000), is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).
Lifestyle of S. agalactiae as revealed by metabolism
Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for an NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more closely related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.
Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization, not in by-passing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.
Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis, which predicts eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, as shown for ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) and ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits, ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to the pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in a circular form in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
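As an illustration of what "flanked by inverted repeats" means computationally, the following minimal sketch scores how well one end of an element matches the reverse complement of the other end. The function names are hypothetical and the comparison is ungapped, so it is only an approximation for imperfect repeats that contain insertions or deletions; it is not the procedure used by the authors.

```python
# Hypothetical helper names; a sketch, not the authors' method.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverted_repeat_identity(left_end, right_end):
    """Fraction of matching positions between the left end and the
    reverse complement of the right end (ungapped, base by base)."""
    left = left_end.upper()
    right_rc = reverse_complement(right_end).upper()
    n = min(len(left), len(right_rc))
    if n == 0:
        raise ValueError("both ends must be non-empty")
    matches = sum(1 for a, b in zip(left, right_rc) if a == b)
    return matches / n


if __name__ == "__main__":
    # Toy sequences, not taken from pNEM316-1.
    print(inverted_repeat_identity("GATTACA", "TGTAATC"))
```

A score of 1 indicates a perfect inverted repeat over the compared length; lower values indicate imperfect repeats such as those described for att-L and att-R.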
Two orthologs of the putative intpNEM316-1 gene (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile element typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the scpB gene. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered orthologous to genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between the two genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
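The notion of a synteny breakpoint can be made concrete with a small sketch. The text does not state how the 36 breakpoints were counted, so the definition below is an assumption: ortholog pairs are ordered along genome A, their genome B positions are converted to ranks, and a breakpoint is counted wherever two neighbours in A are not neighbours in B (in either orientation). The function name and input layout are hypothetical, and genome B positions are assumed to be unique.

```python
def count_synteny_breakpoints(ortholog_pairs):
    """Count breaks in gene-order conservation between two genomes.

    ortholog_pairs: list of (pos_a, pos_b) tuples, one per ortholog pair,
    giving the position of the gene in genome A and in genome B.
    Positions in genome B are assumed to be unique.
    """
    # Order the pairs along genome A.
    by_a = sorted(ortholog_pairs)
    # Replace genome B positions by their rank among shared orthologs.
    rank_b = {pos_b: rank
              for rank, pos_b in enumerate(sorted(b for _, b in by_a))}
    ranks = [rank_b[b] for _, b in by_a]
    # Neighbours in A that are not neighbours in B mark a breakpoint.
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(x - y) != 1)


if __name__ == "__main__":
    collinear = [(1, 10), (2, 20), (3, 30), (4, 40)]
    one_inversion = [(1, 10), (2, 30), (3, 20), (4, 40)]
    print(count_synteny_breakpoints(collinear))
    print(count_synteny_breakpoints(one_inversion))
```

Under this definition a fully inverted but otherwise collinear segment contributes breakpoints only at its two boundaries, which matches the diagonal and anti-diagonal runs seen in a dot plot of ortholog positions.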
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
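The grouping of genes without an ortholog into regions can be sketched as a search for maximal runs of consecutive genes lacking an ortholog. This is a simplified assumption about the procedure: the function name and boolean input are hypothetical, the chromosome is treated as linear (a run spanning the origin would be reported as two), and the distinction between islands and islets, which the text bases on the presence of mobile element genes, is not attempted.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return (start_index, length) for each maximal run of genes
    lacking an ortholog.

    has_ortholog: list of booleans in chromosomal gene order,
    True where the gene has an ortholog in the reference genome.
    """
    regions = []
    index = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            regions.append((index, length))
        index += length
    return regions


if __name__ == "__main__":
    # Toy gene order: genes 1-2 and gene 4 lack an ortholog.
    print(regions_without_ortholog([True, False, False, True, False]))
```

The length of each run corresponds to the region sizes quoted in the text (1 to 77 genes); annotating each run with mobile element genes would be the next step towards separating islands from islets.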
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
| --- | --- | --- | --- | --- | --- | --- |
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase, MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may be the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon, as well as the scpB and lmb genes, have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein, as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands show a composite architecture, as the islands seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
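The G + C argument in the paragraph above rests on a simple base-composition calculation, sketched here for a single sequence. The function name is hypothetical, and ignoring ambiguous bases such as N is an assumption made for this illustration.

```python
def gc_percent(sequence):
    """Return the G + C content of a DNA sequence as a percentage.

    Characters other than A, C, G and T (for example N) are ignored.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequence; real use would pass an operon and the whole chromosome.
    print(gc_percent("ATGCGCNNAT"))
```

Comparing the value for a candidate region with the value for the whole chromosome gives the kind of contrast used in the text to argue for an exogenous origin.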
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with its extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
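The two-stage length filter described above can be written as a small decision rule. The text does not give the numerical cut-off for a "high coding probability", so the `min_probability` default of 0.9 is purely a placeholder, and the function name is hypothetical.

```python
def retain_cds(length_codons, coding_probability, min_probability=0.9):
    """Decide whether a predicted CDS is kept, following the length rules
    in the text. min_probability is a placeholder: the paper does not
    state the value used for 'high coding probability'.
    """
    if length_codons > 80:
        # First pass: long CDSs are retained regardless of probability.
        return True
    if 40 <= length_codons <= 80:
        # Second pass: short CDSs need a high coding probability.
        return coding_probability >= min_probability
    # CDSs shorter than 40 codons are left to the final BLASTX search.
    return False


if __name__ == "__main__":
    for codons, prob in [(120, 0.10), (60, 0.95), (60, 0.50), (30, 0.99)]:
        print(codons, prob, retain_cds(codons, prob))
```

Genes shorter than 40 codons are not discarded outright in the described workflow; they can still be recovered by the BLASTX search of intergenic regions, which this sketch does not model.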
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
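The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. The input layout, the function names and the use of a score column to rank hits are assumptions made for illustration; the identity threshold is applied here to the forward hit, which the text does not specify.

```python
def best_hit_table(hits):
    """Map each query to its highest-scoring (subject, identity, score).

    hits: iterable of (query, subject, percent_identity, score) tuples,
    for example parsed from tabular BLASTP output (layout is hypothetical).
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best


def bidirectional_best_hits(hits_a_vs_b, hits_b_vs_a, len_a, len_b,
                            min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return sorted (gene_a, gene_b) pairs that are reciprocal best hits
    and pass the identity and length-ratio thresholds quoted in the text.

    len_a, len_b: dicts mapping protein identifiers to lengths (aa).
    """
    best_ab = best_hit_table(hits_a_vs_b)
    best_ba = best_hit_table(hits_b_vs_a)
    pairs = []
    for gene_a, (gene_b, identity, _score) in best_ab.items():
        reciprocal = best_ba.get(gene_b)
        if reciprocal is None or reciprocal[0] != gene_a:
            continue  # not a reciprocal best hit
        ratio = len_a[gene_a] / len_b[gene_b]
        if identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy identifiers and values, not taken from the genome.
    a_vs_b = [("a1", "b1", 80.0, 300), ("a1", "b2", 60.0, 100),
              ("a2", "b2", 45.0, 90), ("a3", "b3", 70.0, 200)]
    b_vs_a = [("b1", "a1", 80.0, 300), ("b2", "a2", 45.0, 90),
              ("b3", "a3", 70.0, 200)]
    lengths_a = {"a1": 100, "a2": 100, "a3": 100}
    lengths_b = {"b1": 110, "b2": 100, "b3": 200}
    print(bidirectional_best_hits(a_vs_b, b_vs_a, lengths_a, lengths_b))
```

In the toy data only the first pair survives: the second fails the identity threshold and the third fails the length ratio, which shows how the two filters act independently of the reciprocity test.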
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList/
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa Y, Brody E and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley MM (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and Kunst F (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel JL Madoff LC Olson K Kling DE KasperDL and Ausubel FM (1992) Large identical tandemrepeating units in the C protein alpha antigen gene bca
of group B streptococci Proc Natl Acad Sci USA 8910060ndash10064
Mickelson MN (1972) Glucose degradation molar growthyields and evidence for oxidative phosphorylation in Strep-tococcus agalactiae J Bacteriol 109 96ndash105
Miller RA and Britigan BE (1997) Role of oxidants inmicrobial pathophysiology Clin Microbiol Rev 10 1ndash18
Milligan TW Doran TI Straus DC and Mattingly SJ(1978) Growth and amino acid requirements of variousstrains of group B streptococci J Clin Microbiol 7 28ndash33
Molinari G Rohde M Talay SR Chhatwal GS BeckertS and Podbielski A (2001) The role played by the groupA streptococcal negative regulator Nra on bacterial inter-actions with epithelial cells Mol Microbiol 40 99ndash114
Moszer I Jones LM Moreira S Fabry C and DanchinA (2002) SubtiList the reference database for the Bacillussubtilis genome Nucl Acids Res 30 62ndash65
Musser JM Mattingly SJ Quentin R Goudeau A andSelander RK (1989) Identification of a high-virulenceclone of type III Streptococcus agalactiae (group B Strep-tococcus) causing invasive neonatal disease Proc NatlAcad Sci USA 86 4731ndash4735
Nair S Frehel C Nguyen L Escuyer V and Berche P(1999) ClpE a novel member of the HSP100 family isinvolved in cell division and virulence of Listeria monocy-togenes Mol Microbiol 31 185ndash196
Nair S Milohanic E and Berche P (2000) ClpC ATPaseis required for cell adhesion and invasion of Listeria mono-cytogenes Infect Immun 68 7061ndash7068
Navarre WW and Schneewind O (1999) Surface proteinsof gram-positive bacteria and mechanisms of their target-ing to the cell wall envelope Microbiol Mol Biol Rev 63174ndash229
Nielsen H Brunak S and von Heijne G (1999) Machinelearning approaches for the prediction of signal peptidesand other protein sorting signals Protein Eng 12 3ndash9
Nizet V and Rubens CE (2000) Pathogenic mechanismsand virulence factors of Group B streptococci In Gram-positive pathogens Fischetti VA Novick RP FerrettiJJ Portnoy DA and Rood JI (eds) Washington DCAmerican Society for Microbiology Press
Noel GJ Katz SL and Edelson PJ (1991) The role ofC3 in mediating binding and ingestion of group B Strepto-coccus serotype III by murine macrophages Pediatr Res30 118ndash123
Novak R Charpentier E Braun JS and Tuomanen E(2000) Signal transduction by a death signal peptideuncovering the mechanism of bacterial killing by penicillinMol Cell 5 49ndash57
Osaki M Takamatsu D Shimoji Y and Sekizaki T(2002) Characterization of Streptococcus suis genesencoding proteins homologous to sortase of Gram-positivebacteria J Bacteriol 184 971ndash829
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones. J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
genome sequence was made with those of these related pathogenic streptococci and with that of the non-pathogenic species Lactococcus lactis (Bolotin et al., 2001), providing clues to the evolution of S. agalactiae and the acquisition of virulence. Besides a better knowledge of the molecular mechanisms responsible for virulence, this work should contribute to the finding of new targets for antimicrobial compounds and to the development of a GBS vaccine.
Results and discussion
General features of the genome
The genome of S. agalactiae strain NEM316 consists of a circular chromosome of 2 211 485 base pairs (bp) (Fig. 1) (EMBL accession number AL732656). Its G+C content of 35.6% is significantly lower than that of the genomes of S. pyogenes (38.5%) (Ferretti et al., 2001) and of S. pneumoniae (39.7%) (Tettelin et al., 2001), whereas L. lactis, a more distantly related species, has an almost identical G+C content of 35.4% (Bolotin et al., 2001). Seven sets of 23S, 5S and 16S ribosomal RNA operons were identified, all of which are organized within a 450 kb region located on the right replichore of the chromosome next to the origin of replication (Fig. 1). For S. pneumoniae, only four rRNA operons were reported, which are all located within a 400 kb region, three on the left replichore and one on the right one (Tettelin et al., 2001; Hoskins et al., 2001). In contrast, S. pyogenes and L. lactis each contain six rRNA operons distributed on both replichores within 750 and 920 kb regions, respectively (Bolotin et al., 2001; Ferretti et al., 2001; Smoot et al., 2002). We identified 80 tRNA genes in NEM316, a number significantly higher than the 58 tRNA genes present in S. pneumoniae, the 60 in S. pyogenes and the 62 in L. lactis. The genes coding tRNAs recognize only 31 out of 61 possible sense codons. The redundancy of tRNA genes as compared to genome size is an interesting feature of the S. agalactiae genome, the biological consequence of which is not yet known.
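The G+C percentages compared above come down to a base count over the sequence. A minimal sketch in Python, assuming the sequence is held as a plain string; the function name and the toy fragment are illustrative and are not taken from the paper:

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not bias the result.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical 12-base fragment containing four G/C bases.
toy_fragment = "ATGCATATTAGC"
print(f"G+C content: {gc_content(toy_fragment):.1f}%")
```

Applied to a complete chromosome sequence, the same count would give the genome-wide figure; ambiguous bases are deliberately excluded from the denominator here, which is one of several reasonable conventions.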
In the genome of NEM316, we identified 2082 protein coding genes and 36 pseudogenes. Forty-nine genes are triplicated, corresponding to a region present in three copies in the genome. Putative functions could be assigned to 62% of the predicted proteins; 25% are similar to unknown proteins, 4% are similar to unknown proteins from other streptococci or lactococci, and 9% are unique to S. agalactiae. Protein coding genes were classified according to the functional categories described for Bacillus subtilis and Listeria monocytogenes. A strong bias in gene orientation was observed, as 81% of the coding sequences (CDS) are transcribed in the same orientation as the movement of the replication fork. This type of gene
Fig. 1. Circular genome map of S. agalactiae NEM316 showing the position and orientation of genes. From the outside: circle 1, protein coding genes on the + and − strands, genes in yellow belonging to the 14 islands numbered from I to XIV; circle 2, GC bias (G+C/G−C); circle 3, G+C content, with <28.8% G+C in yellow, between 28.8% and 42.5% G+C in orange and >42.5% G+C in red; circle 4, stable RNA coding genes. The scale in kb is indicated on the outside of the genome, with the predicted origin of replication being at position 0.
orientation bias appears to be a common feature of low G+C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).
Bacterial–host interactions and virulence factors
S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis, previously identified (Chaffin et al., 2000), was present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes, which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.
Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21, IPXTG = 4, LPXTS = 2, LPXTN = 2, FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% similarity, respectively). In particular, these enzymes possess at the expected positions the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.
Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2 but does not encode Rib or β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 of NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
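Contiguous short repeats such as the STSA/SMSA units described for gbs1529 can be counted with a regular expression. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the toy sequence is invented rather than taken from gbs1529, and only perfect copies of the unit are recognized.

```python
import re


def longest_tandem_run(protein: str, unit_pattern: str):
    """Return (start, number_of_units, counts_per_unit) for the longest contiguous run.

    unit_pattern is a regular expression for one repeat unit and must not
    contain capturing groups, e.g. "S[TM]SA".
    """
    run_re = re.compile(f"(?:{unit_pattern})+")
    unit_re = re.compile(unit_pattern)
    best = None
    for match in run_re.finditer(protein.upper()):
        if best is None or len(match.group()) > len(best.group()):
            best = match
    if best is None:
        return None
    counts = {}
    for unit in unit_re.findall(best.group()):
        counts[unit] = counts.get(unit, 0) + 1
    return best.start(), sum(counts.values()), counts


# Invented toy sequence: 8 STSA and 2 SMSA units flanked by unrelated residues.
toy_protein = "MKK" + "STSA" * 5 + "SMSA" * 2 + "STSA" * 3 + "GGG"
print(longest_tandem_run(toy_protein, "S[TM]SA"))
```

For the protein described in the text, the two unit counts reported (121 STSA and 29 SMSA) add up to the 150 contiguous repeats stated.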
The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), as does Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases, encoded by two pairs of adjacent genes, whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.
Another strategy for retention of proteins at bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively the substrate for two key enzymes, the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears
Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.
Columns: gene name; size (aa); cleavage motif; related proteins; percentage of aa identity (similarity)/segment length (a); putative function (b).

Gene name Size (aa) Motif  Related protein                                   % id (sim)/len Putative function
gbs0391c  753       LPXTG  Sec10 (Enterococcus faecalis)                     24 (37)/715    Surface exclusion protein
gbs0392c  240       LPXTG  Plasmid-encoded protein (E. faecalis)             33 (49)/225    Unknown
gbs0393c  933       LPXTG  SpaA (S. sobrinus)                                37 (52)/405    Unknown
                           Pas (S. intermedius)                              36 (52)/399    Unknown
gbs0428   521       LPXTG  Cell surface protein (S. pneumoniae)              45 (60)/463    Unknown
gbs0470   1126      LPXTG  Alp2 (S. agalactiae)                              74 (77)/798    Unknown
                           R28 (S. pyogenes)                                 69 (75)/1103
gbs0479   253       LPXTG  Plasmid-encoded protein (E. faecalis)             33 (47)/211    Unknown
gbs0791   512       LPXTG  EaeH (Escherichia coli O157:H7)                   25 (38)/358    Adhesin
gbs1087   410       LPXTG  Antigen p200 (Babesia bigemina)                   26 (50)/273    Unknown
gbs1143   932       LPXTG  SpaA (S. sobrinus)                                38 (52)/406    Unknown
gbs1144   236       LPXTG  Plasmid-encoded protein (E. faecalis)             31 (47)/263    Unknown
gbs1145   743       LPXTG  Sec10 (E. faecalis)                               22 (40)/784    Surface exclusion protein
gbs1288   1252      LPXTG  PulA (S. pyogenes)                                65 (79)/1095   Alkaline amylopullulanase
                           PulA (S. pneumoniae)                              49 (65)/1305
gbs1356   1634      LPXTG  Ssp-5 (S. gordonii)                               30 (43)/1385   Sialic acid binding protein
                           Pas (S. intermedius)                              31 (45)/1285
gbs1420   543       LPXTG  Cell surface protein (S. mutans)                  50 (62)/183    Unknown
                           CbpD (S. pneumoniae)                              30 (60)/220    Choline binding protein
gbs1474   308       LPXTG  Cell surface protein (S. pneumoniae)              31 (45)/160    Unknown
gbs1529   1310      LPXTG  Hsa (S. gordonii)                                 50 (60)/1314   Sialic acid binding protein
                           SrpA (S. cristatus)                               43 (53)/1248
gbs1539   192       LPXTG  No homology in public databases                   -              Unknown
gbs1540   680       LPXTG  AmiC (S. pyogenes)                                36 (54)/478    Amidase
                           YbgE (L. lactis)                                  35 (54)/492
gbs1929   800       LPXTG  CpdB (S. dysgalactiae)                            57 (70)/694    Cyclo-nucleotide phosphodiesterase
                           YfkN (B. subtilis)                                47 (66)/630
gbs2008   1570      LPXTG  PrtS (S. thermophilus)                            49 (65)/1596   Serine proteinase
gbs2018   643       LPXTG  M-like protein (S. equi)                          31 (46)/302    Unknown
                           PspC (S. pneumoniae)                              23 (38)/795    Adhesin
gbs1478   901       IPXTG  PFBP (S. pyogenes)                                32 (46)/176    Fibronectin-binding protein
                           Cell surface protein (S. pneumoniae)              40 (55)/901    Unknown
gbs0628   554       IPXTG  Hypothetical protein (Lactobacillus leichmannii)  23 (35)/554    Unknown
                           Cell surface protein (S. pneumoniae)              27 (42)/512
gbs0629   307       IPXTG  No homology in public databases                   -              Unknown
gbs1477   674       IPXTG  No homology in public databases                   -              Unknown
gbs0451   1233      LPXTS  ScpB (S. agalactiae)                              38 (55)/1194   Serine protease
gbs0456   1055      LPXTS  SPy0843 (S. pyogenes)                             72 (81)/1050   Unknown
                           BspA (Bacteroides forsythus)                      24 (41)/566    Unknown
gbs1308   1150      LPXTN  ScpB (S. agalactiae)                              99 (99)/1150   C5a peptidase
gbs1403   690       LPXTN  SPy0872 (S. pyogenes)                             60 (74)/688    Secreted 5′-nucleotidase
gbs0632   890       FPKTG  Cell surface protein (S. pneumoniae)              51 (66)/890    Unknown

a. Only the homology scores with a sum probability < e−10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refer.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.
dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglycerol transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).
Lifestyle of S. agalactiae as revealed by metabolism
Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for a NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.
Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization and not in bypassing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.
Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits, ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called ‘oxidative burst’ (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G+C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in a S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element is flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to localize its extremities accurately. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in a circular form in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile element typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the scpB gene. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomic analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered orthologous to genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between the two genomes (Fig. 3). In contrast, gene order is poorly conserved between S. agalactiae and S. pneumoniae, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
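The notion of a synteny breakpoint can be made concrete with a small sketch. This is not the procedure used for Fig. 3, whose exact breakpoint definition is not given here. It assumes that each ortholog pair is reduced to a pair of chromosomal positions, and it counts a breakpoint whenever two orthologs that are neighbours in the first genome are not neighbours (in either orientation) in the second. The function name and input layout are illustrative, and the circularity of the chromosome is ignored.

```python
def count_breakpoints(pairs):
    """Count synteny breakpoints between two genomes.

    pairs: iterable of (pos_a, pos_b) tuples, one per ortholog pair,
    giving the position of the gene in genome A and in genome B.
    Positions are assumed to be distinct within each genome.
    """
    # Order the ortholog pairs along genome A.
    ordered = sorted(pairs)
    # Convert genome B positions to ranks among the orthologs only,
    # so that genes without an ortholog do not create breakpoints.
    b_sorted = sorted(b for _, b in ordered)
    rank_b = {b: i for i, b in enumerate(b_sorted)}
    ranks = [rank_b[b] for _, b in ordered]
    # Neighbours in A that are not adjacent in B mark a breakpoint.
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(y - x) != 1)


# Hypothetical toy data: a collinear set, and the same set with an
# inverted internal block (two breakpoints, one at each end of it).
collinear = [(1, 10), (2, 20), (3, 30), (4, 40)]
inverted = [(1, 10), (2, 20), (3, 50), (4, 40), (5, 30), (6, 60)]
```

Under this definition a single inversion contributes two breakpoints, so a low count such as the one reported for S. pyogenes corresponds to a small number of rearrangement events.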
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at the pNEM316-1 insertion sites are boxed. Co-ordinates refer to position +1 of the NEM316 chromosome.
Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, suggesting a role in specific adaptation.
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and between S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.
| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF-type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824 (III), 711824–758891 (VII), 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |
a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.
Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may be the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands show a composite architecture, as the islands seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
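The G + C argument above rests on a simple quantity that can be computed for any region and compared with the chromosome-wide value. The sketch below is a minimal illustration, not the authors' tool: the function name is hypothetical, and the choice to ignore ambiguous bases (such as N) is an assumption.

```python
def gc_percent(seq):
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so runs of N
    or other ambiguity codes do not dilute the result.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted
```

A region such as the lac operon would be flagged as atypical when `gc_percent(region)` is clearly above `gc_percent(chromosome)`. In practice the comparison is usually made over windows of fixed size so that short genes do not produce noisy estimates.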
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
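The partition of genes lacking an S. pyogenes ortholog into islands and islets can be pictured with the following sketch. It is a deliberate simplification and not the published procedure: it treats each region as an uninterrupted run of consecutive genes without an ortholog, whereas the islands described here also contain some genes that do have orthologs, and it labels a run as an island as soon as one of its genes is flagged as related to mobile elements. The tuple layout and the function name are hypothetical.

```python
from itertools import groupby


def regions_without_ortholog(genes):
    """Group consecutive genes lacking an ortholog into regions.

    genes: list of (gene_id, has_ortholog, mobile_related) tuples in
    chromosomal order. Returns one dict per region, labelled "island"
    if any gene in it is related to mobile elements, else "islet".
    """
    regions = []
    # groupby splits the ordered gene list into maximal runs that
    # share the same has_ortholog flag.
    for has_ortholog, run in groupby(genes, key=lambda g: g[1]):
        if has_ortholog:
            continue
        members = list(run)
        kind = "island" if any(g[2] for g in members) else "islet"
        regions.append({"genes": [g[0] for g in members], "kind": kind})
    return regions


# Hypothetical toy chromosome with five genes.
toy = [
    ("g1", True, False),
    ("g2", False, False),
    ("g3", False, True),   # e.g. an integrase
    ("g4", True, False),
    ("g5", False, False),
]
```

Size alone would not separate the two groups, because the reported ranges overlap (11–77 genes for islands, 1–17 genes for islets); this is why the sketch keys the label on mobile-element content rather than on length.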
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for the induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may be competent under certain growth conditions and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the streptococcal genomes determined up to now but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, which harbours an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration in the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions, using primers designed by Consed. Physical gaps were closed by combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
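The relation between read count, read length and coverage can be checked with a short calculation. The sketch below assumes the standard definition, coverage = number of reads × mean read length / genome length, reads the reported coverage as 8.5-fold, and uses the final genome length of 2 211 485 bp, which was not yet known at the initial assembly stage. The mean read length it returns is therefore an inference rather than a value reported in the paper, and the function names are illustrative.

```python
def implied_mean_read_length(coverage, genome_bp, n_reads):
    """Mean read length (bp) implied by a given fold coverage."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return coverage * genome_bp / n_reads


def fold_coverage(n_reads, mean_read_bp, genome_bp):
    """Fold coverage obtained from n_reads reads of a given mean length."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return n_reads * mean_read_bp / genome_bp


# 34 431 assembled sequences at 8.5-fold coverage of a 2 211 485 bp genome
# imply a mean usable read length of roughly 546 bp.
mean_len = implied_mean_read_length(8.5, 2_211_485, 34_431)
```

A mean read length of this order is in the range typically obtained from capillary sequencers of that period, which is consistent with the coverage figure being 8.5-fold.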
Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched for with tRNAscan-SE (Lowe and Eddy, 1997).
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d’Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bidirectional best hits in BLASTP comparisons of the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
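The ortholog definition given above (bidirectional best BLASTP hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be expressed as a short filter. The sketch below is an illustration under stated assumptions and not the authors' implementation: hits are supplied as plain tuples rather than parsed from BLAST output, the best hit is taken to be the one with the highest score, identity is read from the first direction only, and all function and parameter names are hypothetical.

```python
def best_hits(hits):
    """Keep the highest-scoring hit for each query.

    hits: iterable of (query, subject, pct_identity, score) tuples.
    Returns {query: (subject, pct_identity, score)}.
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best


def orthologs(hits_ab, hits_ba, len_a, len_b,
              min_identity=50.0, ratio_range=(0.8, 1.25)):
    """Return sorted (a, b) pairs that are bidirectional best hits
    and pass the identity and length-ratio thresholds.

    len_a, len_b: dicts mapping protein id to length in amino acids.
    """
    best_ab = best_hits(hits_ab)
    best_ba = best_hits(hits_ba)
    low, high = ratio_range
    pairs = []
    for a, (b, identity, _score) in best_ab.items():
        # Reciprocity: the best hit of b must point back to a.
        if b not in best_ba or best_ba[b][0] != a:
            continue
        ratio = len_a[a] / len_b[b]
        if identity >= min_identity and low <= ratio <= high:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical toy data: only a1/b1 passes every criterion.
hits_ab = [("a1", "b1", 80.0, 200), ("a1", "b2", 60.0, 100),
           ("a2", "b2", 45.0, 90), ("a3", "b3", 70.0, 150)]
hits_ba = [("b1", "a1", 80.0, 200), ("b2", "a2", 45.0, 90),
           ("b3", "a3", 70.0, 150)]
len_a = {"a1": 300, "a2": 200, "a3": 100}
len_b = {"b1": 310, "b2": 210, "b3": 200}
```

Because 1/0.8 = 1.25, the length criterion is symmetric: it gives the same answer whichever of the two proteins is placed in the numerator.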
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and the Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d’Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D’Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett’s Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and K (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen MJ Lam AC Antonio M and Dunbar K (2001)An embarrassment of sortases ndash a richness of substratesTrends Microbiol 9 97ndash102
Perna NT Plunkett G 3rd Burland V Mau B GlasnerJD Rose DJ et al (2001) Genome sequence of entero-haemorrhagic Escherichia coli O157 H7 Nature 409529ndash533
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
orientation bias appears to be a common feature of low G + C Gram-positive bacteria. The origin of replication was predicted upstream from dnaA by similarity to the location of oriC in B. subtilis. Analysis of the GC skew (Fig. 1), combined with the analysis of the orientation of the CDSs and the position of the B. subtilis ripX homologue, encoding a protein involved in Holliday junction resolution, allowed us to predict the region of termination of replication at around kb 1015. The location of the replication terminus appears to be skewed from the expected position at 180° from oriC, resulting in a 174 kb shorter right replichore of the chromosome, which contains the seven rDNA operons (Fig. 1).
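The GC-skew reasoning in this paragraph can be illustrated with a minimal sketch. The text does not describe the exact procedure used, so the following is an assumption: the skew (G − C)/(G + C) is accumulated over non-overlapping windows, and the minimum and maximum of the cumulative curve are taken as rough estimates of the origin and terminus respectively (the usual convention when the leading strand is G-rich). The function names and the window size are hypothetical, and the input here is a synthetic sequence, not the NEM316 chromosome.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return (window end positions, cumulative GC skew) for non-overlapping windows."""
    seq = seq.upper()
    positions, cumulative = [], []
    total = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Skew of one window; defined as 0 when the window has no G or C.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        positions.append(start + window)
        cumulative.append(total)
    return positions, cumulative


def predict_ori_ter(seq, window=1000):
    """Estimate origin (cumulative minimum) and terminus (cumulative maximum)."""
    positions, cumulative = cumulative_gc_skew(seq, window)
    ori = positions[cumulative.index(min(cumulative))]
    ter = positions[cumulative.index(max(cumulative))]
    return ori, ter


# Synthetic 20 kb "chromosome": C-rich, then G-rich, then C-rich again.
toy = "AC" * 2500 + "AG" * 5000 + "AC" * 2500
print(predict_ori_ter(toy, window=1000))
```

On real data the skew signal is noisy, which is presumably why the authors combined it with CDS orientation and the position of the ripX homologue rather than relying on the skew alone.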
Bacterial–host interactions and virulence factors
S. agalactiae expresses a variety of extracellular products which are implicated in virulence. Among these are the capsular polysaccharide, surface proteins and secreted proteins. S. agalactiae possesses two distinct polysaccharide antigens, of which the sialylated capsular polysaccharide is one of the most important S. agalactiae virulence factors. It prevents deposition of the host complement factor C3b and inhibits complement-mediated opsonophagocytosis. The locus involved in type III capsule synthesis previously identified (Chaffin et al., 2000) was present in NEM316 in its entirety. It contains 17 genes (cpsA-L, neuBCDA), including the transcriptional regulatory gene cpsY (Nizet and Rubens, 2000). The second important polysaccharide, the group B antigen, is composed of a number of rhamnose units and is common to all strains (Yamamoto et al., 1999). The genetic determinants involved in the biosynthesis of the group B antigen have not yet been characterized. NEM316 contains a cluster of 16 genes, which includes six involved in rhamnose metabolism (dTDP-L-rhamnose synthases and rhamnosyltransferases) (gbs1481, gbs1484, gbs1485, gbs1492, gbs1493 and gbs1494). A second cluster of three genes (gbs1271-gbs1273), which are paralogs of the rml genes of Streptococcus mutans, is probably involved in the anabolism of dTDP-L-rhamnose. It seems probable that one or both gene clusters are necessary for the synthesis of the rhamnose-containing group B-specific cell wall polysaccharide antigen.
Surface proteins of pathogenic bacteria play an important role during the infectious process by mediating interactions between the pathogen and the host cells and/or evasion from the host defence (Navarre and Schneewind, 1999). Based on sequence analysis, we predicted 30 open reading frames in NEM316 encoding putative surface proteins bearing a cell wall sorting signal motif (LPXTG = 21, IPXTG = 4, LPXTS = 2, LPXTN = 2, FPXTG = 1) (Table 1). Among these proteins, 13 have predicted functions (three peptidases, two nucleases, one amidase, one pullulanase and six adhesins), whereas 17 have no obvious function (Table 1). Only four of these proteins have an ortholog in S. pyogenes and three in S. pneumoniae. Six do not possess a streptococcal homologue and thus seem to be specific for S. agalactiae. Therefore, each of these pathogens possesses a specific repertoire of LPXTG proteins. Most S. agalactiae strains express the cell wall-anchored C5a peptidase ScpB, which cleaves the complement factor C5a, the major neutrophil chemoattractant produced by activation of the complement cascade (Chmouryguina et al., 1996; Takahashi et al., 1999). Interestingly, besides ScpB, NEM316 encodes two additional cell wall-bound serine proteases (gbs0451 and gbs2008) that are similar to the C5a peptidase (55% and 49% similarity respectively). In particular, these enzymes possess, at the expected positions, the catalytic triad (D, H, S) postulated to constitute the active site (Krem and Di Cera, 2001). However, it remains to be demonstrated whether these ScpB-like enzymes constitute functional serine proteases involved in virulence.
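As an illustration of how the motif classes listed above could be tallied, the following sketch scans the C-terminal region of a protein for one of the five sorting-signal variants named in the text. It is a simplification under stated assumptions: a genuine cell wall sorting signal also requires a hydrophobic stretch and a charged tail after the motif, which are not checked here, and the 60-residue tail length and function names are hypothetical choices, not the authors' method.

```python
import re
from collections import Counter

# The five motif variants reported in the text; "." stands for the variable residue X.
SORTING_MOTIF = re.compile(r"LP.TG|IP.TG|LP.TS|LP.TN|FP.TG")


def classify_sorting_signal(protein, tail=60):
    """Return the motif class (e.g. 'LPXTG') found in the last `tail` residues, or None."""
    region = protein[-tail:].upper()
    hits = [m.group(0) for m in SORTING_MOTIF.finditer(region)]
    if not hits:
        return None
    last = hits[-1]  # keep the match closest to the C-terminus
    return last[:2] + "X" + last[3:]


def tally_sorting_signals(proteins):
    """Count motif classes over a collection of protein sequences."""
    classes = (classify_sorting_signal(p) for p in proteins)
    return Counter(c for c in classes if c is not None)


# Synthetic examples, not real NEM316 sequences.
examples = [
    "M" + "A" * 100 + "LPKTG" + "EEE" + "V" * 20 + "RRK",
    "M" + "A" * 100 + "IPQTG" + "DD" + "L" * 18 + "KRK",
    "LPATG" + "A" * 200,  # motif too far from the C-terminus to count
]
print(tally_sorting_signals(examples))
```

Restricting the search to the C-terminal tail reflects the statement that the sorting signal is C-terminal; a whole-protein search would report many chance matches.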
Streptococcus agalactiae encodes immunoprotective surface proteins (Rib, α-C and β-C), each containing variable series of tandem repeat units (Heden et al., 1991; Michel et al., 1992; Wastfelt et al., 1996; Lachenauer et al., 2000). NEM316 codes for the α-C-like protein Alp2, but encodes neither Rib nor β-C, which indicates that the strain belongs to the molecular subserotype III-3 (Kong et al., 2002). Sequence comparison revealed that the gene encoding Alp2 in a serotype V strain (Lachenauer et al., 2000) was probably derived from the gene encoding Alp2 of NEM316 by homologous recombination between the internal repeated sequences. Variations in the number of repeats of this protein can change its antigenicity, a mechanism to escape host immunity (Gravekamp et al., 1996; Lachenauer et al., 2000). Two additional unrelated surface proteins (gbs1529 and gbs1087) possessing numerous internal direct repeats have been characterized in NEM316. The protein gbs1087 (410 aa long) contains 16 contiguous copies of the motif LERRQRDAENRKSQGNV, but the function of this protein, which does not have any streptococcal homolog, is not known. The protein gbs1529 (1310 aa long) contains in its C-terminal half 150 contiguous repeats of the motifs STSA (121 motifs) or SMSA (29 motifs). This protein displays significant similarities with Hsa (GspB) of Streptococcus gordonii, a protein associated with bacterial haemagglutination and adhesion to α-2,3-linked sialic acid-containing receptors (Takahashi et al., 2002).
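The repeat counts quoted for gbs1529 (121 STSA plus 29 SMSA units, 150 in total) are the kind of figure that can be obtained by locating the longest contiguous run of a short motif. The sketch below shows one way to do this with a regular expression; the function name is hypothetical, the approach is an assumption rather than the authors' procedure, and the test sequence is synthetic.

```python
import re
from collections import Counter


def longest_repeat_run(protein, unit_pattern=r"S[TM]SA"):
    """Return (number of units, per-unit counts) for the longest contiguous run."""
    runs = re.finditer(rf"(?:{unit_pattern})+", protein)
    best = max((m.group(0) for m in runs), key=len, default="")
    units = re.findall(unit_pattern, best)
    return len(units), Counter(units)


# Synthetic protein with a short STSA/SMSA repeat tract.
toy = "MKK" + "STSA" * 3 + "SMSA" * 2 + "STSA" + "GGG"
n_units, per_unit = longest_repeat_run(toy)
print(n_units, dict(per_unit))
```

The same function can be pointed at a longer unit, for example the 17-residue gbs1087 motif, by passing it as `unit_pattern`.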
The so-called LPXTG proteins are covalently linked to the bacterial cell wall by a transpeptidylation mechanism requiring a C-terminal sorting signal with a conserved LPXTG motif. The enzyme that catalyses the protease-transpeptidase activity is a membrane-associated protein named sortase (Srt) (Pallen et al., 2001). NEM316 encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), as does Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases, encoded by two pairs of adjacent genes, whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.
Another strategy for retention of proteins to bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively the substrate for two key enzymes: the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears
Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.

Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b)
gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein
gbs0392 (c) | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown
gbs0393 (c) | 933 | LPXTG | SpaA (S. sobrinus) | 37 (52)/405 | Unknown
 | | | Pas (S. intermedius) | 36 (52)/399 | Unknown
gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown
gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae) | 74 (77)/798 | Unknown
 | | | R28 (S. pyogenes) | 69 (75)/1103 |
gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown
gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin
gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown
gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown
gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown
gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein
gbs1288 | 1252 | LPXTG | PulA (S. pyogenes) | 65 (79)/1095 | Alkaline amylopullulanase
 | | | PulA (S. pneumoniae) | 49 (65)/1305 |
gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii) | 30 (43)/1385 | Sialic acid binding protein
 | | | Pas (S. intermedius) | 31 (45)/1285 |
gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans) | 50 (62)/183 | Unknown
 | | | CbpD (S. pneumoniae) | 30 (60)/220 | Choline binding protein
gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown
gbs1529 | 1310 | LPXTG | Hsa (S. gordonii) | 50 (60)/1314 | Sialic acid binding protein
 | | | SrpA (S. cristatus) | 43 (53)/1248 |
gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown
gbs1540 | 680 | LPXTG | AmiC (S. pyogenes) | 36 (54)/478 | Amidase
 | | | YbgE (L. lactis) | 35 (54)/492 |
gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae) | 57 (70)/694 | Cyclo-nucleotide phosphodiesterase
 | | | YfkN (B. subtilis) | 47 (66)/630 |
gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase
gbs2018 | 643 | LPXTG | M-like protein (S. equi) | 31 (46)/302 | Unknown
 | | | PspC (S. pneumoniae) | 23 (38)/795 | Adhesin
gbs1478 | 901 | IPXTG | PFBP (S. pyogenes) | 32 (46)/176 | Fibronectin-binding protein
 | | | Cell surface protein (S. pneumoniae) | 40 (55)/901 | Unknown
gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii) | 23 (35)/554 | Unknown
 | | | Cell surface protein (S. pneumoniae) | 27 (42)/512 |
gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown
gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown
gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease
gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes) | 72 (81)/1050 | Unknown
 | | | BspA (Bacteroides forsythus) | 24 (41)/566 | Unknown
gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase
gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5′-nucleotidase
gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown

a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refers.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.
dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, that enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000) is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).
Lifestyle of S. agalactiae as revealed by metabolism
Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for a NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.
Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present, and that the pentose phosphate pathway is only involved in pentose and gluconate utilization and not in by-passing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.
Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding
intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to rapidly adapt to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits, ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF-subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulator systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters in comparison to these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in a S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 16 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 355 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently
truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
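The notion of a synteny breakpoint used above can be made concrete with a small sketch. The paper does not state how the 36 breakpoints were counted, so the definition here is an assumption: orthologs are listed in the order in which they occur in one genome, each is replaced by its rank in the other genome, and a breakpoint is scored wherever two neighbouring orthologs are not also neighbours (in either orientation) in the second genome. The function name is hypothetical, and the circularity of the chromosome is ignored for simplicity.

```python
def count_synteny_breakpoints(order_in_b):
    """Count adjacent ortholog pairs in genome A that are not adjacent in genome B.

    `order_in_b` lists, for orthologs taken in genome A order, their rank in genome B.
    """
    breakpoints = 0
    for prev, curr in zip(order_in_b, order_in_b[1:]):
        # Ranks differing by exactly 1 are neighbours in B (same or inverted orientation).
        if abs(curr - prev) != 1:
            breakpoints += 1
    return breakpoints


# Fully collinear gene order: no breakpoints.
print(count_synteny_breakpoints([0, 1, 2, 3, 4, 5]))
# A single inversion of genes 2 to 4 creates a breakpoint at each end of the inverted block.
print(count_synteny_breakpoints([0, 1, 4, 3, 2, 5]))
```

Under this definition a single inversion contributes two breakpoints, so a low count over more than a thousand orthologs implies that only a small number of rearrangements separate the two gene orders.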
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.

Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor as well as five two-component regulatory systems and one Rgg-like regulator are located on such islands, indicating their role in specific adaptation.
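The grouping of genes lacking an S. pyogenes ortholog into contiguous regions can be sketched as a run-length grouping along the gene order. This is only an illustration of the idea: the input flags are invented, wrap-around on the circular chromosome is ignored, and the separation of islands from islets relies on mobile-element content, which this sketch does not attempt.

```python
from typing import List


def cluster_runs(has_ortholog: List[bool]) -> List[List[int]]:
    """Group consecutive genes lacking an ortholog.

    has_ortholog[i] is True when gene i (in chromosomal order) has an
    ortholog in the reference genome. Returns one list of gene indices
    per uninterrupted run of genes without an ortholog.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for index, flag in enumerate(has_ortholog):
        if not flag:
            current.append(index)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


# Invented flags for ten consecutive genes.
flags = [True, False, False, True, False, True, True, False, False, False]
regions = cluster_runs(flags)
print(len(regions), [len(region) for region in regions])
```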
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and between S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.
Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -
a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
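The G + C comparison used above as evidence of exogenous origin rests on a simple per-sequence calculation, sketched here. The function name and the short test sequence are illustrative only; ambiguous bases such as N are excluded from the denominator, which is an assumption of this sketch rather than a stated choice of the authors.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage,
    counting only unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Invented sequence; the two N residues are ignored.
print(gc_percent("GGCCAATTNN"))
```

In practice the value for a gene or operon would be compared with the value computed over the whole chromosome in the same way.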
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for the induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
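As a rough consistency check on these assembly statistics, the mean read length implied by the number of sequences and the coverage can be back-computed from coverage = reads × mean read length / genome size. The sketch below assumes the coverage figure is to be read as 8.5-fold and uses the genome length given in the Summary; the mean read length itself is not reported in the text, so the result is an inference, not a stated value.

```python
def implied_mean_read_length(genome_bp: int, n_reads: int, coverage: float) -> float:
    """Mean read length (bp) implied by
    coverage = n_reads * mean_read_length / genome_bp."""
    if n_reads <= 0 or genome_bp <= 0:
        raise ValueError("genome_bp and n_reads must be positive")
    return coverage * genome_bp / n_reads


GENOME_BP = 2_211_485  # genome length reported in the Summary
N_READS = 34_431       # sequences used in the initial assembly
COVERAGE = 8.5         # assumption: coverage figure read as 8.5-fold

print(round(implied_mean_read_length(GENOME_BP, N_READS, COVERAGE)))
```

A value of a few hundred base pairs per read is in the range expected for capillary sequencing of that period.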
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
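Basic properties of such primers can be checked with a short sketch. It uses the Wallace rule, Tm = 2(A + T) + 4(G + C), which is a crude estimate intended for short oligonucleotides and is applied here only as an illustration; no annealing temperature for these PCRs is given in the text, and the function names are hypothetical.

```python
def gc_count(primer: str) -> int:
    """Number of G and C residues in a primer."""
    return sum(1 for base in primer.upper() if base in "GC")


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (degrees Celsius) by the Wallace rule:
    2 * (A + T) + 4 * (G + C)."""
    gc = gc_count(primer)
    at = len(primer) - gc
    return 2 * at + 4 * gc


# Primer pair reported for the att site of circular pNEM316-1.
att_forward = "CAATAACCATTCTGTAGATCCTTC"
att_reverse = "TTTAACCAGTTCAAACGAAAGT"
for primer in (att_forward, att_reverse):
    print(len(primer), gc_count(primer), wallace_tm(primer))
```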
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
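The two-tier length filter described above can be sketched as follows. The 80-codon and 40-codon limits come from the text; the numerical cut-off for a "high coding probability" is not given, so `prob_cutoff` and the `coding_probability` field are hypothetical placeholders, as are the candidate names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateCDS:
    name: str
    codons: int
    coding_probability: float  # hypothetical score in [0, 1]


def retain_cds(candidates: List[CandidateCDS],
               long_limit: int = 80,
               short_limit: int = 40,
               prob_cutoff: float = 0.9) -> List[CandidateCDS]:
    """Keep CDSs longer than long_limit codons, plus CDSs of
    short_limit..long_limit codons with a high coding probability."""
    kept = []
    for cds in candidates:
        if cds.codons > long_limit:
            kept.append(cds)
        elif short_limit <= cds.codons <= long_limit and cds.coding_probability >= prob_cutoff:
            kept.append(cds)
    return kept


# Invented candidates for illustration.
candidates = [
    CandidateCDS("orf_a", 250, 0.40),
    CandidateCDS("orf_b", 60, 0.95),
    CandidateCDS("orf_c", 60, 0.50),
    CandidateCDS("orf_d", 30, 0.99),
]
print([cds.name for cds in retain_cds(candidates)])
```

Candidates shorter than 40 codons are dropped here; in the procedure above such genes are recovered separately by BLASTX searches of intergenic regions.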
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v. 2.0 was used to predict signal peptide regions (Nielsen et al., 1999), and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v. 2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v. 2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
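The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as below. This is not the authors' pipeline: the `Hit` record, the use of a score to pick the best hit, and the choice to apply the thresholds after the reciprocal test are all assumptions of this sketch, and the gene names are invented.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # percent sequence identity
    query_len: int    # protein length of the query (aa)
    subject_len: int  # protein length of the subject (aa)
    score: float      # alignment score used to rank hits


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-scoring hit for each query."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.score > best[hit.query].score:
            best[hit.query] = hit
    return best


def passes_thresholds(hit: Hit, min_identity: float = 50.0,
                      min_ratio: float = 0.8, max_ratio: float = 1.25) -> bool:
    """Apply the identity and length-ratio thresholds."""
    ratio = hit.query_len / hit.subject_len
    return hit.identity >= min_identity and min_ratio <= ratio <= max_ratio


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Dict[str, str]:
    """Return {gene in A: gene in B} for bi-directional best hits that
    also satisfy the thresholds."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    orthologs: Dict[str, str] = {}
    for gene_a, hit in best_ab.items():
        back = best_ba.get(hit.subject)
        if back is not None and back.subject == gene_a and passes_thresholds(hit):
            orthologs[gene_a] = hit.subject
    return orthologs


# Invented hits: a1/b1 qualify; a2/b3 are reciprocal but below 50% identity.
a_vs_b = [Hit("a1", "b1", 72.0, 300, 310, 500.0),
          Hit("a1", "b2", 40.0, 300, 150, 90.0),
          Hit("a2", "b3", 45.0, 200, 205, 120.0)]
b_vs_a = [Hit("b1", "a1", 72.0, 310, 300, 500.0),
          Hit("b3", "a2", 45.0, 205, 200, 120.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

As noted in the comparative analysis, genes acquired by horizontal transfer can still satisfy such a criterion, so a reciprocal best hit does not by itself establish vertical descent.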
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList/
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
encodes five putative sortases (SrtA, gbs0630, gbs0631, gbs1476 and gbs1475), like Streptococcus suis (Osaki et al., 2002), and a sortase pseudogene (gbs0633). SrtA (gbs0949) is a class A sortase, as defined by Ilangovan et al. (2001), and its role in the virulence of numerous Gram-positive pathogens has been clearly demonstrated (Mazmanian et al., 2000; Bolken et al., 2001; Bierne et al., 2002; Garandeau et al., 2002). As in other streptococci, srtA is located downstream from the housekeeping gene gyrA, encoding the DNA gyrase subunit A. The four remaining enzymes are class B sortases encoded by two pairs of adjacent genes whose function has not yet been identified. Each pair of class B sortase genes might be part of an operon which includes three genes, two upstream and one downstream, encoding LPXTG proteins. Possibly, these secondary sortases possess different substrate specificities and may be required for the processing of the LPXTG proteins belonging to the same locus.
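As an illustration of how the cleavage motifs listed in Table 1 can be recognised, the following minimal sketch scans the C-terminal region of a protein for the five motif variants (LPXTG, IPXTG, LPXTS, LPXTN and FPKTG). The function name, the 50-residue window and the toy sequence are assumptions made for illustration only; they are not the procedure used for the annotation, and a complete sorting-signal search would also have to check the hydrophobic domain and charged tail that follow the motif.

```python
import re

# Motif variants reported in Table 1; X stands for any residue.
MOTIFS = {
    "LPXTG": "LP.TG",
    "IPXTG": "IP.TG",
    "LPXTS": "LP.TS",
    "LPXTN": "LP.TN",
    "FPKTG": "FPKTG",
}

def find_cleavage_motif(protein, window=50):
    """Return (motif_name, position) of the last motif hit in the
    C-terminal window of the protein, or None if no motif is found."""
    start = max(0, len(protein) - window)
    tail = protein[start:]
    best = None
    for name, pattern in MOTIFS.items():
        for match in re.finditer(pattern, tail):
            pos = start + match.start()
            if best is None or pos > best[1]:
                best = (name, pos)
    return best

# Toy sequence (hypothetical, not a NEM316 protein).
toy = "M" + "A" * 60 + "LPKTGEAVLLGAGLLAAKRRK"
print(find_cleavage_motif(toy))  # ('LPXTG', 61)
```

Restricting the search to the C-terminal window reflects the position of sorting signals at the end of the protein; the window length is a tunable assumption.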
Table 1. S. agalactiae NEM316 proteins containing cell wall sorting signals.
Gene name | Size (aa) | Cleavage motif | Related proteins | Percentage of aa identity (similarity)/segment length (a) | Putative function (b)
gbs0391 (c) | 753 | LPXTG | Sec10 (Enterococcus faecalis) | 24 (37)/715 | Surface exclusion protein
gbs0392 (c) | 240 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (49)/225 | Unknown
gbs0393 (c) | 933 | LPXTG | SpaA (S. sobrinus) | 37 (52)/405 | Unknown
 | | | Pas (S. intermedius) | 36 (52)/399 | Unknown
gbs0428 | 521 | LPXTG | Cell surface protein (S. pneumoniae) | 45 (60)/463 | Unknown
gbs0470 | 1126 | LPXTG | Alp2 (S. agalactiae) | 74 (77)/798 | Unknown
 | | | R28 (S. pyogenes) | 69 (75)/1103 |
gbs0479 | 253 | LPXTG | Plasmid-encoded protein (E. faecalis) | 33 (47)/211 | Unknown
gbs0791 | 512 | LPXTG | EaeH (Escherichia coli O157:H7) | 25 (38)/358 | Adhesin
gbs1087 | 410 | LPXTG | Antigen p200 (Babesia bigemina) | 26 (50)/273 | Unknown
gbs1143 | 932 | LPXTG | SpaA (S. sobrinus) | 38 (52)/406 | Unknown
gbs1144 | 236 | LPXTG | Plasmid-encoded protein (E. faecalis) | 31 (47)/263 | Unknown
gbs1145 | 743 | LPXTG | Sec10 (E. faecalis) | 22 (40)/784 | Surface exclusion protein
gbs1288 | 1252 | LPXTG | PulA (S. pyogenes) | 65 (79)/1095 | Alkaline amylopullulanase
 | | | PulA (S. pneumoniae) | 49 (65)/1305 |
gbs1356 | 1634 | LPXTG | Ssp-5 (S. gordonii) | 30 (43)/1385 | Sialic acid binding protein
 | | | Pas (S. intermedius) | 31 (45)/1285 |
gbs1420 | 543 | LPXTG | Cell surface protein (S. mutans) | 50 (62)/183 | Unknown
 | | | CbpD (S. pneumoniae) | 30 (60)/220 | Choline binding protein
gbs1474 | 308 | LPXTG | Cell surface protein (S. pneumoniae) | 31 (45)/160 | Unknown
gbs1529 | 1310 | LPXTG | Hsa (S. gordonii) | 50 (60)/1314 | Sialic acid binding protein
 | | | SrpA (S. cristatus) | 43 (53)/1248 |
gbs1539 | 192 | LPXTG | No homology in public databases | | Unknown
gbs1540 | 680 | LPXTG | AmiC (S. pyogenes) | 36 (54)/478 | Amidase
 | | | YbgE (L. lactis) | 35 (54)/492 |
gbs1929 | 800 | LPXTG | CpdB (S. dysgalactiae) | 57 (70)/694 | Cyclo-nucleotide phosphodiesterase
 | | | YfkN (B. subtilis) | 47 (66)/630 |
gbs2008 | 1570 | LPXTG | PrtS (S. thermophilus) | 49 (65)/1596 | Serine proteinase
gbs2018 | 643 | LPXTG | M-like protein (S. equi) | 31 (46)/302 | Unknown
 | | | PspC (S. pneumoniae) | 23 (38)/795 | Adhesin
gbs1478 | 901 | IPXTG | PFBP (S. pyogenes) | 32 (46)/176 | Fibronectin-binding protein
 | | | Cell surface protein (S. pneumoniae) | 40 (55)/901 | Unknown
gbs0628 | 554 | IPXTG | Hypothetical protein (Lactobacillus leichmannii) | 23 (35)/554 | Unknown
 | | | Cell surface protein (S. pneumoniae) | 27 (42)/512 |
gbs0629 | 307 | IPXTG | No homology in public databases | | Unknown
gbs1477 | 674 | IPXTG | No homology in public databases | | Unknown
gbs0451 | 1233 | LPXTS | ScpB (S. agalactiae) | 38 (55)/1194 | Serine protease
gbs0456 | 1055 | LPXTS | SPy0843 (S. pyogenes) | 72 (81)/1050 | Unknown
 | | | BspA (Bacteroides forsythus) | 24 (41)/566 | Unknown
gbs1308 | 1150 | LPXTN | ScpB (S. agalactiae) | 99 (99)/1150 | C5a peptidase
gbs1403 | 690 | LPXTN | SPy0872 (S. pyogenes) | 60 (74)/688 | Secreted 5'-nucleotidase
gbs0632 | 890 | FPKTG | Cell surface protein (S. pneumoniae) | 51 (66)/890 | Unknown
a. Only the homology scores with a sum probability < e-10 were considered as significant. The segment length indicates the number of amino acids of the region to which the percentage identity and similarity refer.
b. The putative function was inferred by analogy with that of homologous cognate proteins in the public databases.
c. Encoded by the integrated plasmid.
Another strategy for retention of proteins at bacterial membranes involves a biochemical pathway of lipid modification. All lipoproteins possess a characteristic amino-terminal cysteine-containing lipobox sequence, which is successively a substrate for two key enzymes: the prolipoprotein diacylglyceryl transferase (Lgt) and the signal peptidase II. In S. pneumoniae, this pathway appears dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglyceryl transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
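The breakdown of the 36 predicted lipoproteins can be checked with a short tally. The sketch below only restates the counts given in the text; the category labels are shortened for illustration.

```python
# Categories of predicted lipoproteins as enumerated in the text.
lipoproteins = {
    "laminin binding protein Lmb": 1,
    "ABC transporter components": 20,
    "predicted enzymes": 3,
    "putative adhesin": 1,
    "no obvious function": 11,
}

total = sum(lipoproteins.values())
print(total)  # 36

# Share of each category among the predicted lipoproteins.
for category, count in lipoproteins.items():
    print(f"{category}: {count} ({100 * count / total:.0f}%)")
```

More than half of the predicted lipoproteins are thus components of ABC transporters.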
Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC and cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, which enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000), is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).
Lifestyle of S. agalactiae as revealed by metabolism
Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for an NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products, such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more closely related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.
Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization, not in bypassing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.
Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis, which predicts eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
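The transporter counts quoted above can be put in proportion with a few lines of arithmetic. This is a minimal sketch that uses only the numbers given in the text; the variable names are illustrative.

```python
transport_genes = 255   # genes in all transporter classes (NEM316)
abc_genes = 159         # genes encoding ABC transporter components
abc_systems = {"S. agalactiae": 62, "S. pyogenes": 36, "L. lactis": 50}

# Average number of component genes per ABC transporter in NEM316.
genes_per_system = abc_genes / abc_systems["S. agalactiae"]
# Fraction of all transport genes that belong to ABC transporters.
abc_share = abc_genes / transport_genes

print(f"{genes_per_system:.2f} genes per ABC transporter")   # 2.56
print(f"{abc_share:.0%} of transport genes are ABC components")  # 62%

# ABC transporter count of S. agalactiae relative to each genome.
for species, count in abc_systems.items():
    ratio = abc_systems["S. agalactiae"] / count
    print(f"{species}: {count} ABC transporters (ratio {ratio:.2f})")
```

On these figures, ABC transporter components account for roughly three fifths of all transport genes identified in NEM316.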
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to adapt rapidly to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, as shown for ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) and ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.
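The regulator counts can be compared in a few lines. The sketch below uses only the figures quoted in the text; treating the smaller of the kinase and regulator counts as an upper bound on the number of complete two-component systems is an assumption made here for the comparison, since the text reports kinases and regulators separately.

```python
predicted_genes = 2118
regulators = 107
print(f"{regulators / predicted_genes:.1%}")  # 5.1%

histidine_kinases, response_regulators = 20, 21
# Assumption: a complete system needs one kinase and one regulator, so at
# most min(kinases, regulators) complete pairs can exist in NEM316.
max_pairs = min(histidine_kinases, response_regulators)

systems_elsewhere = {"S. pneumoniae": 14, "S. pyogenes": 13, "L. lactis": 8}
for species, count in systems_elsewhere.items():
    print(f"{species}: {count} systems, NEM316 upper bound is "
          f"{max_pairs / count:.1f} times higher")
```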
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low-G+C-content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxy-terminal region the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeat (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain devoid of this element, NEM-lj107, allowed us to localize its extremities accurately. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 16 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in a circular form in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
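The 9 or 10 bp duplication at the insertion site can be detected by comparing the chromosomal sequences immediately flanking the element. The following is a minimal sketch of that idea, not the procedure used in this study: the function name, the maximum length searched and the toy flanking sequences are all hypothetical.

```python
def target_site_duplication(left_flank, right_flank, max_len=12):
    """Return the longest repeat (up to max_len bp) that both ends the
    left flank and starts the right flank, or '' if there is none."""
    limit = min(max_len, len(left_flank), len(right_flank))
    for k in range(limit, 0, -1):
        if left_flank[-k:] == right_flank[:k]:
            return left_flank[-k:]
    return ""

# Toy flanks (hypothetical sequences, not the NEM316 junctions of Fig. 2).
left = "GGCATTCAGTACCTGAAAT"    # chromosome ending at the left end of the element
right = "TACCTGAAATCCGTTAGCA"   # chromosome starting after the right end
tsd = target_site_duplication(left, right)
print(tsd, len(tsd))  # TACCTGAAAT 10
```

Very short matches can arise by chance, so in practice a minimum length and a comparison with the empty site in a strain lacking the element would be needed to confirm a duplication.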
Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between the two genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
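The ortholog fractions follow directly from the counts above, and the notion of a synteny breakpoint can be illustrated on a toy gene order. The breakpoint definition used in this sketch (two orthologs adjacent in one genome whose ranks in the other genome are not consecutive, with the chromosome treated as linear) is an assumption for illustration; the text does not specify how the 36 breakpoints were counted.

```python
def count_breakpoints(ranks_in_b):
    """ranks_in_b[i] is the rank in genome B of the i-th ortholog along
    genome A. A breakpoint is counted wherever neighbours in A are not
    neighbours in B (either orientation)."""
    return sum(
        1
        for prev, cur in zip(ranks_in_b, ranks_in_b[1:])
        if abs(cur - prev) != 1
    )

total_genes, orthologs_spy, orthologs_spn = 2118, 1173, 1139
print(f"{orthologs_spy / total_genes:.1%}")  # 55.4% shared with S. pyogenes
print(f"{orthologs_spn / total_genes:.1%}")  # 53.8% shared with S. pneumoniae
print(total_genes - orthologs_spy)           # 945 genes without an S. pyogenes ortholog

toy_order = [1, 2, 3, 7, 6, 5, 4, 8]  # one inverted segment (hypothetical)
print(count_breakpoints(toy_order))   # 2
```

A single inversion therefore contributes two breakpoints, one at each end of the inverted segment.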
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.
Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
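The island sizes in Table 2 can be cross-checked against the chromosomal co-ordinates given there. The sketch below does this for five of the islands, reading the printed sizes with a decimal point (for example 18.5 kb for island I); the inclusive span formula and the 0.1 kb tolerance are assumptions made for the check.

```python
def span_kb(start, end):
    """Length in kb of an island, counting both end co-ordinates."""
    return (end - start + 1) / 1000

# (island, start, end, size in kb as printed in Table 2)
islands = [
    ("I", 232896, 251357, 18.5),
    ("II", 257878, 272448, 14.5),
    ("IV", 489525, 508303, 18.8),
    ("VI", 639392, 697130, 57.7),
    ("XII", 1341772, 1423296, 81.5),
]

for name, start, end, printed_kb in islands:
    computed = span_kb(start, end)
    agrees = abs(computed - printed_kb) < 0.1
    print(f"{name}: {computed:.1f} kb computed, {printed_kb} kb printed, agree={agrees}")
```

For these five islands the computed spans agree with the printed sizes to within 0.1 kb.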
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.
Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 |
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 |
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 |
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 |
XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 |
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 |
a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.
(symbol illegible) pseudogenes.
Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon, as well as the scpB and lmb genes, have a higher G+C content than the chromosomal average (42%, 41% and 40% G+C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands themselves show a composite architecture, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
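The bidirectional best hit criterion mentioned above can be sketched as follows: two genes are paired when each is the highest-scoring hit of the other. This is a minimal illustration under stated assumptions, not the pipeline used for the analysis; the input format, the use of a single score per hit, the absence of any score threshold and the gene names are all hypothetical.

```python
def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Each argument maps a query gene to a list of (subject gene, score).
    Returns the sorted list of reciprocal best pairs (gene_a, gene_b)."""
    def best(hits):
        # Keep the highest-scoring subject for every query with at least one hit.
        return {
            query: max(subjects, key=lambda hit: hit[1])[0]
            for query, subjects in hits.items()
            if subjects
        }

    best_ab, best_ba = best(a_vs_b), best(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Toy scores; gene names are hypothetical placeholders.
a_vs_b = {"gbsA": [("spyX", 310.0), ("spyY", 120.0)], "gbsB": [("spyX", 95.0)]}
b_vs_a = {"spyX": [("gbsA", 305.0), ("gbsB", 90.0)], "spyY": [("gbsB", 80.0)]}
print(bidirectional_best_hits(a_vs_b, b_vs_a))  # [('gbsA', 'spyX')]
```

As the text points out, a reciprocal best hit establishes only mutual similarity: a gene acquired by horizontal transfer can satisfy the criterion without being a vertically inherited ortholog.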
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for the induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a feature unique among the streptococcal genomes determined so far but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, which harbours an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration in the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of
septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction (PCR) products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
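Sequence coverage relates the number of reads, their mean length and the genome size. The short calculation below is a sketch of that relationship using the read count given above and the genome size stated in the summary (2 211 485 bp). The mean read length of 550 bp is an assumption made for illustration only; it is not reported in the text.

```python
def fold_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * mean_read_len_bp / genome_size_bp


GENOME_SIZE_BP = 2_211_485        # genome length given in the paper's summary
N_READS = 34_431                  # sequences used in the initial assembly
ASSUMED_MEAN_READ_LEN_BP = 550    # assumption, not a reported value

coverage = fold_coverage(N_READS, ASSUMED_MEAN_READ_LEN_BP, GENOME_SIZE_BP)
print(f"approximate coverage: {coverage:.1f}-fold")
```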
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched for with tRNAscan-SE (Lowe and Eddy, 1997).
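The two length tiers described above can be written as a simple filter. This is a sketch, not the authors' procedure: the text does not state what counts as a high coding probability, so the cutoff of 0.9 is an assumed placeholder, and the final BLASTX search of intergenic regions is not modelled.

```python
def retain_cds(length_codons: int, coding_probability: float,
               long_cutoff: int = 80, short_cutoff: int = 40,
               high_probability: float = 0.9) -> bool:
    """Decide whether a predicted CDS is kept.

    long_cutoff and short_cutoff come from the text (80 and 40 codons).
    high_probability is an assumption; the paper gives no numerical value.
    """
    if length_codons > long_cutoff:
        return True  # first pass: long CDSs are retained
    if short_cutoff <= length_codons <= long_cutoff:
        # second pass: short CDSs need a high coding probability
        return coding_probability >= high_probability
    return False  # shorter candidates are left to the BLASTX step
```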
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
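The ortholog definition above can be illustrated with a short sketch of a bi-directional best hit filter. It reads the stated thresholds as at least 50% identity and a protein length ratio between 0.8 and 1.25. The Hit record, the use of the bit score to rank hits, and all names are assumptions for illustration; the paper does not describe its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float      # percent identity of the alignment
    query_len: int       # protein length in residues
    subject_len: int
    bitscore: float      # assumed ranking criterion for "best" hit


def best_hits(hits):
    """Keep, for each query, the hit with the highest bit score."""
    best = {}
    for h in hits:
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


def orthologs(a_vs_b, b_vs_a, min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return sorted (a, b) pairs that are reciprocal best hits within thresholds."""
    forward, reverse = best_hits(a_vs_b), best_hits(b_vs_a)
    pairs = []
    for a, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != a:
            continue  # not a bi-directional best hit
        ratio = hit.query_len / hit.subject_len
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, hit.subject))
    return sorted(pairs)
```

Requiring reciprocity guards against paralogs: a protein whose best match already has a better partner in the other genome is not called an ortholog.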
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa Y, Brody E and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley MM (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and Kunst F (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae
encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson MN (1972) Glucose degradation, molar growth yields and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington, DC: American Society for Microbiology Press.
Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.
Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit CM, Brown JR, Ingraham K, Bryant AP and
Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb: a protein with similarities to the LraI adhesin family mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
dispensable for growth but essential for virulence (Petit et al., 2001). A database search revealed that NEM316 encodes one prolipoprotein diacylglycerol transferase (gbs0758) and one signal peptidase II (gbs1436). We also identified 36 proteins possessing a lipoprotein lipid attachment site, including the laminin-binding protein Lmb (Spellerberg et al., 1999), 20 components of ABC transporters, three novel putative proteins with predicted enzymatic activities (carboxypeptidase, isomerase, protease), one putative adhesin and 11 putative proteins with no obvious function.
Analysis of the NEM316 genome predicted 71 secreted proteins, some of which are already known to contribute to pathogenesis. In particular, we identified the complete cyl locus (Pritzlaff et al., 2001) containing the genes required for cytolysin/haemolysin synthesis (cylXDG, acpC, cylZABEFHIJK), including cylE, which encodes a cytolysin thought to damage pulmonary epithelial cells (Nizet and Rubens, 2000). The cocytolysin, or CAMP factor, which enhances the effect of Staphylococcus aureus β-haemolysin (Spellerberg, 2000), is encoded by the cfb gene in S. agalactiae. Streptococcus pyogenes also encodes a cfb ortholog, but S. pneumoniae does not. NEM316 encodes a hyaluronate lyase (gbs1270), a secreted enzyme known to be associated with GBS virulence (Krem and Di Cera, 2001). Most interestingly, we identified genes coding for as yet uncharacterized putative secreted virulence factors, such as two fibronectin-binding proteins (gbs1263, gbs0850) and a neuraminidase (gbs1919). The protein gbs1263 is an ortholog of PavA from S. pneumoniae, which is essential for adhesion and virulence (Holmes et al., 2001; Chhatwal, 2002), and gbs1919 is highly similar (75% similarity) to NanA of S. pneumoniae, an enzyme contributing to bacterial colonization and persistence in the nasopharynx and middle ear (Tong et al., 2000).
Lifestyle of S. agalactiae as revealed by metabolism
Streptococcus agalactiae is able to synthesize ATP by oxidative phosphorylation (Mickelson, 1972). Analysis of the genome sequence confirms this result, as we identified the structural genes for the cytochrome bd terminal quinol oxidase (gbs1784-gbs1787). They are clustered with two genes encoding enzymes involved in quinol biosynthesis (heptaprenyl diphosphate synthase and MenA) and a gene coding for an NADH dehydrogenase. Bacterial bd oxidases have been shown to have a high affinity for oxygen and a low energetic yield (D'Mello et al., 1996). This enzyme probably contributes to the aerobic growth of S. agalactiae. With the exception of homologues of hemK and hemN (gbs1108 and gbs0907), no genes involved in haem biosynthesis were identified. It is conceivable that S. agalactiae can utilize external sources of haem, although the corresponding transporters were not identified. As deduced from the genome sequence, S. agalactiae is able to ferment different carbon sources to multiple by-products, such as lactate, acetate, ethanol, formate or acetoin, which is in agreement with the study of by-product synthesis during glucose degradation by S. agalactiae (Mickelson, 1972). It is worth noting that the bioenergetic metabolism of S. agalactiae is more closely related to that of L. lactis than to that of S. pyogenes or S. pneumoniae. This is exemplified by the fact that genes coding for the bd oxidase and some fermentative pathways have orthologs only in the genome sequence of L. lactis and not in the two other streptococci.
Streptococcus agalactiae has the capacity to import a broad range of carbon sources. Seventeen sugar-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS) enzyme II complexes were identified. The predicted specificity includes cellobiose, β-glucoside, trehalose, mannose, lactose, fructose, mannitol, N-acetylgalactosamine and glucose. Furthermore, four sugar-specific ABC transporters, three glycerol permeases and one glycerol-phosphate permease were identified. Interestingly, the glycerol-phosphate permease (gbs1506) is 68% similar to a predicted protein from Shigella flexneri, suggesting that it was acquired by horizontal gene transfer between these two enteric bacteria. It is apparent from genome analysis that the enzymes necessary for glycolysis are all present and that the pentose phosphate pathway is involved only in pentose and gluconate utilization, and not in bypassing glycolysis (pentose-phosphate cycle). These results reveal a broad catabolic capacity for S. agalactiae, which may reflect its ability to adapt to various environments.
Streptococcus agalactiae is known to require a great number of amino acids for growth (Milligan et al., 1978). From the genome sequence, it is clear that in S. agalactiae, as in S. pyogenes and in S. pneumoniae, the TCA cycle is completely missing, depriving this bacterium of the ability to synthesize the precursors of most amino acids. Only the biosynthetic pathways for alanine, serine, glycine, glutamine, aspartate, asparagine and threonine are present. Surprisingly, the addition of proline is not required for growth of S. agalactiae, although genes encoding the enzymes for the last steps of its biosynthesis seem to be missing. As S. agalactiae is auxotrophic for most amino acids, it needs to import these compounds from exogenous sources. This is supported by the genome analysis predicting eight ABC transporters and eight permeases showing specificity for different amino acids. Another source of amino acids is provided by the degradation of peptides by peptidases. The NEM316 genome contains four genes encoding exported peptidases, three ABC transporters specific for oligopeptides and the remarkably large number of 21 genes encoding
intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes, but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to adapt rapidly to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called ‘oxidative burst’ (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified and, among these, only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxy-terminal extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 1.6 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
Two orthologs of the putative int gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently
truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae, respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). On the contrary, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
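How the breakpoints of synteny were counted is not detailed here. A minimal sketch of one common definition follows, assuming each S. agalactiae gene with an ortholog has been replaced by the rank of that ortholog along the other genome. The function name and the rank encoding are illustrative assumptions, not part of the analysis pipeline used for Fig. 3, and the circularity of the chromosome is ignored.

```python
def count_synteny_breakpoints(ortholog_ranks):
    """Count adjacencies that are not conserved between two genomes.

    ortholog_ranks lists, in genome-A gene order, the rank (0-based) of each
    gene's ortholog in genome B. Two neighbouring genes are treated as
    collinear when their ranks differ by exactly 1 in either direction, so an
    inverted block only contributes breakpoints at its two ends.
    """
    breakpoints = 0
    for left, right in zip(ortholog_ranks, ortholog_ranks[1:]):
        if abs(right - left) != 1:
            breakpoints += 1
    return breakpoints


# Hypothetical toy order: one inverted block (ranks 4, 3, 2) inside a
# otherwise collinear stretch.
example_order = [0, 1, 4, 3, 2, 5]
example_breakpoints = count_synteny_breakpoints(example_order)
```

Under this definition a fully collinear pair of genomes scores zero, and each inversion or translocation adds breakpoints only at its borders, which is why a small count indicates a conserved chromosomal backbone.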
Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII)
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.
are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, suggesting a role in specific adaptation.
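The grouping of genes without an ortholog into regions can be illustrated with a short sketch. It assumes two per-gene flags in chromosomal order: whether the gene has an ortholog in S. pyogenes, and whether it is related to a mobile element. Both flags and both function names are hypothetical inputs chosen for illustration. The sketch simplifies the analysis in two ways: it treats a region as an uninterrupted run of genes lacking an ortholog, and it separates islands from islets only by the presence of a mobile-element gene, since the reported size ranges of the two groups overlap.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return (start, end) index pairs, end exclusive, for every run of
    consecutive genes that lack an ortholog."""
    regions = []
    position = 0
    for flag, run in groupby(has_ortholog):
        length = sum(1 for _ in run)
        if not flag:
            regions.append((position, position + length))
        position += length
    return regions


def split_islands_and_islets(regions, is_mobile):
    """Call a region an island if it holds at least one mobile-element gene,
    otherwise an islet (a simplifying assumption, see the text above)."""
    islands, islets = [], []
    for start, end in regions:
        if any(is_mobile[start:end]):
            islands.append((start, end))
        else:
            islets.append((start, end))
    return islands, islets


# Hypothetical ten-gene chromosome segment.
has_ortholog = [True, False, False, True, False, True, True, False, False, False]
is_mobile = [False, False, True, False, False, False, False, False, False, False]
regions = regions_without_ortholog(has_ortholog)
islands, islets = split_islands_and_islets(regions, is_mobile)
```

A real island can also contain genes that do have a best-hit ortholog, so a production version would need to tolerate short interruptions within a run.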
Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events and, therefore, it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 |
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 |
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 |
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 |
XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 |
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 |

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated pseudogenes
hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon, as well as the scpB and lmb genes, have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein, as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
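The G + C comparison used above as evidence of exogenous origin rests on a simple per-sequence calculation. A minimal sketch follows; the function name and the decision to skip ambiguous bases are assumptions made for illustration and are not taken from the annotation environment used in this work.

```python
def gc_percent(sequence):
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so ambiguity codes
    such as N do not dilute the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Hypothetical toy sequence, not a real gene.
toy_gene_gc = gc_percent("ATGCGCATTAGC")
```

Applied to a gene and to the whole chromosome, the two percentages can be compared directly; a gene whose value departs clearly from the chromosomal average is a candidate for horizontal acquisition, although composition alone is not proof.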
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of
septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995), as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions, using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
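The relationship between the number of reads, the coverage and the genome size can be checked with a short calculation. The sketch below assumes that the coverage figure is 8.5-fold and is defined as total sequenced bases divided by genome size, and it takes the genome size of 2 211 485 bp from the Summary. The function name is illustrative, and the resulting read length is an inference from these figures, not a value reported in this work.

```python
def implied_mean_read_length(coverage, genome_size_bp, read_count):
    """Mean read length implied by coverage = reads * length / genome size."""
    return coverage * genome_size_bp / read_count


# Figures quoted in the text: 34 431 reads over a 2 211 485 bp genome.
mean_read_bp = implied_mean_read_length(8.5, 2_211_485, 34_431)
rounded_mean_read_bp = round(mean_read_bp)  # about 546 bp per read
```

A mean of roughly 550 bp per read is in the range expected for capillary sequencing of that period, which supports reading the coverage as a single-digit multiple of the genome length.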
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
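The two-tier length rule for retaining CDSs can be written as a small predicate. This is a sketch only: the text does not state the coding probability that counted as high, so the 0.9 cut-off, the parameter names and the function name are hypothetical, and the handling of a CDS of exactly 80 codons (placed in the short tier here) is likewise an assumption.

```python
def retain_cds(length_codons, coding_probability,
               long_cutoff=80, short_cutoff=40, high_probability=0.9):
    """Apply the two-tier rule: keep every CDS longer than long_cutoff, and
    keep a CDS of short_cutoff to long_cutoff codons only when its predicted
    coding probability is high. Shorter CDSs are left to the later
    intergenic BLASTX search and are not retained here."""
    if length_codons > long_cutoff:
        return True
    if short_cutoff <= length_codons <= long_cutoff:
        return coding_probability >= high_probability
    return False


# Hypothetical candidates as (length in codons, coding probability).
candidates = [(250, 0.30), (60, 0.95), (60, 0.50), (35, 0.99)]
kept = [cds for cds in candidates if retain_cds(*cds)]
```

Separating the two tiers keeps long ORFs independent of the gene-finder score while limiting false positives among short ORFs, where spurious open frames are most frequent.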
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bidirectional best hits in BLASTP comparisons of the S. agalactiae proteome with each of the other proteomes. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
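The ortholog criterion can be sketched as a reciprocal best-hit filter. The sketch assumes that BLASTP results have already been parsed into simple records and that the best hit is the one with the highest score; the `Hit` record, the function names and the order in which the thresholds are applied (after best-hit selection) are illustrative assumptions, and the length ratio is interpreted as the ratio of the two protein lengths.

```python
from collections import namedtuple

# Hypothetical parsed BLASTP record: query id, subject id, percent identity, score.
Hit = namedtuple("Hit", "query subject identity score")


def best_hit_per_query(hits):
    """Keep, for each query protein, the hit with the highest score."""
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best


def reciprocal_best_hits(a_vs_b, b_vs_a, lengths,
                         min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return (a_id, b_id) pairs that are each other's best hit and pass the
    identity and length-ratio thresholds."""
    best_ab = best_hit_per_query(a_vs_b)
    best_ba = best_hit_per_query(b_vs_a)
    pairs = []
    for a_id, hit in sorted(best_ab.items()):
        back = best_ba.get(hit.subject)
        if back is None or back.subject != a_id:
            continue  # not bidirectional
        ratio = lengths[a_id] / lengths[hit.subject]
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a_id, hit.subject))
    return pairs


# Hypothetical toy data: a1/b1 pass, a2/b2 fail on identity, a3/b3 on length.
a_vs_b = [Hit("a1", "b1", 80.0, 300), Hit("a1", "b2", 60.0, 100),
          Hit("a2", "b2", 40.0, 150), Hit("a3", "b3", 90.0, 400)]
b_vs_a = [Hit("b1", "a1", 80.0, 300), Hit("b2", "a2", 40.0, 150),
          Hit("b3", "a3", 90.0, 400)]
lengths = {"a1": 100, "b1": 110, "a2": 200, "b2": 200, "a3": 100, "b3": 300}
ortholog_pairs = reciprocal_best_hits(a_vs_b, b_vs_a, lengths)
```

The length-ratio bound guards against pairing a full-length protein with a fragment or a fused multidomain protein, which a best-hit criterion alone would accept.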
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList.
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker ID Robinson OM Bazan TS Lopez-Osuna Mand Kretschmer RR (1981) Bactericidal capacity of new-born phagocytes againts group B beta-hemolytic strepto-cocci Infect Immun 34 535ndash539
Beckert S Kreikemeyer B and Podbielski A (2001)Group A streptococcal rofA gene is involved in the controlof several virulence genes and eukaryotic cell attachmentand internalization Infect Immun 69 534ndash537
Bierne H Mazmanian SK Trost M Pucciarelli MG LiuG Dehoux P et al (2002) Inactivation of the srtA genein Listeria monocytogenes inhibits anchoring of surfaceproteins and affects virulence Mol Microbiol 43 869ndash881
Blumberg HM Stephens DS Modansky M Erwin MElliot J Facklam RR et al (1996) Invasive group Bstreptococcal disease the emergence of serotype V JInfect Dis 173 365ndash373
Bolken TC Franke CA Jones KF Zeller GO JonesCH Dutton EK and Hruby DE (2001) Inactivation ofthe srtA gene in Streptococcus gordonii inhibits cell wallanchoring of surface proteins and decreases in vitro andin vivo adhesion Infect Immun 69 75ndash80
Bolotin A Wincker P Mauger S Jaillon O Malarme KWeissenbach J Ehrlich SD and Sorokin A (2001) Thecomplete genome sequence of the lactic acid bacteriumLactococcus lactis ssp lactis IL1403 Genome Res 11731ndash753
Brunskill EW and Bayles KW (1996) Identification andmolecular characterization of a putative regulatory locusthat affects autolysis in Staphylococcus aureus J Bacteriol178 611ndash618
Chaffin DO Beres SB Yim HH and Rubens CE(2000) The serotype of type Ia and III group B streptococciis determined by the polymerase gene within the polycis-tronic capsule operon J Bacteriol 182 4466ndash4477
Chaussee MS Sylva GL Sturdevant DE Smoot LMGraham MR Watson RO and Musser JM (2002)Rgg Influences the Expression of Multiple Regulatory LociTo Coregulate Virulence Factor Expression in Streptococ-cus pyogenes Infect Immun 70 762ndash770
Chhatwal GS (2002) Anchorless adhesins and invasins ofGram-positive bacteria a new class of virulence factorsTrends Microbiol 10 205ndash208
Chmouryguina I Suvorov A Ferrieri P and Cleary PP(1996) Conservation of the C5a peptidase genes in groupA and B streptococci Infect Immun 64 2387ndash2390
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley MM (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and K (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington, DC: American Society for Microbiology Press.
Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.
Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ' for blue/white screening. Biotechniques 17: 57.
Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
intracellular peptidases, some of which are known to play a role in virulence. For example, the products of genes showing similarities to L. lactis pepX and pepN affect in vivo survival of S. agalactiae in a neonatal rat sepsis infection model (Jones et al., 2000), and a zinc metallopeptidase encoded by pepB is capable of degrading bioactive peptides (e.g. bradykinin, neurotensin) (Lin et al., 1996).
The NEM316 genome not only codes for a high number of transporters related to amino acid import but also has, in general, a broad transport capacity. Indeed, 255 genes belonging to different classes of transporters were identified. The most abundant class is, as in most other genomes, ABC transporters. One hundred and fifty-nine genes encode the components of 62 ABC transporters, constituting both import and export systems. In contrast, there are only 36 ABC transporters in S. pyogenes but 50 in the L. lactis genome. No complete biosynthetic pathway for vitamins was predicted. Therefore, some permeases with unknown specificity may function in the uptake of different vitamins. In addition, a large set of transporters for different inorganic compounds, such as two phosphate and two iron ABC transporters and several cation transport systems, is encoded by the S. agalactiae genome. This large diversity of transport systems enables S. agalactiae to survive and multiply in different environments and is probably also involved in its capacity to cause disease.
Stress adaptation
The physiopathology of S. agalactiae infections implies that this bacterium can adapt rapidly to various stress conditions, including pH, osmolarity, starvation and temperature variations, as well as oxidative stress. One way for bacteria to adapt rapidly to sudden environmental changes is the synthesis of proteins acting as chaperones and proteases. Squires and Squires (1992) have shown that the subunits of the Clp ATP-dependent protease display both chaperone and proteolytic activities. Clp proteins play an important role in virulence, such as ClpX of S. aureus (Mei et al., 1997), ClpE and ClpC of S. pneumoniae (Polissi et al., 1998; Lau et al., 2001) or ClpC, ClpE and ClpP of Listeria monocytogenes (Nair et al., 1999; 2000; Gaillot et al., 2001).
Streptococcus agalactiae NEM316 encodes a large set of Clp protease subunits, which include the ClpP proteolytic subunit (gbs1634) and four ATPase regulatory subunits: ClpX (gbs1383), ClpC (gbs1869), ClpL (gbs1367) and ClpE (gbs0535). In addition, S. agalactiae contains three identical ClpA ATPase paralogs (gbs0718, gbs0991 and gbs0388), each of which is located on one of the three copies of a chromosome-borne plasmid-like element. In S. agalactiae, three of the four different types of heat shock response regulatory mechanisms originally defined in B. subtilis are present. Class I heat shock genes, such as the dnaK and groESL operons, which are defined as the HrcA regulon, were identified, whereas class II heat shock genes are not present, as the σB stress sigma factor is missing. Class III regulation seems to be present, as CtsR and several potential target genes (clpP, clpC, clpL, clpE) preceded by its binding site were identified. Finally, class IV genes are, in B. subtilis, those whose induction by heat shock is not dependent on HrcA, σB or CtsR. Many of these genes are present in S. agalactiae, such as clpX and ftsH, and one can speculate that they will also prove to be heat shock genes.
The clearance of S. agalactiae in the host is due to phagocytosis by macrophages or neutrophils (Becker et al., 1981; Noel et al., 1991; Edwards and Baker, 1995). An important killing mechanism of professional phagocytes involves the production of highly microbicidal reactive oxygen metabolites during the so-called 'oxidative burst' (Miller and Britigan, 1997). Conversion of oxygen radicals is ensured by the superoxide dismutase SodA, whose contribution to the virulence of NEM316 has been shown recently (Poyart et al., 2001). Although S. agalactiae does not synthesize a catalase to remove toxic H2O2, it is 10-fold more resistant to oxygen metabolites than the catalase-producing S. aureus (Nizet and Rubens, 2000). Several putative enzymes that might detoxify H2O2 were identified: an NADH peroxidase (gbs0266), an NADH oxidase (gbs0946), a thiol peroxidase (gbs1198) and the two subunits of an alkylhydroperoxide reductase (gbs1875 and gbs1874). Furthermore, the protection of S. agalactiae against oxidative damage is correlated in part with a greater endogenous content of the oxygen-metabolite scavenger glutathione. Several genes encoding enzymes involved in glutathione metabolism could be identified: a possible gamma-glutamylcysteine synthetase (gbs1862), a lactoylglutathione lyase (gbs1544) and a glutathione reductase (gbs1445). Surprisingly, we could not identify a gene encoding a glutathione synthase.
Physiological adaptation and transcriptional regulation
We identified 107 transcriptional regulators, representing 5% of the predicted genes, which is similar to the number of regulators (120) found in L. lactis (Bolotin et al., 2001). This high proportion of regulators should confer on S. agalactiae the capacity to adapt to various environments. Streptococcus agalactiae encodes three putative sigma factors (a number comparable to that of other streptococcal species): the major sigma factor (σA), ComX and an ECF-type sigma factor. Interestingly, sigma factors of the ECF subfamily have been found in several Gram-positive bacteria, such as Streptococcus equi, L. lactis and B. subtilis, but not in S. pneumoniae or S. pyogenes.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified in an S. agalactiae type Ia strain a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxy-terminal end the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to localize its extremities accurately. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 16 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in the circular form, in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
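The two sequence signatures described for pNEM316-1, inverted repeats at the ends of the element and a short direct duplication of the target site, can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the toy sequences are hypothetical and are not the NEM316 sequences shown in Fig. 2.

```python
# Illustrative sketch only: toy sequences, not the real pNEM316-1 junctions.

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def inverted_repeat_mismatches(left_end: str, right_end: str) -> int:
    """Count mismatches between left_end and the reverse complement of right_end.

    0 means a perfect inverted repeat; a small value means an imperfect one.
    """
    if len(left_end) != len(right_end):
        raise ValueError("both ends must have the same length")
    return sum(a != b for a, b in zip(left_end.upper(), reverse_complement(right_end)))


def target_site_duplication(left_flank: str, right_flank: str, max_len: int = 10) -> str:
    """Longest direct repeat (at most max_len bp) that ends the left flank
    and starts the right flank; empty string if there is none."""
    for k in range(min(max_len, len(left_flank), len(right_flank)), 0, -1):
        if left_flank[-k:].upper() == right_flank[:k].upper():
            return left_flank[-k:].upper()
    return ""


if __name__ == "__main__":
    # Hypothetical element ends: the right end is the reverse complement of the left end.
    print(inverted_repeat_mismatches("GATTACAGG", "CCTGTAATC"))
    # Hypothetical chromosomal flanks sharing a 9 bp direct repeat around the insertion.
    print(target_site_duplication("TTGGCACGTTAGCA", "ACGTTAGCAGGTCA"))
```

The default `max_len` of 10 simply mirrors the 9 or 10 bp duplication reported in the text; a real analysis would work on the sequenced junction fragments and allow for sequencing ambiguity.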
Two orthologs of the putative int(pNEM316-1) gene (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 35.5 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated in its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). On the contrary, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
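The text does not spell out how breakpoints of synteny were counted. One common convention, assumed here purely for illustration, is to order the orthologous genes along the first genome and count every adjacent pair that is not also adjacent (in either orientation) among the orthologs of the second genome. The sketch below treats both chromosomes as linear and ignores strand, which is a simplification for circular bacterial genomes; the function name and coordinates are hypothetical.

```python
# Illustrative sketch: counts synteny breakpoints under a simple adjacency
# convention. Not the procedure used in the paper, which is not described here.

def count_breakpoints(ortholog_pairs):
    """ortholog_pairs: list of (position_in_genome_a, position_in_genome_b),
    one tuple per orthologous gene pair, with distinct positions.

    Returns the number of adjacent pairs (in genome A order) whose ranks in
    genome B differ by something other than 1.
    """
    by_a = sorted(ortholog_pairs)                      # order along genome A
    b_positions = sorted(pos_b for _, pos_b in by_a)   # order along genome B
    rank_b = {pos: rank for rank, pos in enumerate(b_positions)}
    ranks = [rank_b[pos_b] for _, pos_b in by_a]
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(x - y) != 1)


if __name__ == "__main__":
    # Hypothetical coordinates (kb): genes 4-6 are inverted as a block in genome B.
    pairs = [(10, 100), (20, 200), (30, 300), (40, 600),
             (50, 500), (60, 400), (70, 700)]
    print(count_breakpoints(pairs))
```

Using ranks among orthologs rather than raw coordinates means that genes without an ortholog, such as those in the islands discussed below, do not by themselves create breakpoints.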
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.
Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI contains the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
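The grouping of genes without an ortholog into regions can be sketched as a scan for runs of consecutive genes along the chromosome. The version below is a deliberately simplified assumption, not the authors' procedure: a region is a strictly uninterrupted run of genes lacking an ortholog, and it is labelled an island when at least one of its genes is related to mobile elements and an islet otherwise. Gene names and flags are hypothetical.

```python
# Illustrative sketch: strict runs of ortholog-free genes, labelled by whether
# they carry a mobile-element-related gene. Simplified on purpose.
from itertools import groupby


def ortholog_free_regions(genes):
    """genes: list of (name, has_ortholog, mobile_related) in chromosomal order.

    Returns one dict per run of consecutive genes without an ortholog.
    """
    regions = []
    for has_ortholog, run in groupby(genes, key=lambda gene: gene[1]):
        if has_ortholog:
            continue  # runs of conserved backbone genes are skipped
        members = list(run)
        kind = "island" if any(gene[2] for gene in members) else "islet"
        regions.append({
            "genes": [gene[0] for gene in members],
            "size": len(members),
            "kind": kind,
        })
    return regions


if __name__ == "__main__":
    # Hypothetical gene list: (name, has_ortholog, mobile_related)
    example = [
        ("g1", True, False),
        ("g2", False, False),   # single-gene islet
        ("g3", True, False),
        ("g4", False, True),    # integrase-like gene starts an island
        ("g5", False, False),
        ("g6", False, False),
        ("g7", True, False),
    ]
    for region in ortholog_free_regions(example):
        print(region)
```

Size alone would not separate the two groups, since the reported ranges overlap (11–77 genes for islands, 1–17 for islets), which is why the sketch keys on the mobile-element flag. A real analysis would also have to tolerate the occasional gene with an ortholog inside an island.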
Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.
Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 |
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 |
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 |
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 |
XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 |
XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 |
a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.
pseudogenes
hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5A peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
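Two computations mentioned in this paragraph, ortholog assignment by bidirectional best hits and G + C content, can be sketched as follows. This is a minimal illustration under stated assumptions: hits are given as (query, subject, score) tuples, for instance parsed from tabular BLASTP output, a higher score is better, and no identity or coverage thresholds are applied. The function names and the gene identifiers are hypothetical.

```python
# Illustrative sketch: reciprocal (bidirectional) best hits and G + C content.
# Thresholds used in a real analysis (e-value, coverage) are omitted here.

def best_hits(hits):
    """hits: iterable of (query, subject, score). Returns {query: best subject}."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Hypothetical hit tables between genome A and genome B.
    a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
              ("a2", "b2", 120.0), ("a3", "b2", 300.0)]
    b_vs_a = [("b1", "a1", 245.0), ("b2", "a3", 295.0), ("b2", "a2", 118.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
    print(gc_percent("ATGC"))
```

As the paragraph itself points out, a bidirectional best hit only shows that two genes are each other's closest match; it does not distinguish vertical descent from horizontal acquisition, which is why compositional evidence such as G + C content is examined as well.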
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered worldwide as a priority by health authorities. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96 °C for 30 s, 50 °C for 15 s, 60 °C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage: 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
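As a rough consistency check on the assembly figures above, the mean usable read length implied by the number of reads, the genome size and the coverage can be computed directly. This is a sketch only: the coverage is read here as 8.5-fold, which is an assumption (the decimal point may have been lost in the text), and the function name is hypothetical.

```python
def mean_read_length(genome_bp: int, n_reads: int, coverage: float) -> float:
    """Mean read length implied by coverage = n_reads * read_length / genome_bp."""
    return coverage * genome_bp / n_reads


GENOME_BP = 2_211_485  # genome size given in the Summary
N_READS = 34_431       # sequences assembled in the initial step
COVERAGE = 8.5         # assumption: stated coverage read as 8.5-fold

length = mean_read_length(GENOME_BP, N_READS, COVERAGE)
print(f"implied mean read length: {length:.0f} bp")
```

Under this reading the implied mean read length is a few hundred base pairs, whereas an 85-fold reading would imply reads of several kilobases, which seems unlikely for capillary sequencing of this period.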
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
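The two-tier length filter described above can be sketched as a simple predicate. This is an illustration, not the authors' implementation: the `Cds` record, the function name and the 0.9 cut-off for a "high" coding probability are all assumptions, since the text gives no numerical threshold, and the final BLASTX rescue of short or truncated genes is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Cds:
    name: str           # hypothetical identifier
    codons: int         # CDS length in codons
    coding_prob: float  # coding probability from the gene finder, 0..1


def retain(cds: Cds, high_prob: float = 0.9) -> bool:
    """Apply the two-tier length rule.

    CDSs longer than 80 codons are always kept; CDSs of 40-80 codons are
    kept only when their coding probability is high. The 0.9 threshold is
    an assumed placeholder, not a value from the paper.
    """
    if cds.codons > 80:
        return True
    if 40 <= cds.codons <= 80:
        return cds.coding_prob >= high_prob
    return False


candidates = [
    Cds("orfA", 250, 0.55),  # long: kept regardless of probability
    Cds("orfB", 60, 0.95),   # short but high probability: kept
    Cds("orfC", 60, 0.40),   # short and low probability: dropped
    Cds("orfD", 30, 0.99),   # below 40 codons: dropped at this stage
]
print([c.name for c in candidates if retain(c)])
```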
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bidirectional best hits in BLASTP comparisons of the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
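The ortholog definition above (bidirectional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. This is a hedged illustration: the input dictionaries stand in for pre-parsed BLASTP best hits, the gene names are hypothetical, and the length ratio is taken here as query length divided by subject length, which is an assumption because the text does not specify which lengths were compared.

```python
from typing import Dict, List, Tuple

BestHits = Dict[str, Tuple[str, float]]  # query -> (best subject, % identity)


def bidirectional_best_hits(
    best_ab: BestHits,
    best_ba: BestHits,
    len_a: Dict[str, int],
    len_b: Dict[str, int],
    min_identity: float = 50.0,
    min_ratio: float = 0.8,
    max_ratio: float = 1.25,
) -> List[Tuple[str, str]]:
    """Return (a, b) pairs that are each other's best hit and pass both thresholds."""
    pairs = []
    for a, (b, identity) in best_ab.items():
        back = best_ba.get(b)
        if back is None or back[0] != a:
            continue  # not reciprocal
        ratio = len_a[a] / len_b[b]  # assumed definition of the length ratio
        if identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical toy data: genome A against genome B and back.
best_ab = {"a1": ("b1", 72.0), "a2": ("b2", 45.0), "a3": ("b3", 80.0), "a4": ("b1", 60.0)}
best_ba = {"b1": ("a1", 72.0), "b2": ("a2", 45.0), "b3": ("a3", 80.0)}
len_a = {"a1": 300, "a2": 200, "a3": 100, "a4": 310}
len_b = {"b1": 310, "b2": 205, "b3": 400}

print(bidirectional_best_hits(best_ab, best_ba, len_a, len_b))
```

In this toy data only the first pair survives: the second fails the identity threshold, the third fails the length ratio, and the fourth is not reciprocal.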
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, A.R., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
The most striking observation was the importance of two-component regulatory systems. We identified 20 sensor histidine kinases and 21 response regulators. Genome comparison reveals that the number of two-component systems is much higher than that found in related species: 14 in S. pneumoniae, 13 in S. pyogenes and only eight in L. lactis (Bolotin et al., 2001; Ferretti et al., 2001; Tettelin et al., 2001). This observation suggests that S. agalactiae may have a higher capacity to monitor various environmental parameters than these phylogenetically closely related species.
We identified orthologs of the yycF/yycG system (gbs0741, gbs0742), which is essential for growth of low G + C content Gram-positive bacteria (Hoch, 2000), and orthologs of two-component regulatory systems known to play a role in virulence of S. pneumoniae, S. pyogenes or S. aureus: covS/covR (Levin and Wessels, 1998), ciaH/ciaR (Guenzi et al., 1994), vncS/vncR (Novak et al., 2000) and lytS/lytR-1 (Brunskill and Bayles, 1996). Recently, in a screen for S. agalactiae mutants with altered fibronectin binding properties, Spellerberg et al. (2002) identified, in an S. agalactiae type Ia strain, a putative quorum-sensing system, rgfBDAC, including a two-component regulatory system (rgfC/rgfA). In NEM316, the rgfD gene is deleted, as well as the first 355 codons of the rgfC gene. Furthermore, the two rgfA homologues seem to have been exchanged by horizontal gene transfer, as they display only 80% identity, whereas the DNA sequences of the rgfB genes are almost identical (99% identity). Analysis of the diversity of this locus in various clinical isolates will help to understand the significance of these rearrangements.
Rgg, RofA and Nra are three regulators reported to be important for virulence gene regulation in S. pyogenes (Beckert et al., 2001; Molinari et al., 2001; Chaussee et al., 2002). NEM316 contains three rgg-like paralogs (gbs0230, gbs1555 and gbs2117) and three rofA/nra-like paralogs (gbs1426, gbs1479 and gbs1530) that may be important for virulence gene expression. It is therefore possible that S. agalactiae and S. pyogenes share similar regulation circuits for expression of virulence genes, although they occupy different environmental niches and cause very different diseases.
Mobile genetic elements
Among the mobile genetic elements described so far in S. agalactiae (IS861, IS1381, IS1548, ISSa1, ISSag2 and GBSi1), NEM316 contains only the two copies of ISSag2 bracketing the scpB and lmb genes (Franken et al., 2001). Six novel putative IS elements were identified, and among these only one transposase does not seem to be inactivated by frameshift mutations (gbs0208). This strain does not contain any repetitive element equivalent to the pneumococcal Rup and Box sequences. Although no complete or cryptic prophage was identified in the NEM316 genome, a striking observation was the identification of a large number of plasmid- and phage-related genes. We identified 12 genes encoding proteins related to plasmid functions (replication, partition or transfer), which are often found in the vicinity of integrase genes, and 12 genes encoding proteins similar to phage integrases. Seven of the 12 putative integrases (gbs2079, gbs0237, gbs0965, gbs0211, gbs0588, gbs1224 and gbs2073) display in their carboxylic extremity the four amino acids (H, R, H, Y) found in most tyrosyl site-specific recombinases (Argos et al., 1986). Although sequence analysis suggests that the NH2 moiety of the putative proteins gbs2079 and gbs0588 has been deleted, some of these integrases may still be active and may have played a role in the evolution of the S. agalactiae genome.
An interesting finding was the presence of a 47 068 bp sequence repeated three times in the genome of NEM316, which was designated pNEM316-1. Sequence analysis of the chromosome/pNEM316-1 junction fragments revealed that the element was flanked by inverted repeated (IR) sequences (Fig. 2). Characterization of the three insertion sites of pNEM316-1 in a strain, NEM-lj107, devoid of this element allowed us to accurately localize its extremities. Integration of pNEM316-1 was associated with a 9 bp or 10 bp DNA duplication at the site of insertion (Fig. 2). All three copies of pNEM316-1 are identical and encode numerous putative proteins reminiscent of plasmid replication and transfer functions. It is worth noting that a 16 kb DNA fragment containing one IR of pNEM316-1 encodes a 442 aa basic protein (gbs0740, pKi = 9.77). This protein possesses two of the four amino acids characteristic of site-specific recombinases, including the tyrosyl residue covalently linked to DNA during recombination, and gbs0740 might therefore constitute the integrase of pNEM316-1. In addition, PCR analysis demonstrates that pNEM316-1 can be present in a circular form in which the two extremities, designated attachment (att) sites, are separated by a 9 or 10 bp motif (Fig. 2). These results suggest that pNEM316-1 represents a new type of integrative plasmid that may play a role in gene acquisition in S. agalactiae.
Two orthologs of the putative integrase gene of pNEM316-1 (gbs0740) are present in the NEM316 chromosome. One of these, gbs1118 (66% aa similarity), is located on a large DNA segment (approx. 355 kb) encoding putative plasmid replication and transfer functions. However, as only one copy of this element is present in NEM316, its extremities could not be characterized. This genetic structure may also belong to the new type of mobile elements typified by pNEM316-1. The second ortholog, gbs1309 (71% aa similarity), is located between ISSag2U and the gene scpB. However, this putative integrase gene is apparently truncated at its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis, we took advantage of the availability of the complete genome sequences of two related pathogenic species, S. pneumoniae strain TIGR4 and S. pyogenes strain M1. In total, 1173 and 1139 of the 2118 genes of S. agalactiae were considered as orthologous with genes from S. pyogenes and S. pneumoniae respectively. Comparison of the order of orthologous genes between S. agalactiae and S. pyogenes revealed a high conservation of the chromosomal architecture, with only 36 breakpoints of synteny between both genomes (Fig. 3). In contrast, low conservation in gene order between S. agalactiae and S. pneumoniae was observed, except for some operons and a few functionally related gene clusters (Fig. 3). These findings may reflect the closer evolutionary relationship of S. agalactiae with S. pyogenes than with S. pneumoniae, and the importance of recombination and transformation-mediated gene transfer in the evolution of the S. pneumoniae genome.
Analysis of the functions and locations of the 945 geneswithout an ortholog in S pyogenes reveals a particularfeature of the S agalactiae genome These genes areclustered in 200 regions dispersed around the chromo-some ranging in size from 1ndash77 genes which can beclearly divided into two groups The first group contains471 genes which are clustered in 14 large islands (includ-ing the three copies of the integrative plasmid) containing11ndash77 genes (Fig 1 and Table 2) These 14 islands con-tain all genes related to mobile elements except the onlyintact IS element Six of these islands (I II IV XI and XIII)
Fig 2 Target DNA sequences at the three integration sites (A B and C) of pNEM316-1 in the chromosome of NEM316 For two insertion sites (A and B) the corresponding sequences in a strain devoid of pNEM-316ndash1 NEM-lj107 are indicated below the sequence of the NEM316 chromosome The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D) In this case the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A B and C The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined) The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed Co-ordinates refer to the position +1 of the NEM316 chromosome
are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may be the result of several consecutive recombination events; it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of the inserted elements. Furthermore, on the basis of bidirectional best
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and between S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | III: 385757–432824; VII: 711824–758891; VIII: 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated; pseudogenes [...]
hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands show a composite architecture, as the islands seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
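The G + C comparison used above as evidence of an exogenous origin can be reproduced with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, ambiguous bases are simply ignored (an assumption), and the chromosomal average has to be supplied by the user because it is not restated in this section.

```python
def gc_percent(seq):
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_excess(region, chromosome_gc_percent):
    """Difference, in percentage points, between a region and the chromosome."""
    return gc_percent(region) - chromosome_gc_percent
```

A positive value returned by gc_excess for a gene or operon indicates a G + C content above the chromosomal average, as reported here for the lac operon, scpB and lmb.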
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
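The grouping of genes without an S. pyogenes ortholog into regions of consecutive genes can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, genes are taken in chromosomal order as a list of booleans, and wrap-around at the chromosome origin is not handled. Note that run length alone does not separate islands from islets, as the two size ranges overlap (11–77 and 1–17 genes); that distinction also rests on the presence of genes related to mobile elements.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return (start_index, gene_count) for each run of consecutive genes
    lacking an ortholog, taking genes in chromosomal order.

    has_ortholog is a sequence of booleans, one per gene.
    """
    regions = []
    position = 0
    for flag, run in groupby(has_ortholog):
        length = len(list(run))
        if not flag:
            regions.append((position, length))
        position += length
    return regions
```

Applied to the full gene list, the number of tuples returned corresponds to the number of regions, and the second element of each tuple to its size in genes.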
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with its extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration in the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of
septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
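The length-based retention rule for candidate CDSs described above can be summarised in a short sketch. The function name and the probability cut-off of 0.9 are hypothetical: the text only states that CDSs of 40 to 80 codons required a high coding probability, without giving a value, and the final BLASTX search of intergenic regions is not modelled here.

```python
def retain_cds(length_codons, coding_probability, min_probability=0.9):
    """Decide whether a candidate CDS is kept.

    - longer than 80 codons: kept;
    - 40 to 80 codons: kept only if the coding probability is high
      (min_probability is an illustrative cut-off, not a published value);
    - shorter than 40 codons: not kept by this rule.
    """
    if length_codons > 80:
        return True
    if 40 <= length_codons <= 80:
        return coding_probability >= min_probability
    return False
```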
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP version 2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP version 2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP version 2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
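The ortholog definition given above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: the function names and the input layout (tuples of query, subject, percentage identity and score) are hypothetical, and the best hit is taken to be the highest-scoring one.

```python
def best_hits(hits):
    """Keep, for each query, its highest-scoring hit.

    hits is an iterable of (query, subject, percent_identity, score).
    Returns {query: (subject, percent_identity, score)}.
    """
    best = {}
    for query, subject, identity, score in hits:
        if query not in best or score > best[query][2]:
            best[query] = (subject, identity, score)
    return best


def find_orthologs(a_vs_b, b_vs_a, lengths, min_identity=50.0,
                   min_ratio=0.8, max_ratio=1.25):
    """Return sorted (a, b) pairs that are reciprocal best hits and pass
    the identity and length-ratio thresholds.

    lengths maps every protein identifier to its length in residues.
    """
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for a, (b, identity, _score) in best_ab.items():
        reciprocal = best_ba.get(b)
        if reciprocal is None or reciprocal[0] != a:
            continue  # not a bi-directional best hit
        ratio = lengths[a] / lengths[b]
        if identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, b))
    return sorted(pairs)
```

The two hit lists would typically come from BLASTP run in both directions between the two proteomes; how the identity and the length ratio were computed in the original analysis is not specified beyond the thresholds quoted.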
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa Y, Brody E and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley MM (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and Kunst F (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae
encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington, DC: American Society for Microbiology Press.
Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.
Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit CM, Brown JR, Ingraham K, Bryant AP and
Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
truncated in its C-terminal end and has lost the critical tyrosyl residue.
Comparative genomics defines a composite organization of the S. agalactiae chromosome
For comparative genomics analysis we took advantageof the availability of the complete genome sequences oftwo related pathogenic species S pneumoniae strainTIGR4 and S pyogenes strain M1 In total 1173 and 1139of the 2118 genes of S agalactiae were consideredas orthologous with genes from S pyogenes and S pneu-moniae respectively Comparison of the order of ortholo-gous genes between S agalactiae and S pyogenesrevealed a high conservation of the chromosomal archi-tecture with only 36 breakpoints of synteny between bothgenomes (Fig 3) On the contrary low conservation ingene order between S agalactiae and S pneumoniae
was observed except for some operons and a few func-tionally related gene clusters (Fig 3) These findingsmay reflect the closer evolutionary relationship of S aga-lactiae with S pyogenes than with S pneumoniae andthe importance of recombination and transformation-mediated gene transfer in the evolution of the S pneumo-niae genome
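As an illustration of what a breakpoint of synteny means in the comparison above, the following minimal sketch counts places where two orthologs that are neighbours in one genome are not neighbours in the other. It is not the procedure used for Fig. 3, which is not described here: the function name, the rank-based input and the rule that any jump other than one position counts as a breakpoint are all assumptions, and the circularity of the chromosome is ignored.

```python
def count_synteny_breakpoints(ranks_in_b):
    """Count synteny breakpoints between two genomes.

    ranks_in_b lists, for the orthologs taken in their order along genome A,
    the rank of each ortholog along genome B (ranks are assumed to be
    consecutive integers over the shared orthologs only).
    A breakpoint is counted whenever two orthologs adjacent in A are not
    adjacent in B, in either orientation.
    """
    breakpoints = 0
    for previous, current in zip(ranks_in_b, ranks_in_b[1:]):
        if abs(current - previous) != 1:
            breakpoints += 1
    return breakpoints


# Toy example (hypothetical data): one inverted block of four genes.
toy_ranks = [0, 1, 2, 6, 5, 4, 3, 7, 8]
print(count_synteny_breakpoints(toy_ranks))
```

An inverted segment, as in the toy example, contributes one breakpoint at each of its two ends, while the genes inside the segment remain neighbours.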
Analysis of the functions and locations of the 945 genes without an ortholog in S. pyogenes reveals a particular feature of the S. agalactiae genome. These genes are clustered in 200 regions dispersed around the chromosome, ranging in size from 1 to 77 genes, which can be clearly divided into two groups. The first group contains 471 genes, which are clustered in 14 large islands (including the three copies of the integrative plasmid) containing 11–77 genes (Fig. 1 and Table 2). These 14 islands contain all genes related to mobile elements except the only intact IS element. Six of these islands (I, II, IV, V, XI and XIII) are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
Fig. 2. Target DNA sequences at the three integration sites (A, B and C) of pNEM316-1 in the chromosome of NEM316. For two insertion sites (A and B), the corresponding sequences in a strain devoid of pNEM316-1, NEM-lj107, are indicated below the sequence of the NEM316 chromosome. The sequence of the circularized attachment (att) site of pNEM316-1 is indicated in (D). In this case, the DNA sequence located between the inverted repeats of pNEM316-1 was unreadable, presumably because it results from the sequencing of a mixture of excisants originating from the insertions in A, B and C. The imperfect inverted repeats delineating pNEM316-1 (att-L and att-R) are indicated by horizontal arrows (right and left extremities were arbitrarily defined). The 9 or 10 bp duplicated at pNEM316-1 insertion sites are boxed. Co-ordinates refer to the position +1 of the NEM316 chromosome.
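The grouping of genes lacking an S. pyogenes ortholog into contiguous regions can be sketched as a simple run-length computation along the chromosome. This is only an illustrative reading of the text: the function name and the boolean input are hypothetical, and the further division of regions into islands and islets relies on their relation to mobile elements, which this sketch does not attempt to model.

```python
from itertools import groupby


def regions_without_ortholog(has_ortholog):
    """Return the sizes (in genes) of maximal runs of consecutive genes
    that lack an ortholog.

    has_ortholog is a list of booleans, one per gene, in chromosomal order.
    """
    return [
        sum(1 for _ in run)
        for flag, run in groupby(has_ortholog)
        if not flag
    ]


# Toy chromosome of ten genes (hypothetical data).
toy_flags = [True, False, False, True, True, False, True, False, False, False]
print(regions_without_ortholog(toy_flags))

# Consistency check on the counts reported in the text:
# 14 islands plus 186 islets give the 200 regions mentioned.
print(14 + 186)
```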
Several of these islands show a mixture of genetic ‘signatures’ of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and therefore it is not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon, as well as the scpB and lmb genes, have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.
| Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA |
|---|---|---|---|---|---|---|
| I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC |
| II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA |
| III, VII, VIII | 49 (47) | 385757–432824 (III), 711824–758891 (VII), 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | |
| IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT |
| V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT |
| VI | 58 (57.7) | 639392–697130 | Transposase | Alpha like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | |
| IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | |
| X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | |
| XI | 11 (7.5) | 1253733–1261229 | Integrase | | 0 | Arg CCG |
| XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | |
| XIII | 47 (44.5) | 2036718–2081280 | Integrase | Camp factor, C5A peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT |
| XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | |
a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.
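The G + C comparison invoked above as evidence for an exogenous origin rests on a simple base-composition calculation. A minimal sketch is given below; the function name is hypothetical, ambiguous bases are simply ignored as an assumption of this sketch, and the example sequences are toy strings rather than data from the genome.

```python
def gc_fraction(sequence):
    """Return the G + C fraction of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator;
    any other character (for example N) is ignored.
    """
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    total = gc + sequence.count("A") + sequence.count("T")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Toy sequences (hypothetical data).
print(gc_fraction("ATGC"))        # half of the bases are G or C
print(gc_fraction("ATATATGCNN"))  # the two N characters are ignored
```

Applied to a gene or operon and to the whole chromosome, the two values can be compared directly, which is the kind of comparison behind the 42%, 41% and 40% figures quoted for the lac operon, scpB and lmb.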
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY that are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with its extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
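The relationship between the number of sequences, the genome size and the sequence coverage can be made explicit with a short calculation. The sketch below takes the 34 431 sequences reported above and the genome length of 2 211 485 bp given in the Summary, and assumes that the reported coverage is to be read as 8.5-fold; the mean usable read length it derives is therefore an inference of this sketch, not a value stated in the text.

```python
GENOME_LENGTH_BP = 2_211_485   # genome length given in the Summary
NUMBER_OF_READS = 34_431       # sequences used in the initial assembly
COVERAGE_FOLD = 8.5            # assumed reading of the reported coverage


def mean_read_length(genome_bp, n_reads, coverage):
    """Mean read length implied by coverage = n_reads * read_length / genome_bp."""
    return coverage * genome_bp / n_reads


implied = mean_read_length(GENOME_LENGTH_BP, NUMBER_OF_READS, COVERAGE_FOLD)
print(f"implied mean read length: {implied:.0f} bp")
```

The implied value of roughly 550 bp per read is in the range expected for capillary sequencing of that period, which is why the 8.5-fold reading is adopted here.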
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1), or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site, and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
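The two-tier length filter described above can be sketched as follows. The 80-codon and 40-codon limits come from the text; the numerical cut-off for a "high" coding probability is not given, so the `min_probability` default below is a purely hypothetical placeholder, as is the function name.

```python
def retain_cds(length_codons, coding_probability, min_probability=0.5):
    """Decide whether a candidate CDS is retained.

    CDSs longer than 80 codons are always retained. CDSs of 40 to 80 codons
    are retained only when their coding probability reaches min_probability
    (a hypothetical cut-off; the text only says "high"). Shorter candidates
    are left to the later BLASTX search of intergenic regions.
    """
    if length_codons > 80:
        return True
    if 40 <= length_codons <= 80:
        return coding_probability >= min_probability
    return False


# Toy candidates (hypothetical data): (length in codons, coding probability).
candidates = [(250, 0.10), (60, 0.90), (60, 0.20), (30, 0.99)]
print([retain_cds(length, prob) for length, prob in candidates])
```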
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999), and the Petrin algorithm was used to predict transcriptional terminators (d’Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
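The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched in a few lines. This is an illustrative reading only: the data structures, function names and gene identifiers are hypothetical, the best hit is taken here as the highest-scoring one, and the sketch assumes that the identity threshold is a percentage.

```python
def best_hit(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a, identity, length,
                            min_identity=50.0, ratio_range=(0.8, 1.25)):
    """Return sorted (a, b) pairs that are reciprocal best hits and pass
    the identity and length-ratio thresholds.

    identity maps (a, b) to percent identity; length maps a protein
    name to its length in residues.
    """
    forward = best_hit(a_vs_b)
    reverse = best_hit(b_vs_a)
    low, high = ratio_range
    pairs = []
    for a, b in forward.items():
        if reverse.get(b) != a:
            continue  # not reciprocal
        ratio = length[a] / length[b]
        if identity[(a, b)] >= min_identity and low <= ratio <= high:
            pairs.append((a, b))
    return sorted(pairs)


# Toy data with made-up identifiers.
a_vs_b = [("sagA", "spy1", 300), ("sagA", "spy2", 120), ("sagB", "spy2", 90)]
b_vs_a = [("spy1", "sagA", 300), ("spy2", "sagA", 120), ("spy2", "sagB", 90)]
identity = {("sagA", "spy1"): 72.0, ("sagB", "spy2"): 55.0}
length = {"sagA": 400, "sagB": 200, "spy1": 380, "spy2": 210}
print(bidirectional_best_hits(a_vs_b, b_vs_a, identity, length))
```

In the toy data, sagB is not reported as an ortholog of spy2 even though its identity and length pass the thresholds, because the best hit of spy2 in the other direction is sagA.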
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n° 17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d’Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D’Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett’s Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and K. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates. Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
Genome sequence of Streptococcus agalactiae 1507
copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513
are found next to tRNA genes, which may be their insertion sites, as reported for typical pathogenicity islands (Hacker et al., 1997). Most importantly, the majority of known or putative virulence genes of S. agalactiae were found within these islands (Table 2), and these may therefore be defined as pathogenicity islands. For example, island XII contains the lmb and scpB genes, island IV the alp2 gene, and island VI the cyl operon and two type B sortase genes linked with three genes encoding LPXTG surface proteins. Overall, 14 out of 30 genes encoding LPXTG proteins are carried by such islands. In addition to virulence traits, genes involved in adaptation to adverse conditions might have been acquired through such elements, as islands I and XIV each contain an osmoprotectant ABC transporter. Genes involved in the arginine deiminase pathway, which may play a role in resistance to acidic conditions, are also found on island XIV. The genes encoding the ECF sigma factor, as well as five two-component regulatory systems and one Rgg-like regulator, are located on such islands, indicating their role in specific adaptation.
Fig. 3. Synteny between S. agalactiae and S. pyogenes (A) and S. agalactiae and S. pneumoniae (B). Each diamond represents one gene having an ortholog in both genomes, with co-ordinates corresponding to the position in kb in each genome.

Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF-type sigma factor, acetyl transferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824; 711824–758891; 1013025–1060094 | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyl transferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multi drug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyl transferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated.

Several of these islands show a mixture of genetic 'signatures' of different mobile elements, such as replication and transfer proteins of plasmids, phage proteins, integrases or transposases. Their present structure may have been the result of several consecutive recombination events, and it is therefore not possible to define these islands as putative plasmids, phages or conjugative transposons, nor to predict the exact limits of these inserted elements. Furthermore, on the basis of bidirectional best hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42, 41 and 40% G + C respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein as well as a truncated sortase gene. These observations indicate that not only does the chromosome of S. agalactiae show a composite architecture, but so do the horizontally acquired islands, as they appear to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
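The G + C comparison used in the paragraph above can be illustrated with a short Python sketch. It is not the authors' procedure: the function names and the toy sequences are hypothetical, and real use would require the island and chromosome sequences themselves.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage."""
    # Count only unambiguous bases; N and other ambiguity codes are ignored.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_excess(region: str, genome: str) -> float:
    """G + C content of a region minus that of the genome, in percentage points."""
    return gc_percent(region) - gc_percent(genome)


# Toy sequences for illustration only (not S. agalactiae data).
toy_genome = "ATATATGCATATTAGCATAT" * 5
toy_region = "GCGCATGCATGCGCATATGC"

print(f"region G + C: {gc_percent(toy_region):.1f}%")
print(f"excess over genome: {gc_excess(toy_region, toy_genome):+.1f} points")
```

A positive excess for a region, as reported for the lac operon, scpB and lmb, is one line of evidence for an exogenous origin; it is suggestive rather than conclusive on its own.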
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479–gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
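One plausible way to delimit such islets is to walk along the chromosome and group consecutive genes that lack an ortholog. The Python sketch below illustrates that idea only. The gene identifiers are made up, the circular wrap-around of the chromosome is ignored, and the separate question of whether a cluster is related to mobile elements (island versus islet) is not modelled.

```python
from itertools import groupby


def find_islets(genes):
    """Group consecutive genes lacking an ortholog into islets.

    genes: list of (gene_id, has_ortholog) tuples in chromosomal order.
    Returns a list of islets, each a list of gene ids.
    """
    islets = []
    # groupby yields maximal runs of genes sharing the same ortholog status.
    for lacks_ortholog, run in groupby(genes, key=lambda g: not g[1]):
        if lacks_ortholog:
            islets.append([gene_id for gene_id, _ in run])
    return islets


# Hypothetical gene order, for illustration only.
example = [
    ("geneA", True),
    ("geneB", False),
    ("geneC", False),
    ("geneD", True),
    ("geneE", False),
    ("geneF", True),
]

islets = find_islets(example)
for islet in islets:
    print(len(islet), islet)
```

Summing the islet sizes and counting the islets gives the kind of totals quoted in the text (number of genes and number of islets).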
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, which harbours an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B, pSYX34 and XL10-blue KanR (Stratagene), and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
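The relationship between read count and sequence coverage can be checked with a small calculation. The sketch below is only an illustration: the read count is taken from the paragraph above and the genome length from the Summary, but the mean read length of 550 bp is an assumption (the text does not report one), and the uncovered-fraction estimate uses the idealized Lander–Waterman (Poisson) model rather than anything the authors describe.

```python
import math


def fold_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * mean_read_len_bp / genome_size_bp


N_READS = 34_431            # sequences assembled in the initial step (from the text)
GENOME_SIZE_BP = 2_211_485  # genome length given in the Summary
ASSUMED_READ_LEN_BP = 550   # assumption: not stated in the text

coverage = fold_coverage(N_READS, ASSUMED_READ_LEN_BP, GENOME_SIZE_BP)

# Idealized Poisson model: probability that a given base is covered by no read.
uncovered_fraction = math.exp(-coverage)

print(f"coverage: {coverage:.2f}-fold")
print(f"expected uncovered fraction: {uncovered_fraction:.2e}")
```

Even at this depth a random shotgun leaves some gaps, which is consistent with the gap-closure steps (PCR products, combinatorial PCR) described above.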
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
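The two length-based retention rules described above can be written as a small filter. This Python sketch is an illustration under stated assumptions: the `CandidateCDS` type is hypothetical, the numeric cutoff of 0.9 for a "high coding probability" is invented for the example (the text gives no value), and the final BLASTX search of intergenic regions is not modelled.

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    name: str
    length_codons: int
    coding_probability: float  # hypothetical gene-finder score in [0, 1]


LONG_CDS_MIN_CODONS = 80    # CDSs longer than this are retained outright
SHORT_CDS_MIN_CODONS = 40   # lower bound of the second pass
HIGH_PROBABILITY = 0.9      # assumed cutoff; not specified in the text


def retain(cds: CandidateCDS, high_probability: float = HIGH_PROBABILITY) -> bool:
    """Apply the two length-based retention rules to one candidate CDS."""
    if cds.length_codons > LONG_CDS_MIN_CODONS:
        return True
    if SHORT_CDS_MIN_CODONS <= cds.length_codons <= LONG_CDS_MIN_CODONS:
        return cds.coding_probability >= high_probability
    # Shorter candidates are left to the similarity-based search.
    return False


candidates = [
    CandidateCDS("long", 120, 0.20),
    CandidateCDS("short_good", 60, 0.95),
    CandidateCDS("short_poor", 60, 0.50),
    CandidateCDS("tiny", 30, 0.99),
]
kept = [c.name for c in candidates if retain(c)]
print(kept)
```

In the procedure described in the text, every retained CDS was additionally inspected by eye, so a filter of this kind would only produce the candidate list.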
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
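The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Hit` record, the use of bit score to pick the best hit, and the choice to apply the thresholds after the reciprocity check are all assumptions, and the protein names are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float    # percent identity of the alignment
    query_len: int     # protein length in residues
    subject_len: int
    bitscore: float


def best_hits(hits):
    """Keep, for each query, the hit with the highest bit score."""
    best = {}
    for hit in hits:
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return best


def orthologs(a_vs_b, b_vs_a, min_identity=50.0, min_ratio=0.8, max_ratio=1.25):
    """Return (a, b) pairs that are reciprocal best hits and pass both thresholds."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    pairs = []
    for query, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != query:
            continue  # not a bi-directional best hit
        ratio = hit.query_len / hit.subject_len
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((query, hit.subject))
    return sorted(pairs)


# Invented hits, for illustration only.
a_vs_b = [
    Hit("a1", "b1", 78.0, 300, 310, 500.0),
    Hit("a1", "b2", 40.0, 300, 290, 120.0),
    Hit("a2", "b2", 62.0, 200, 400, 210.0),  # reciprocal, but length ratio 0.5
    Hit("a3", "b3", 45.0, 150, 150, 90.0),   # reciprocal, but identity below 50%
]
b_vs_a = [
    Hit("b1", "a1", 78.0, 310, 300, 500.0),
    Hit("b2", "a2", 62.0, 400, 200, 210.0),
    Hit("b3", "a3", 45.0, 150, 150, 90.0),
]

pairs = orthologs(a_vs_b, b_vs_a)
print(pairs)
```

The length-ratio test guards against pairing a full-length protein with a fragment or a fusion, which a best-hit criterion alone would not exclude.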
The genome sequence and the annotation are accessible via a Sybase relational database, constructed according to the SubtiList model (Moszer et al., 2002), at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, and Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos P, Landy A, Abremski K, Egan JB, Haggard-Ljungquist E, Hoess RH, et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa Y, Brody E, and Thermes C (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker ID, Robinson OM, Bazan TS, Lopez-Osuna M, and Kretschmer RR (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert S, Kreikemeyer B, and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK, and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD, and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin DO, Beres SB, Yim HH, and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO, and Musser JM (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina I, Suvorov A, Ferrieri P, and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello R, Hill S, and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE, and Dolin R (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley MM (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P, and Kunst F (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP, and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P, and Charbit A (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson RL, Lee MK, Soderland C, Chi EY, and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon D, Abajian C, and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp C, Horensky DS, Michel JL, and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi E, Gasc AM, Sicard MA, and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker J, Blum-Oehler G, Mühldorfer I, and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden LO, Frithz E, and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL, and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan U, Ton-That H, Iwahara J, Schneewind O, and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono K, McIninch JD, and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones AL, Knoll KM, and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong F, Gowan S, Martin D, James G, and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer CS, Creti R, Michel JL, and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian SK, Liu G, Jensen ER, Lenoy E, and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei JM, Nourbakhsh F, Ford CW, and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL, and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan TW, Doran TI, Straus DC, and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S, and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer I, Jones LM, Moreira S, Fabry C, and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser JM, Mattingly SJ, Quentin R, Goudeau A, and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair S, Frehel C, Nguyen L, Escuyer V, and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair S, Milohanic E, and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen H, Brunak S, and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA, and Rood JI (eds). Washington DC: American Society for Microbiology Press.
Noel GJ, Katz SL, and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak R, Charpentier E, Braun JS, and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki M, Takamatsu D, Shimoji Y, and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen MJ, Lam AC, Antonio M, and Dunbar K (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.
Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit CM, Brown JR, Ingraham K, Bryant AP, and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L, and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M, and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE, and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin R, Huet H, Wang FS, Geslin P, Goudeau A, and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens CE, Smith S, Hulse M, Chi EY, and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R, and Podbielski A (1999) Lmb: a protein with similarities to the LraI adhesin family mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y, and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi Y, Konishi K, Cisar JO, and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong HH, Blue LE, James MA, and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T, and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with CamR and TetR for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M, and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
Table 2. Islands inserted in the S. agalactiae NEM316 chromosome.

Island | Number of genes (size, kb) | Position | Genes related to mobile elements (a) | Known genes and genes similar to known genes (a, b) | Pseudogenes | tRNA
I | 25 (18.5) | 232896–251357 | Integrase, plasmid replication, recombinase, resolvase | Osmoprotectant ABC transporter, regulator Rgg | 3 | Ala TGC
II | 19 (14.5) | 257878–272448 | Integrase, plasmid replication, DNA translocase | RNA polymerase ECF-type sigma factor, acetyltransferase, ABC transporter | 0 | Leu CAA
III, VII, VIII | 49 (47) | 385757–432824 (III); 711824–758891 (VII); 1013025–1060094 (VIII) | Plasmid replication, topoisomerase, single strand binding protein, plasmid transfer complex protein, plasmid partition protein, plasmid replication initiation | Clp protease ATP-binding subunit ClpA, LPXTG proteins (3) | 0 | -
IV | 23 (18.8) | 489525–508303 | Integrase, phage and plasmid related proteins | Decarboxylase, alcohol dehydrogenase, oxidoreductase, Alp2 | 3 | Thr GGT
V | 10 (9.5) | 612484–621922 | Integrase, transposase (2) | ABC transporter, two-component regulatory VncRS | 2 | Arg CCT
VI | 58 (57.7) | 639392–697130 | Transposase | Alpha-like, Hsp33, Cyl locus, LPXTG proteins (3), sortase (2), chaperonin | 7 | -
IX | 27 (25.5) | 1093449–1118984 | DNA translocase | Two-component LytRS, carbon starvation protein A, immunogenic secreted protein | 2 | -
X | 36 (33.5) | 1163887–1197331 | Plasmid relaxase and mobilisation, transfer complex proteins TrsK, TrsE, plasmid replication initiation | LPXTG proteins (3), DNA methyltransferase | 0 | -
XI | 11 (7.5) | 1253733–1261229 | Integrase | - | 0 | Arg CCG
XII | 70 (81.5) | 1341772–1423296 | Transposase (3), DNA polymerase, exonuclease, integrase (2), plasmid replication, type II DNA modification, transposon relaxase, helicase (2), plasmid transfer complex protein (TraE and TrsK) | Multidrug resistance, Lmb, ScpB, lactose utilisation operon, LPXTG proteins, DNA methyltransferase | 4 | -
XIII | 47 (44.5) | 2036718–2081280 | Integrase | CAMP factor, C5a peptidase, methionine synthase MetC, glycerol dehydrogenase, aminoglycoside 6-adenylyltransferase, ABC transporter, two-component regulator | 4 | Lys CTT
XIV | 22 (22.5) | 2143831–2166324 | Plasmid replication protein, integrase, integrase (22) | Two-component regulator (2), carbamate kinase, ornithine carbamoyltransferase, osmoprotectant ABC transporter | 2 | -

a. Number of paralogs within parentheses; virulence genes in bold letters.
b. Only relevant genes are indicated; [symbol not recoverable], pseudogenes.
hits by BLASTP, some genes within these islands have been defined as orthologous to genes of S. pyogenes or S. pneumoniae, although they seem to have been acquired by horizontal gene transfer. For example, on island XII, the lactose utilization operon shows higher similarity to its S. pneumoniae counterpart than to the one from S. pyogenes, in contradiction with the phylogeny of the three species. On the same island, the scpB and lmb genes, encoding the C5a peptidase and the laminin-binding protein, respectively, are almost identical to the corresponding genes in S. pyogenes, indicating a horizontal gene transfer between these two species (Chmouryguina et al., 1996; Franken et al., 2001). Furthermore, the lac operon as well as the scpB and lmb genes have a higher G + C content than the chromosomal average (42%, 41% and 40% G + C, respectively), further supporting an exogenous origin of these genes. These islands also seem to undergo rapid evolution, as they carry 25 of the 36 pseudogenes identified in the genome. For example, island VI carries seven pseudogenes, including a truncated gene showing 70% similarity to the Alp2 protein, as well as a truncated sortase gene. These observations indicate that not only the chromosome of S. agalactiae but also the horizontally acquired islands show a composite architecture, as the islands seem to be subject to rearrangements and rapid evolution. A broad diversity among different isolates of S. agalactiae may therefore be expected.
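The G + C argument above compares the base composition of a region with that of the whole chromosome. A minimal sketch of that comparison follows; the function names gc_percent and gc_excess and the toy sequences are hypothetical and serve only as illustration, they are not part of the analysis pipeline used for NEM316.

```python
def gc_percent(seq: str) -> float:
    """Return the G + C content of a DNA sequence as a percentage.

    Ambiguous bases (e.g. N) are ignored, so the percentage is computed
    over unambiguous A, C, G and T positions only.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def gc_excess(region: str, chromosome: str) -> float:
    """Difference in G + C percentage points between a region and the chromosome."""
    return gc_percent(region) - gc_percent(chromosome)


if __name__ == "__main__":
    # Toy sequences (hypothetical): a GC-richer "island" in an AT-rich background.
    background = "ATTATAGCATTAAATGCTTA" * 5
    island = "ATGCGCATGGCATTGCCGAT"
    print(f"region: {gc_percent(island):.1f}% G + C")
    print(f"excess over background: {gc_excess(island, background):+.1f} points")
```

A positive excess for a region carrying mobile-element genes is consistent with, but does not prove, an exogenous origin; similarity and phylogeny, as discussed above, carry the rest of the argument.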
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may be competent under certain growth conditions and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains, plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s, 50°C for 15 s, 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
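The relation between the number of shotgun reads and fold coverage can be sketched as reads times mean read length divided by genome size. The genome size and read count below are the values reported in this article; the mean read length of 500 bp is an assumption made only for illustration, as it is not stated in the text, and the function name fold_coverage is hypothetical.

```python
GENOME_SIZE_BP = 2_211_485  # NEM316 genome length reported in the Summary
N_READS = 34_431            # sequences used in the initial assembly


def fold_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Average number of times each base is sequenced: N * L / G."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * mean_read_len_bp / genome_size_bp


if __name__ == "__main__":
    assumed_read_len = 500  # bp; assumption, not given in the text
    cov = fold_coverage(N_READS, assumed_read_len, GENOME_SIZE_BP)
    print(f"coverage at {assumed_read_len} bp per read: {cov:.2f}-fold")
```

Under this assumed read length the estimate comes out at single-digit fold coverage; a different mean read length scales the result proportionally.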
Amplification and sequencing of the att site of the circular form of pNEM316-1 were performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 were performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining Genemark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The Genemark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by Genemark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
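The two-tier length filter described above can be sketched as a simple decision rule. The 80- and 40-codon limits come from the text; the numeric cutoff for a "high coding probability" is not given, so the value of 0.9 used here is an assumption, and the class CandidateCDS and function retain are hypothetical names rather than part of CAAT-Box or Genemark.

```python
from dataclasses import dataclass


@dataclass
class CandidateCDS:
    name: str
    codons: int          # length of the candidate coding sequence in codons
    coding_prob: float   # coding probability score (hypothetical scale 0..1)


def retain(cds: CandidateCDS,
           long_limit: int = 80,
           short_limit: int = 40,
           prob_cutoff: float = 0.9) -> bool:
    """Apply the two-tier filter: keep long CDSs, keep short ones only if
    their coding probability is high, and drop anything shorter still."""
    if cds.codons > long_limit:
        return True
    if short_limit <= cds.codons <= long_limit:
        return cds.coding_prob >= prob_cutoff  # assumed cutoff
    return False


if __name__ == "__main__":
    candidates = [
        CandidateCDS("orf_a", 120, 0.10),
        CandidateCDS("orf_b", 60, 0.95),
        CandidateCDS("orf_c", 60, 0.50),
        CandidateCDS("orf_d", 30, 0.99),
    ]
    kept = [c.name for c in candidates if retain(c)]
    print(kept)
```

Candidates below 40 codons are dropped by this rule; in the procedure described above such genes are recovered separately, by the BLASTX search of intergenic regions.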
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits by S. agalactiae proteome BLASTP comparisons. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
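The ortholog definition above (bi-directional best hits plus identity and length-ratio thresholds) can be sketched as follows. The thresholds are read here as 50% identity and a length ratio of 0.8 to 1.25; the Hit record, the use of bit score to rank hits, and the function names are assumptions made for illustration and do not describe the actual implementation used for this genome.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float    # percent identity of the alignment
    bitscore: float    # used here to rank hits (assumption)
    query_len: int     # protein length of the query
    subject_len: int   # protein length of the subject


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-scoring hit for each query protein."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


def orthologs(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
              min_identity: float = 50.0,
              min_ratio: float = 0.8,
              max_ratio: float = 1.25) -> List[Tuple[str, str]]:
    """Return (a, b) pairs that are each other's best hit and pass both thresholds."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    pairs = []
    for query, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != query:
            continue  # not reciprocal
        ratio = hit.query_len / hit.subject_len
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((query, hit.subject))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy hits with hypothetical identifiers.
    a_vs_b = [
        Hit("gbsA", "spyA", 72.0, 300.0, 200, 210),
        Hit("gbsA", "spyB", 40.0, 90.0, 200, 150),
        Hit("gbsB", "spyC", 45.0, 120.0, 100, 100),  # identity too low
        Hit("gbsC", "spyD", 80.0, 250.0, 100, 300),  # length ratio too small
    ]
    b_vs_a = [
        Hit("spyA", "gbsA", 72.0, 300.0, 210, 200),
        Hit("spyC", "gbsB", 45.0, 120.0, 100, 100),
        Hit("spyD", "gbsC", 80.0, 250.0, 300, 100),
    ]
    print(orthologs(a_vs_b, b_vs_a))
```

Because 1/0.8 equals 1.25, the length-ratio window is symmetric: a pair passes or fails regardless of which protein is taken as the numerator.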
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and K (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and Lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive Pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–829.
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
Genome sequence of Streptococcus agalactiae 1509
copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513
hits by BLASTP some genes within these islands havebeen defined as orthologous to genes of S pyogenes orS pneumoniae although they seemed to have beenacquired by horizontal gene transfer For example onisland XII the lactose utilization operon shows higher sim-ilarity to its S pneumoniae counterpart than to the onefrom S pyogenes in contradiction with the phylogeny ofthe three species On the same island the scpB and lmbgenes encoding the C5A peptidase and the laminin bind-ing protein respectively are almost identical to the corre-sponding genes in S pyogenes indicating a horizontalgene transfer between these two species (Chmouryguinaet al 1996 Franken et al 2001) Furthermore the lacoperon as well as the scpB and lmb genes have a higherG + C content than the chromosomal average (42 41 and40 G + C respectively) further supporting an exogenousorigin of these genes These islands seem also to undergorapid evolution as they carried 25 of the 36 pseudogenesidentified in the genome For example island VI carriesseven pseudogenes including a truncated gene showing70 similarity to the Alp2 protein as well as a truncatedsortase gene These observations indicate that not onlydoes the chromosome of S agalactiae show a compositearchitecture but also the horizontally acquired islands doas they seem to be submitted to rearrangements andrapid evolution A broad diversity among different isolatesof S agalactiae may therefore be expected
The second group of S. agalactiae genes without an ortholog in S. pyogenes strain M1 contains 468 genes clustered in 186 islets (1–17 genes), which do not seem to be related to mobile elements. These islets are dispersed around the chromosome and correspond either to deletions in the S. pyogenes genome (i.e. orthologous genes are present in the S. pneumoniae or in the L. lactis genomes) or to insertions in the NEM316 genome. Some genes within these islets may also play a role in the virulence of S. agalactiae. The best characterized of these regions encodes the functions required for capsule biosynthesis. Another interesting region corresponds to the gene cluster possibly involved in antigen B biosynthesis (gbs1479-gbs1494). It also contains two genes encoding type B sortase proteins and three genes encoding LPXTG proteins, as well as a rofA-like regulatory gene (gbs1479). The longest islet carries gbs1529, encoding a protein similar to the S. gordonii haemagglutinin protein, which clusters with six glycosylases and two paralogous genes of secA and secY, which are possibly required for the secretion of this highly repetitive protein.
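An islet, as used above, is a run of neighbouring genes that all lack an ortholog; 468 genes in 186 islets corresponds to about 2.5 genes per islet on average. The sketch below shows one way such runs could be extracted from an ordered gene list. It is an assumption-laden illustration: the function name is hypothetical, the chromosome is treated as linear (a run spanning the origin would be split in two), and no attempt is made to separate islets from the mobile-element-related islands.

```python
from itertools import groupby


def find_islets(has_ortholog: list[bool]) -> list[tuple[int, int]]:
    """Return (start_index, length) for each run of consecutive genes
    that lack an ortholog, given genes in chromosomal order."""
    islets = []
    position = 0
    for flag, run in groupby(has_ortholog):
        length = sum(1 for _ in run)
        if not flag:  # a run of genes without ortholog is one islet
            islets.append((position, length))
        position += length
    return islets
```

For example, `find_islets([True, False, False, True, False])` returns two islets, one of two genes starting at index 1 and one of a single gene at index 4.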
What is the mechanism of acquisition of these islets? With the exception of the components of the regulatory circuit responsible for the induction of competence in S. pneumoniae (ComA, ComB and ComCDE), we identified orthologs of all genes necessary for competence in S. pneumoniae. The NEM316 genome also encodes a homologue (gbs0090) of the two S. pneumoniae competence-specific sigma factors, comX1 and comX2, but the cilE gene (comC in B. subtilis) seems to be truncated. Although S. agalactiae is not known to be competent, it is possible that some isolates may, in certain growth conditions, be competent and/or that the ancestor of S. agalactiae was competent. Therefore, most of the islets corresponding to insertions may have been acquired by transformation, although transduction and conjugation cannot be excluded. The part of the S. agalactiae genome missing in S. pyogenes may account for specificities of this species, whereas the islands related to mobile elements seem to correspond to a more recent evolution and should prove essential for virulence.
Conclusion
The genome of S. agalactiae contains a large number of chromosomal islands, a unique feature with regard to the different streptococcal genomes determined up to now, but reminiscent of pathogenic Escherichia coli (Perna et al., 2001). Accordingly, most of the known and putative virulence genes are located within these chromosomal islands, which show characteristics of pathogenicity islands. They also contain most of the pseudogenes identified, as well as genes probably mediating horizontal gene transfer, strongly suggesting that these islands undergo rapid evolution. The unexpected similarity in genome organization between pathogenic E. coli and S. agalactiae might not be fortuitous, as these two bacteria are commensals of the mammalian gut, with an extremely diverse and abundant microbial flora. An exciting evolutionary hypothesis is that pathogenic E. coli and S. agalactiae have gradually evolved as pathogens through successive acquisition of exogenous virulence factors carried by such islands. In particular, the emergence of hypervirulent S. agalactiae clones might result from such horizontal gene transfer (Musser et al., 1989; Quentin et al., 1995; Blumberg et al., 1996).
The development of a GBS vaccine to fight invasive neonatal disease is considered a priority by health authorities worldwide. The complete genome sequence of S. agalactiae strain NEM316 and its analysis now open new avenues for the identification of novel potential vaccine targets. However, the apparent genetic flexibility among isolates should be taken into consideration for the development of a universal vaccine.
Experimental procedures
Bacterial strains plasmids and growth conditions
Streptococcus agalactiae strain NEM316, a serotype III strain (ATCC 12403, CIP 82.45) responsible for a fatal case of septicaemia, was chosen from our laboratory collection. NEM-lj107 is a serotype Ia strain isolated from urine at the Hôpital Necker-Enfants Malades. For the construction of the BAC library, the large insert library and the shotgun libraries, the following plasmids and recipient strains were used: pBeloBac11 and strain DH10B; pSYX34 and XL10-blue KanR (Stratagene); and pcDNA2.1 (Invitrogen) and XL2-blue. Escherichia coli and S. agalactiae strains were grown in Luria–Bertani and brain–heart infusion medium, respectively.
Sequencing and assembly methods
Genome sequencing was performed using the whole genome shotgun strategy (Fleischmann et al., 1995) as described by Frangeul et al. (1999). Two libraries (1–2 kb and 2–3 kb inserts) were generated by random mechanical shearing of genomic DNA and cloning into pcDNA-2.1 (Invitrogen). A scaffold was obtained by end-sequencing clones from a BAC library with an average fragment size of 50–90 kb and from a medium-size insert library (5–10 kb) in the low copy number vector pSYX34 (Xu and Fomenkov, 1994). Recombinant plasmids were used as templates for cycle sequencing reactions consisting of 35 cycles (96°C for 30 s; 50°C for 15 s; 60°C for 4 min) in a thermocycler. Samples were precipitated and loaded onto a 96-lane capillary automatic 3700 DNA sequencer (Applied Biosystems). In an initial step, 34 431 sequences from the four libraries were assembled into 230 contigs using the Phred/Phrap/Consed software (Ewing and Green, 1998; Gordon et al., 1998) (sequence coverage 8.5-fold). CAAT-Box (L. Frangeul, unpublished) was used to predict links between contigs. Polymerase chain reaction products amplified from NEM316 chromosomal DNA as template were used to fill gaps and to re-sequence low quality regions using primers designed by Consed. Physical gaps were closed by using combinatorial PCR. The correctness of the assembly was confirmed by analysing the BAC clone scaffold and by ensuring that the physical map deduced from the genome sequence was identical to the one obtained experimentally by pulsed field gel electrophoresis.
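The assembly figures can be cross-checked with the usual relation coverage = (number of reads × mean read length) / genome length. The sketch below assumes a coverage of 8.5-fold and the genome length of 2 211 485 bp given in the Summary. The mean read length is not stated in the text; it is derived here only as an illustration, and the function names are hypothetical.

```python
GENOME_LEN = 2_211_485   # bp, from the Summary
N_READS = 34_431         # sequences in the initial assembly
COVERAGE = 8.5           # assumed reading of the reported fold coverage


def fold_coverage(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * mean_read_len / genome_len


def implied_read_length(n_reads: int, coverage: float, genome_len: int) -> float:
    """Mean read length that would give the stated coverage."""
    return coverage * genome_len / n_reads


mean_len = implied_read_length(N_READS, COVERAGE, GENOME_LEN)
print(f"implied mean read length: {mean_len:.0f} bases")
```

Under these assumptions the implied mean read length is roughly 546 bases, which is in the range expected for capillary sequencing reads.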
Amplification and sequencing of the att site of the circular form of pNEM316-1 was performed by PCR using the following oligonucleotide pair: 5′-CAATAACCATTCTGTAGATCCTTC-3′ and 5′-TTTAACCAGTTCAAACGAAAGT-3′. Amplification and sequencing of the two chromosomal insertion sites in NEM-lj107 was performed using the oligonucleotides 5′-TAAAAGGTTTTCTCAGAGTATTATCA-3′ and 5′-TTTTCCTCTTAAGGGAGTAAGC-3′ (insertion 1) or 5′-ATTCATGTCCATTCACGACC-3′ and 5′-TCCCACTTCCATTCATAAACT-3′ (insertion 2).
Annotation methods
The CAAT-Box environment (L. Frangeul, unpublished) was used for genome annotation. Coding sequences (CDS) were defined by combining GeneMark predictions (Isono et al., 1994) with visual inspection of each open reading frame (ORF) for the presence of a start codon with an upstream ribosome binding site and BLASTP similarity searches on the Nrprot database (Altschul et al., 1997). The GeneMark predictions were trained on a set of ORFs longer than 300 codons encoding proteins similar to proteins with known function present in public databases. Initially, only CDSs longer than 80 codons were retained. Subsequently, all CDSs between 40 and 80 codons were searched using the same matrix, but only those with a high coding probability as predicted by GeneMark were retained. In a final step, all intergenic regions were searched for short or truncated genes by BLASTX comparisons with protein sequence libraries. Limits of the ribosomal RNA operons were identified by homology with the other streptococcal genomes. tRNAs were searched by tRNAscan-SE (Lowe and Eddy, 1997).
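The two-pass length filter described above can be summarized as a simple decision rule. The sketch below is an illustration only: the `Cds` record and the `prob_cutoff` value are assumptions, since the text says only that short CDSs needed a "high coding probability" and gives no numerical threshold. The subsequent visual inspection and BLASTX search of intergenic regions are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Cds:
    name: str
    codons: int
    coding_probability: float  # hypothetical gene-finder score in [0, 1]


def retain_cds(cds: Cds, prob_cutoff: float = 0.9) -> bool:
    """Apply the length rule: keep CDSs longer than 80 codons; keep CDSs of
    40 to 80 codons only if their coding probability is high.
    prob_cutoff is an assumed value, not taken from the paper."""
    if cds.codons > 80:
        return True
    if 40 <= cds.codons <= 80:
        return cds.coding_probability >= prob_cutoff
    return False
```

Candidates shorter than 40 codons are rejected by this rule regardless of score; in the procedure described here, such genes could only be recovered by the final BLASTX step.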
All predicted CDSs were examined visually. Function predictions were based on BLASTP similarity searches and on the analysis of motifs using the PFAM databases. Toppred2 was used to identify transmembrane domains (Claros and von Heijne, 1994), SignalP v2.0 was used to predict signal peptide regions (Nielsen et al., 1999) and the Petrin algorithm was used to predict transcriptional terminators (d'Aubenton Carafa et al., 1990). Lipoproteins were defined as proteins containing a lipoprotein modification/processing motif (Hayashi and Wu, 1990) and a signal sequence identified by SignalP v2.0. Secreted proteins were defined as proteins containing a signal peptide (predicted by SignalP v2.0) but no other transmembrane domain. Each protein was manually inspected. Orthologs between S. agalactiae and S. pyogenes, S. pneumoniae or L. lactis were defined as genes showing bi-directional best hits in BLASTP comparisons with the S. agalactiae proteome. The threshold was set to a minimum of 50% sequence identity and a ratio of 0.8–1.25 of the protein length.
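The ortholog definition above (bi-directional best hits, at least 50% identity, protein length ratio between 0.8 and 1.25) can be sketched as follows. Several details are assumptions made for the illustration and are not specified in the text: the best hit is chosen by highest score, identity is read from the forward hit, and the length ratio is taken as the length of the S. agalactiae protein divided by that of its partner. The `Hit` record and function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity of the alignment
    score: float     # alignment score used to rank hits (assumed: higher is better)


def best_hits(hits: list[Hit]) -> dict[str, Hit]:
    """Keep, for each query protein, its highest-scoring hit."""
    best: dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.score > best[hit.query].score:
            best[hit.query] = hit
    return best


def orthologs(forward: list[Hit], reverse: list[Hit],
              len_a: dict[str, int], len_b: dict[str, int],
              min_identity: float = 50.0,
              min_ratio: float = 0.8, max_ratio: float = 1.25) -> list[tuple[str, str]]:
    """Return (a, b) pairs that are reciprocal best hits and pass the
    identity and length-ratio thresholds."""
    fwd = best_hits(forward)   # genome A proteins searched against genome B
    rev = best_hits(reverse)   # genome B proteins searched against genome A
    pairs = []
    for a, hit in fwd.items():
        b = hit.subject
        back = rev.get(b)
        if back is None or back.subject != a:
            continue  # not reciprocal
        ratio = len_a[a] / len_b[b]
        if hit.identity >= min_identity and min_ratio <= ratio <= max_ratio:
            pairs.append((a, b))
    return sorted(pairs)
```

Because 0.8 and 1.25 are reciprocals, the length criterion gives the same answer whichever of the two proteins is placed in the numerator.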
The genome sequence and the annotation are accessible via a Sybase relational database constructed according to the SubtiList model (Moszer et al., 2002) at http://genolist.pasteur.fr/SagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscript and Louis Malin Jones for the construction of the SagaList database. Financial support from the Institut Pasteur (Programme Transversal de Recherche n°17), CNRS URA 2171 «Génétique des Génomes» and Ministère de la Recherche (Réseaux des Génopoles) is gratefully acknowledged.
References
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Argos, P., Landy, A., Abremski, K., Egan, J.B., Haggard-Ljungquist, E., Hoess, R.H., et al. (1986) The integrase family of site-specific recombinases: regional similarities and global diversity. EMBO J 5: 433–440.
d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990) Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835–858.
Becker, I.D., Robinson, O.M., Bazan, T.S., Lopez-Osuna, M., and Kretschmer, R.R. (1981) Bactericidal capacity of newborn phagocytes against group B beta-hemolytic streptococci. Infect Immun 34: 535–539.
Beckert, S., Kreikemeyer, B., and Podbielski, A. (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537.
Bierne, H., Mazmanian, S.K., Trost, M., Pucciarelli, M.G., Liu, G., Dehoux, P., et al. (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881.
Blumberg, H.M., Stephens, D.S., Modansky, M., Erwin, M., Elliot, J., Facklam, R.R., et al. (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373.
Bolken, T.C., Franke, C.A., Jones, K.F., Zeller, G.O., Jones, C.H., Dutton, E.K., and Hruby, D.E. (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80.
Bolotin, A., Wincker, P., Mauger, S., Jaillon, O., Malarme, K., Weissenbach, J., Ehrlich, S.D., and Sorokin, A. (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753.
Brunskill, E.W., and Bayles, K.W. (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618.
Chaffin, D.O., Beres, S.B., Yim, H.H., and Rubens, C.E. (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477.
Chaussee, M.S., Sylva, G.L., Sturdevant, D.E., Smoot, L.M., Graham, M.R., Watson, R.O., and Musser, J.M. (2002) Rgg influences the expression of multiple regulatory loci to coregulate virulence factor expression in Streptococcus pyogenes. Infect Immun 70: 762–770.
Chhatwal, G.S. (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208.
Chmouryguina, I., Suvorov, A., Ferrieri, P., and Cleary, P.P. (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390.
Claros, M.G., and von Heijne, G. (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686.
D'Mello, R., Hill, S., and Poole, R.K. (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763.
Edwards, M.S., and Baker, C.J. (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell, G.L., Bennett, J.E., and Dolin, R. (eds). New York: Churchill Livingstone, pp. 1835–1845.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194.
Farley, M.M. (2001) Group B streptococcal disease in nonpregnant adults. Clin Infect Dis 33: 556–561.
Ferretti, J.J., McShan, W.M., Ajdic, D., Savic, D.J., Savic, G., Lyon, K., et al. (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, J.M., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Frangeul, L., Nelson, K.E., Buchrieser, C., Danchin, A., Glaser, P., and Kunst, F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634.
Franken, C., Haase, G., Brandt, C., Weber-Heynemann, J., Martin, S., Lammler, C., et al. (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and lmb. Mol Microbiol 41: 925–935.
Gaillot, O., Bregenholt, S., Jaubert, F., Di Santo, J.P., and Berche, P. (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943.
Garandeau, C., Reglier-Poupet, H., Dubail, I., Beretti, J.L., Berche, P., and Charbit, A. (2002) The sortase SrtA of Listeria monocytogenes is involved in processing of internalin and in virulence. Infect Immun 70: 1382–1390.
Gibson, R.L., Lee, M.K., Soderland, C., Chi, E.Y., and Rubens, C.E. (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485.
Gordon, D., Abajian, C., and Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
Gravekamp, C., Horensky, D.S., Michel, J.L., and Madoff, L.C. (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583.
Guenzi, E., Gasc, A.M., Sicard, M.A., and Hakenbeck, R. (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515.
Hacker, J., Blum-Oehler, G., Mühldorfer, I., and Tschäpe, H. (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097.
Hayashi, S., and Wu, H.C. (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471.
Heden, L.O., Frithz, E., and Lindahl, G. (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490.
Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170.
Holmes, A.R., McNab, R., Millsap, K.W., Rohde, M., Hammerschmidt, S., Mawdsley, J.L., and Jenkinson, H.F. (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408.
Hoskins, J., Alborn, W.E., Jr, Arnold, J., Blaszczak, L.C., Burgett, S., DeHoff, B.S., et al. (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717.
Ilangovan, U., Ton-That, H., Iwahara, J., Schneewind, O., and Clubb, R.T. (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061.
Isono, K., McIninch, J.D., and Borodovsky, M. (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269.
Jones, A.L., Knoll, K.M., and Rubens, C.E. (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455.
Keefe, G.P. (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437.
Kong, F., Gowan, S., Martin, D., James, G., and Gilbert, G.L. (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626.
Krem, M.M., and Di Cera, E. (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045.
Lachenauer, C.S., Creti, R., Michel, J.L., and Madoff, L.C. (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635.
Lancefield, R.C., and Hare, R. (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349.
Lau, G.W., Haataja, S., Lonetto, M., Kensit, S.E., Marra, A., Bryant, A.P., et al. (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571.
Levin, J.C., and Wessels, M.R. (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219.
Lin, B., Averett, W.F., Novak, J., Chatham, W.W., Hollingshead, S.K., Coligan, J.E., et al. (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
Mazmanian, S.K., Liu, G., Jensen, E.R., Lenoy, E., and Schneewind, O. (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515.
Mei, J.M., Nourbakhsh, F., Ford, C.W., and Holden, D.W. (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407.
Michel, J.L., Madoff, L.C., Olson, K., Kling, D.E., Kasper, D.L., and Ausubel, F.M. (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064.
Mickelson, M.N. (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105.
Miller, R.A., and Britigan, B.E. (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18.
Milligan, T.W., Doran, T.I., Straus, D.C., and Mattingly, S.J. (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33.
Molinari, G., Rohde, M., Talay, S.R., Chhatwal, G.S., Beckert, S., and Podbielski, A. (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114.
Moszer, I., Jones, L.M., Moreira, S., Fabry, C., and Danchin, A. (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucl Acids Res 30: 62–65.
Musser, J.M., Mattingly, S.J., Quentin, R., Goudeau, A., and Selander, R.K. (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735.
Nair, S., Frehel, C., Nguyen, L., Escuyer, V., and Berche, P. (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196.
Nair, S., Milohanic, E., and Berche, P. (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068.
Navarre, W.W., and Schneewind, O. (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229.
Nielsen, H., Brunak, S., and von Heijne, G. (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9.
Nizet, V., and Rubens, C.E. (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti, V.A., Novick, R.P., Ferretti, J.J., Portnoy, D.A., and Rood, J.I. (eds). Washington, DC: American Society for Microbiology Press.
Noel, G.J., Katz, S.L., and Edelson, P.J. (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123.
Novak, R., Charpentier, E., Braun, J.S., and Tuomanen, E. (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57.
Osaki, M., Takamatsu, D., Shimoji, Y., and Sekizaki, T. (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982.
Pallen, M.J., Lam, A.C., Antonio, M., and Dunbar, K. (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102.
Perna, N.T., Plunkett, G., 3rd, Burland, V., Mau, B., Glasner, J.D., Rose, D.J., et al. (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533.
Petit, C.M., Brown, J.R., Ingraham, K., Bryant, A.P., and Holmes, D.J. (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233.
Polissi, A., Pontiggia, A., Feger, G., Altieri, M., Mottl, H., Ferrari, L., and Simon, D. (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629.
Poyart, C., Pellegrini, E., Gaillot, O., Boumaila, C., Baptista, M., and Trieu-Cuot, P. (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106.
Pritzlaff, C.A., Chang, J.C., Kuo, S.P., Tamura, G.S., Rubens, C.E., and Nizet, V. (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247.
Quentin, R., Huet, H., Wang, F.S., Geslin, P., Goudeau, A., and Selander, R.K. (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581.
Rubens, C.E., Smith, S., Hulse, M., Chi, E.Y., and van Belle, G. (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163.
Schuchat, A. (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513.
Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.
Spellerberg, B. (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., Schnitzler, N., Lutticken, R., and Podbielski, A. (1999) Lmb: a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878.
Spellerberg, B., Rozdzinski, E., Martin, S., Weber-Heynemann, J., and Lutticken, R. (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440.
Squires, C., and Squires, C.L. (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085.
Takahashi, S., Aoyagi, Y., Adderson, E.E., Okuwaki, Y., and Bohnsack, J.F. (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870.
Takahashi, Y., Konishi, K., Cisar, J.O., and Yoshikawa, M. (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218.
Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
Tong, H.H., Blue, L.E., James, M.A., and DeMaria, T.F. (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924.
Wastfelt, M., Stalhammar-Carlemalm, M., Delisse, A.M., Cabezon, T., and Lindahl, G. (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897.
Xu, S.Y., and Fomenkov, A. (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57.
Yamamoto, S., Miyake, K., Koike, Y., Watanabe, M., Machida, Y., Ohta, M., and Iijima, S. (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184.
1510 P Glaser et al
copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513
septicaemia was chosen from our laboratory collectionNEM-lj107 is a serotype Ia strain isolated from urine at theHocircpital Necker-Enfants Malades For the construction of theBAC library the large insert library and the shotgun librariesthe following plasmids and recipient strains were usedpBeloBac11 and strain DH10B pSYX34 and XL10-blueKanR (Stratagene) and pcDNA21 (Invitrogen) and XL2-blueEscherichia coli and S agalactiae strains were grown inLuriandashBertani and brainndashheart infusion medium respectively
Sequencing and assembly methods
Genome sequencing was performed using the wholegenome shotgun strategy (Fleischmann et al 1995) asdescribed by (Frangeul et al 1999) Two libraries (1ndash2 kband 2ndash3 kb inserts) were generated by random mechanicalshearing of genomic DNA and cloning into pcDNA-21 (Invit-rogen) A scaffold was obtained by end-sequencing clonesfrom a BAC library with an average fragment size of 50ndash90 kb and of a medium size insert library (5ndash10 kb) in thelow copy number vector pSYX34 (Xu and Fomenkov 1994)Recombinant plasmids were used as templates for cyclesequencing reactions consisting in 35 cycles (96infinC for 30 s50infinC for 15 s 60infinC for 4 min) in a thermocycler Sampleswere precipitated and loaded onto a 96-lane capillary auto-matic 3700 DNA sequencer (Applied Biosystems) In aninitial steps 34 431 sequences from the four librarieswere assembled into 230 contigs using the PhredPhrapConsed software (Ewing AND Green 1998 Gordon et al1998) (sequence coverage 85-fold) CAAT-Box (L Frangeulunpublished) was used to predict links between contigsPolymerase chain reaction products amplified from NEM316chromosomal DNA as template were used to fill gaps and tore-sequence low quality regions using primers designed byConsed Physical gaps were closed by using combinatorialPCR The correctness of the assembly was confirmed byanalysing the BAC clone scaffold and by ensuring that thephysical map deduced from the genome sequence was iden-tical to the one obtained experimentally by pulsed field gelelectrophoresis
Amplification and sequencing of the att site of the circularform of pNEM316-1 was performed by PCR using the follow-ing oligonuleotide pair 5cent-CAATAACCATTCTGTAGATCCTTC-3cent and 5cent-TTTAACCAGTTCAAACGAAAGT-3cent Amplifica-tion and sequencing of the two chromosomal insertion sitesin NEM-lj107 was performed using the oligonucleotides5cent-TAAAAGGTTTTCTCAGAGTATTATCA-3cent and 5cent-TTTTCCTCTTAAGGGAGTAAGC-3cent (insertion 1) or 5cent-ATTCATGTCCATTCACGACC-3cent and 5cent-TCCCACTTCCATTCATAAACT-3cent(insertion 2)
Annotation methods
The CAAT-Box environment (L Frangeul unpublished) wasused for genome annotation Coding sequences (CDS) weredefined by combining Genemark predictions (Isono et al1994) with visual inspection of each open reading frame(ORF) for the presence of a start codon with an upstreamribosome binding site and BLASTP similarity searches on theNrprot database (Altschul et al 1997) The Genemark pre-
dictions were trained on a set of ORFs longer than 300codons encoding proteins similar to proteins with known func-tion present in public databases Initially only CDSs longerthan 80 codons were retained Subsequently all CDSsbetween 40 and 80 codons were searched using the samematrix but only those with a high coding probability as pre-dicted by Genmark were retained In a final step all inter-genic regions were searched for short or truncated genes byBLASTX comparisons with protein sequence libraries Limitsof the ribosomal RNA operons were identified by homologywith the other streptococcal genomes tRNAs were searchedby tRNAscan-SE (Lowe and Eddy 1997)
All predicted CDSs were examined visually Function pre-dictions were based on BLASTP similarity searches and on theanalysis of motifs using the PFAM databases Toppred2 wasused to identify transmembrane domains (Claros and vonHeijne 1994) SignalP vs20 was used to predict signal pep-tide regions (Nielsen et al 1999) and the Petrin algorithmwas used to predict transcriptional terminators (drsquoAubentonCarafa et al 1990) Lipoproteins were defined as proteinscontaining a lipoprotein modificationprocessing motif(Hayashi and Wu 1990) and a signal sequence identified bySignalP vs20 Secreted proteins were defined as proteinscontaining a signal peptide (predicted by SignalP vs20)but no other transmembrane domain Each protein wasmanually inspected Orthologs between S agalactiae and Spyogenes S pneumoniae or L lactis were defined as genesshowing bi-directional best-hits by S agalactiae proteomeBLASTP comparisons The threshold was set to a minimum of50 sequence identity and a ratio of 08ndash125 of the proteinlength
The genome sequence and the annotation are accessi-ble via a Sybase relational database constructed accordingto the SubtiList model (Moszer et al 2002) at httpgenolistpasteurfrSagaList
Acknowledgements
We thank Tim Stinear for critical reading of the manuscriptand Louis Malin Jones for the construction of the SagaListdatabase Financial support from the Institut Pasteur(Programme Transversal de Recherche ninfin17) CNRSURA 2171 ltltGeacuteneacutetique des Geacutenomesgtgt and Ministegraverede la Recherche (Reacuteseaux des Geacutenopoles) is gratefullyacknowledged
References
Altschul SF Madden TL Schaffer AA Zhang JZhang Z Miller W and Lipman DJ (1997) GappedBLAST and PSI-BLAST a new generation of protein data-base search programs Nucleic Acids Res 25 3389ndash3402
Argos P Landy A Abremski K Egan JB Haggard-Ljungquist E Hoess RH et al (1986) The integrasefamily of site-specific recombinases regional similaritiesand global diversity EMBO J 5 433ndash440
drsquoAubenton Carafa Y Brody E and Thermes C (1990)Prediction of rho-independent Escherichia coli transcrip-tion terminators A statistical analysis of their RNA stem-loop structures J Mol Biol 216 835ndash858
Genome sequence of Streptococcus agalactiae 1511
copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513
Becker ID Robinson OM Bazan TS Lopez-Osuna Mand Kretschmer RR (1981) Bactericidal capacity of new-born phagocytes againts group B beta-hemolytic strepto-cocci Infect Immun 34 535ndash539
Beckert S, Kreikemeyer B and Podbielski A (2001) Group A streptococcal rofA gene is involved in the control of several virulence genes and eukaryotic cell attachment and internalization. Infect Immun 69: 534–537
Bierne H, Mazmanian SK, Trost M, Pucciarelli MG, Liu G, Dehoux P, et al (2002) Inactivation of the srtA gene in Listeria monocytogenes inhibits anchoring of surface proteins and affects virulence. Mol Microbiol 43: 869–881
Blumberg HM, Stephens DS, Modansky M, Erwin M, Elliot J, Facklam RR, et al (1996) Invasive group B streptococcal disease: the emergence of serotype V. J Infect Dis 173: 365–373
Bolken TC, Franke CA, Jones KF, Zeller GO, Jones CH, Dutton EK and Hruby DE (2001) Inactivation of the srtA gene in Streptococcus gordonii inhibits cell wall anchoring of surface proteins and decreases in vitro and in vivo adhesion. Infect Immun 69: 75–80
Bolotin A, Wincker P, Mauger S, Jaillon O, Malarme K, Weissenbach J, Ehrlich SD and Sorokin A (2001) The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res 11: 731–753
Brunskill EW and Bayles KW (1996) Identification and molecular characterization of a putative regulatory locus that affects autolysis in Staphylococcus aureus. J Bacteriol 178: 611–618
Chaffin DO, Beres SB, Yim HH and Rubens CE (2000) The serotype of type Ia and III group B streptococci is determined by the polymerase gene within the polycistronic capsule operon. J Bacteriol 182: 4466–4477
Chaussee MS, Sylva GL, Sturdevant DE, Smoot LM, Graham MR, Watson RO and Musser JM (2002) Rgg Influences the Expression of Multiple Regulatory Loci To Coregulate Virulence Factor Expression in Streptococcus pyogenes. Infect Immun 70: 762–770
Chhatwal GS (2002) Anchorless adhesins and invasins of Gram-positive bacteria: a new class of virulence factors. Trends Microbiol 10: 205–208
Chmouryguina I, Suvorov A, Ferrieri P and Cleary PP (1996) Conservation of the C5a peptidase genes in group A and B streptococci. Infect Immun 64: 2387–2390
Claros MG and von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686
D'Mello R, Hill S and Poole RK (1996) The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology 142: 755–763
Edwards MS and Baker CJ (1995) Streptococcus agalactiae (Group B Streptococcus). In Mandell, Douglas and Bennett's Principles and Practice of Infectious Diseases, 4th edn. Mandell GL, Bennett JE and Dolin R (eds). New York: Churchill Livingstone, pp 1835–1845
Ewing B and Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8: 186–194
Farley MM (2001) Group B streptococcal disease in non-pregnant adults. Clin Infect Dis 33: 556–561
Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, et al (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc Natl Acad Sci USA 98: 4658–4663
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage JM, et al (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512
Frangeul L, Nelson KE, Buchrieser C, Danchin A, Glaser P and K (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145: 2625–2634
Franken C, Haase G, Brandt C, Weber-Heynemann J, Martin S, Lammler C, et al (2001) Horizontal gene transfer and host specificity of beta-haemolytic streptococci: the role of a putative composite transposon containing scpB and Lmb. Mol Microbiol 41: 925–935
Gaillot O, Bregenholt S, Jaubert F, Di Santo JP and Berche P (2001) Stress-induced ClpP serine protease of Listeria monocytogenes is essential for induction of listeriolysin O-dependent protective immunity. Infect Immun 69: 4938–4943
Garandeau C, Reglier-Poupet H, Dubail I, Beretti JL, Berche P and Charbit A (2002) The Sortase SrtA of Listeria monocytogenes Is Involved in Processing of Internalin and in Virulence. Infect Immun 70: 1382–1390
Gibson RL, Lee MK, Soderland C, Chi EY and Rubens CE (1993) Group B streptococci invade endothelial cells: type III capsular polysaccharide attenuates invasion. Infect Immun 61: 478–485
Gordon D, Abajian C and Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202
Gravekamp C, Horensky DS, Michel JL and Madoff LC (1996) Variation in repeat number within the alpha C protein of group B streptococci alters antigenicity and protective epitopes. Infect Immun 64: 3576–3583
Guenzi E, Gasc AM, Sicard MA and Hakenbeck R (1994) A two-component signal-transducing system is involved in competence and penicillin susceptibility in laboratory mutants of Streptococcus pneumoniae. Mol Microbiol 12: 505–515
Hacker J, Blum-Oehler G, Mühldorfer I and Tschäpe H (1997) Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol 23: 1089–1097
Hayashi S and Wu HC (1990) Lipoproteins in bacteria. J Bioenerg Biomembr 22: 451–471
Heden LO, Frithz E and Lindahl G (1991) Molecular characterization of an IgA receptor from group B streptococci: sequence of the gene, identification of a proline-rich region with unique structure and isolation of N-terminal fragments with IgA-binding capacity. Eur J Immunol 21: 1481–1490
Hoch JA (2000) Two-component and phosphorelay signal transduction. Curr Opin Microbiol 3: 165–170
Holmes AR, McNab R, Millsap KW, Rohde M, Hammerschmidt S, Mawdsley JL and Jenkinson HF (2001) The pavA gene of Streptococcus pneumoniae encodes a fibronectin-binding protein that is essential for virulence. Mol Microbiol 41: 1395–1408
Hoskins J, Alborn WE Jr, Arnold J, Blaszczak LC, Burgett S, DeHoff BS, et al (2001) Genome of the bacterium Streptococcus pneumoniae strain R6. J Bacteriol 183: 5709–5717
Ilangovan U, Ton-That H, Iwahara J, Schneewind O and Clubb RT (2001) Structure of sortase, the transpeptidase that anchors proteins to the cell wall of Staphylococcus aureus. Proc Natl Acad Sci USA 98: 6056–6061
Isono K, McIninch JD and Borodovsky M (1994) Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark. DNA Res 1: 263–269
Jones AL, Knoll KM and Rubens CE (2000) Identification of Streptococcus agalactiae virulence genes in the neonatal rat sepsis model using signature-tagged mutagenesis. Mol Microbiol 37: 1444–1455
Keefe GP (1997) Streptococcus agalactiae mastitis: a review. Can Vet J 38: 429–437
Kong F, Gowan S, Martin D, James G and Gilbert GL (2002) Molecular profiles of group B streptococcal surface protein antigen genes: relationship to molecular serotypes. J Clin Microbiol 40: 620–626
Krem MM and Di Cera E (2001) Molecular markers of serine protease evolution. EMBO J 20: 3036–3045
Lachenauer CS, Creti R, Michel JL and Madoff LC (2000) Mosaicism in the alpha-like protein genes of group B streptococci. Proc Natl Acad Sci USA 97: 9630–9635
Lancefield RC and Hare R (1935) The serological differentiation of pathogenic and non-pathogenic strains of hemolytic streptococci from parturient women. J Exp Med 61: 335–349
Lau GW, Haataja S, Lonetto M, Kensit SE, Marra A, Bryant AP, et al (2001) A functional genomic analysis of type 3 Streptococcus pneumoniae virulence. Mol Microbiol 40: 555–571
Levin JC and Wessels MR (1998) Identification of csrR/csrS, a genetic locus that regulates hyaluronic acid capsule synthesis in group A Streptococcus. Mol Microbiol 30: 209–219
Lin B, Averett WF, Novak J, Chatham WW, Hollingshead SK, Coligan JE, et al (1996) Characterization of PepB, a group B streptococcal oligopeptidase. Infect Immun 64: 3401–3406
Lowe TM and Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964
Mazmanian SK, Liu G, Jensen ER, Lenoy E and Schneewind O (2000) Staphylococcus aureus sortase mutants defective in the display of surface proteins and in the pathogenesis of animal infections. Proc Natl Acad Sci USA 97: 5510–5515
Mei JM, Nourbakhsh F, Ford CW and Holden DW (1997) Identification of Staphylococcus aureus virulence genes in a murine model of bacteraemia using signature-tagged mutagenesis. Mol Microbiol 26: 399–407
Michel JL, Madoff LC, Olson K, Kling DE, Kasper DL and Ausubel FM (1992) Large, identical, tandem repeating units in the C protein alpha antigen gene, bca, of group B streptococci. Proc Natl Acad Sci USA 89: 10060–10064
Mickelson MN (1972) Glucose degradation, molar growth yields, and evidence for oxidative phosphorylation in Streptococcus agalactiae. J Bacteriol 109: 96–105
Miller RA and Britigan BE (1997) Role of oxidants in microbial pathophysiology. Clin Microbiol Rev 10: 1–18
Milligan TW, Doran TI, Straus DC and Mattingly SJ (1978) Growth and amino acid requirements of various strains of group B streptococci. J Clin Microbiol 7: 28–33
Molinari G, Rohde M, Talay SR, Chhatwal GS, Beckert S and Podbielski A (2001) The role played by the group A streptococcal negative regulator Nra on bacterial interactions with epithelial cells. Mol Microbiol 40: 99–114
Moszer I, Jones LM, Moreira S, Fabry C and Danchin A (2002) SubtiList: the reference database for the Bacillus subtilis genome. Nucl Acids Res 30: 62–65
Musser JM, Mattingly SJ, Quentin R, Goudeau A and Selander RK (1989) Identification of a high-virulence clone of type III Streptococcus agalactiae (group B Streptococcus) causing invasive neonatal disease. Proc Natl Acad Sci USA 86: 4731–4735
Nair S, Frehel C, Nguyen L, Escuyer V and Berche P (1999) ClpE, a novel member of the HSP100 family, is involved in cell division and virulence of Listeria monocytogenes. Mol Microbiol 31: 185–196
Nair S, Milohanic E and Berche P (2000) ClpC ATPase is required for cell adhesion and invasion of Listeria monocytogenes. Infect Immun 68: 7061–7068
Navarre WW and Schneewind O (1999) Surface proteins of gram-positive bacteria and mechanisms of their targeting to the cell wall envelope. Microbiol Mol Biol Rev 63: 174–229
Nielsen H, Brunak S and von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9
Nizet V and Rubens CE (2000) Pathogenic mechanisms and virulence factors of Group B streptococci. In Gram-positive pathogens. Fischetti VA, Novick RP, Ferretti JJ, Portnoy DA and Rood JI (eds). Washington DC: American Society for Microbiology Press
Noel GJ, Katz SL and Edelson PJ (1991) The role of C3 in mediating binding and ingestion of group B Streptococcus serotype III by murine macrophages. Pediatr Res 30: 118–123
Novak R, Charpentier E, Braun JS and Tuomanen E (2000) Signal transduction by a death signal peptide: uncovering the mechanism of bacterial killing by penicillin. Mol Cell 5: 49–57
Osaki M, Takamatsu D, Shimoji Y and Sekizaki T (2002) Characterization of Streptococcus suis genes encoding proteins homologous to sortase of Gram-positive bacteria. J Bacteriol 184: 971–982
Pallen MJ, Lam AC, Antonio M and Dunbar K (2001) An embarrassment of sortases – a richness of substrates? Trends Microbiol 9: 97–102
Perna NT, Plunkett G 3rd, Burland V, Mau B, Glasner JD, Rose DJ, et al (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409: 529–533
Petit CM, Brown JR, Ingraham K, Bryant AP and Holmes DJ (2001) Lipid modification of prelipoproteins is dispensable for growth in vitro but essential for virulence in Streptococcus pneumoniae. FEMS Microbiol Lett 200: 229–233
Polissi A, Pontiggia A, Feger G, Altieri M, Mottl H, Ferrari L and Simon D (1998) Large-scale identification of virulence genes from Streptococcus pneumoniae. Infect Immun 66: 5620–5629
Poyart C, Pellegrini E, Gaillot O, Boumaila C, Baptista M and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulence of Streptococcus agalactiae. Infect Immun 69: 5098–5106
Pritzlaff CA, Chang JC, Kuo SP, Tamura GS, Rubens CE and Nizet V (2001) Genetic basis for the beta-haemolytic/cytolytic activity of group B Streptococcus. Mol Microbiol 39: 236–247
Quentin R, Huet H, Wang FS, Geslin P, Goudeau A and Selander RK (1995) Characterization of Streptococcus agalactiae strains by multilocus enzyme genotype and serotype: identification of multiple virulent clone families that cause invasive neonatal disease. J Clin Microbiol 33: 2576–2581
Rubens CE, Smith S, Hulse M, Chi EY and van Belle G (1992) Respiratory epithelial cell invasion by group B streptococci. Infect Immun 60: 5157–5163
Schuchat A (1998) Epidemiology of group B streptococcal disease in the United States: shifting paradigms. Clin Microbiol Rev 11: 497–513
Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, et al (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673
Spellerberg B (2000) Pathogenesis of neonatal Streptococcus agalactiae infections. Microbes Infect 2: 1733–1742
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J, Schnitzler N, Lutticken R and Podbielski A (1999) Lmb, a protein with similarities to the LraI adhesin family, mediates attachment of Streptococcus agalactiae to human laminin. Infect Immun 67: 871–878
Spellerberg B, Rozdzinski E, Martin S, Weber-Heynemann J and Lutticken R (2002) rgf encodes a novel two-component signal transduction system of Streptococcus agalactiae. Infect Immun 70: 2434–2440
Squires C and Squires CL (1992) The Clp proteins: proteolysis regulators or molecular chaperones? J Bacteriol 174: 1081–1085
Takahashi S, Aoyagi Y, Adderson EE, Okuwaki Y and Bohnsack JF (1999) Capsular sialic acid limits C5a production on type III group B streptococci. Infect Immun 67: 1866–1870
Takahashi Y, Konishi K, Cisar JO and Yoshikawa M (2002) Identification and characterization of hsa, the gene encoding the sialic acid-binding adhesin of Streptococcus gordonii DL1. Infect Immun 70: 1209–1218
Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, Peterson S, et al (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506
Tong HH, Blue LE, James MA and DeMaria TF (2000) Evaluation of the virulence of a Streptococcus pneumoniae neuraminidase-deficient mutant in nasopharyngeal colonization and development of otitis media in the chinchilla model. Infect Immun 68: 921–924
Wastfelt M, Stalhammar-Carlemalm M, Delisse AM, Cabezon T and Lindahl G (1996) Identification of a family of streptococcal surface proteins with extremely repetitive structure. J Biol Chem 271: 18892–18897
Xu SY and Fomenkov A (1994) Construction of pSC101 derivatives with Camr and Tetr for selection or LacZ′ for blue/white screening. Biotechniques 17: 57
Yamamoto S, Miyake K, Koike Y, Watanabe M, Machida Y, Ohta M and Iijima S (1999) Molecular characterization of type-specific capsular polysaccharide biosynthesis genes of Streptococcus agalactiae type Ia. J Bacteriol 181: 5176–5184
Nair S Milohanic E and Berche P (2000) ClpC ATPaseis required for cell adhesion and invasion of Listeria mono-cytogenes Infect Immun 68 7061ndash7068
Navarre WW and Schneewind O (1999) Surface proteinsof gram-positive bacteria and mechanisms of their target-ing to the cell wall envelope Microbiol Mol Biol Rev 63174ndash229
Nielsen H Brunak S and von Heijne G (1999) Machinelearning approaches for the prediction of signal peptidesand other protein sorting signals Protein Eng 12 3ndash9
Nizet V and Rubens CE (2000) Pathogenic mechanismsand virulence factors of Group B streptococci In Gram-positive pathogens Fischetti VA Novick RP FerrettiJJ Portnoy DA and Rood JI (eds) Washington DCAmerican Society for Microbiology Press
Noel GJ Katz SL and Edelson PJ (1991) The role ofC3 in mediating binding and ingestion of group B Strepto-coccus serotype III by murine macrophages Pediatr Res30 118ndash123
Novak R Charpentier E Braun JS and Tuomanen E(2000) Signal transduction by a death signal peptideuncovering the mechanism of bacterial killing by penicillinMol Cell 5 49ndash57
Osaki M Takamatsu D Shimoji Y and Sekizaki T(2002) Characterization of Streptococcus suis genesencoding proteins homologous to sortase of Gram-positivebacteria J Bacteriol 184 971ndash829
Pallen MJ Lam AC Antonio M and Dunbar K (2001)An embarrassment of sortases ndash a richness of substratesTrends Microbiol 9 97ndash102
Perna NT Plunkett G 3rd Burland V Mau B GlasnerJD Rose DJ et al (2001) Genome sequence of entero-haemorrhagic Escherichia coli O157 H7 Nature 409529ndash533
Petit CM Brown JR Ingraham K Bryant AP and
Genome sequence of Streptococcus agalactiae 1513
copy 2002 Blackwell Science Ltd Molecular Microbiology 45 1499ndash1513
Holmes DJ (2001) Lipid modification of prelipoproteins isdispensable for growth in vitro but essential for virulencein Streptococcus pneumoniae FEMS Microbiol Lett 200229ndash233
Polissi A Pontiggia A Feger G Altieri M Mottl HFerrari L and Simon D (1998) Large-scale identificationof virulence genes from Streptococcus pneumoniae InfectImmun 66 5620ndash5629
Poyart C Pellegrini E Gaillot O Boumaila C BaptistaM and Trieu-Cuot P (2001) Contribution of Mn-cofactored superoxide dismutase (SodA) to the virulenceof Streptococcus agalactiae Infect Immun 69 5098ndash5106
Pritzlaff CA Chang JC Kuo SP Tamura GSRubens CE and Nizet V (2001) Genetic basis for thebeta-haemolyticcytolytic activity of group B StreptococcusMol Microbiol 39 236ndash247
Quentin R Huet H Wang FS Geslin P Goudeau Aand Selander RK (1995) Characterization of Streptococ-cus agalactiae strains by multilocus enzyme genotype andserotype identification of multiple virulent clone familiesthat cause invasive neonatal disease J Clin Microbiol 332576ndash2581
Rubens CE Smith S Hulse M Chi EY and van BelleG (1992) Respiratory epithelial cell invasion by group Bstreptococci Infect Immun 60 5157ndash5163
Schuchat A (1998) Epidemiology of group B streptococcaldisease in the United States shifting paradigms ClinMicrobiol Rev 11 497ndash513
Smoot JC Barbian KD Van Gompel JJ Smoot LMChaussee MS Sylva GL et al (2002) Genomesequence and comparative microarray analysis of serotypeM18 group A Streptococcus strains associated with acuterheumatic fever outbreaks Proc Natl Acad Sci USA 994668ndash4673
Spellerberg B (2000) Pathogenesis of neonatal Streptococ-cus agalactiae infections Microbes Infect 2 1733ndash1742
Spellerberg B Rozdzinski E Martin S Weber-Heynemann J Schnitzler N Lutticken R and
Podbielski A (1999) Lmb a protein with similarities to theLraI adhesin family mediates attachment of Streptococcusagalactiae to human laminin Infect Immun 67 871ndash878
Spellerberg B Rozdzinski E Martin S Weber-Heynemann J and Lutticken R (2002) rgf encodes anovel two-component signal transduction system ofStreptococcus agalactiae Infect Immun 70 2434ndash2440
Squires C and Squires CL (1992) The Clp proteins pro-teolysis regulators or molecular chaperones J Bacteriol174 1081ndash1085
Takahashi S Aoyagi Y Adderson EE Okuwaki Y andBohnsack JF (1999) Capsular sialic acid limits C5a pro-duction on type III group B streptococci Infect Immun 671866ndash1870
Takahashi Y Konishi K Cisar JO and Yoshikawa M(2002) Identification and characterization of hsa the geneencoding the sialic acid-binding adhesin of Streptococcusgordonii DL1 Infect Immun 70 1209ndash1218
Tettelin H Nelson KE Paulsen IT Eisen JA ReadTD Peterson S et al (2001) Complete genomesequence of a virulent isolate of Streptococcus pneumo-niae Science 293 498ndash506
Tong HH Blue LE James MA and DeMaria TF(2000) Evaluation of the virulence of a Streptococcuspneumoniae neuraminidase- deficient mutant in nasopha-ryngeal colonization and development of otitis media in thechinchilla model Infect Immun 68 921ndash924
Wastfelt M Stalhammar-Carlemalm M Delisse AMCabezon T and Lindahl G (1996) Identification of afamily of streptococcal surface proteins with extremelyrepetitive structure J Biol Chem 271 18892ndash18897
Xu SY and Fomenkov A (1994) Construction of pSC101derivatives with Camr and Tetr for selection or LacZcent forbluewhite screening Biotechniques 17 57
Yamamoto S Miyake K Koike Y Watanabe MMachida Y Ohta M and Iijima S (1999) Molecularcharacterization of type-specific capsular polysaccharidebiosynthesis genes of Streptococcus agalactiae type Ia JBacteriol 181 5176ndash5184