Comparative genomics of phages and prophages in lactic acid bacteria


Antonie van Leeuwenhoek 82: 73–91, 2002. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.


Frank Desiere∗, Sacha Lucchini1, Carlos Canchaya, Marco Ventura & Harald Brüssow
Nestlé Research Center, Nestec Ltd., Vers-chez-les-Blanc, CH Lausanne 26, Switzerland (∗Author for correspondence; E-mail: [email protected])

Key words: phage genomics, prophages, classification

Abstract

Comparative phage genomics has become possible due to the availability of more than 100 complete phage genome sequences and the development of powerful bioinformatics tools. This technology, profiting from classical molecular-biology knowledge, has opened avenues of research for topics that were difficult to address in the past. Now, it is possible to retrace part of the evolutionary history of phage modules by comparative genomics. The diagnosis of relatedness is not based on sequence similarity alone, but includes topological considerations of genome organization. Detailed transcription maps have allowed in silico predictions of genome organization to be verified and refined. This comparative knowledge is providing the basis for a new taxonomic classification concept for bacteriophages infecting low G+C-content Gram-positive bacteria based on the genetic organization of the structural gene module. An Sfi21-like and an Sfi11-like genus of Siphoviridae are proposed. The gene maps of many phages show remarkable synteny in their structural genes, defining a lambda super-group within Siphoviridae. A hierarchy of relatedness within the lambda super-group suggests elements of vertical evolution in Siphoviridae. Tailed phages are the result of both vertical and horizontal evolution and are thus fascinating objects for the study of molecular evolution. Prophage sequences integrated into the genomes of their bacterial host present theoretical challenges for evolutionary biologists. Prophages represent up to 10% of the genome in some LAB. In pathogenic streptococci prophages confer genes of selective value for the lysogenic cell. The lysogenic conversion genes are located between the lysin gene and the right phage attachment site. Non-attributed genes were found at the same genome position of prophages from lactic streptococci. These genes belong to the few prophage genes transcribed in the lysogen. Prophages from dairy bacteria might therefore also contribute to the evolutionary fitness of non-pathogenic LAB.

Introduction

Due to their economic importance for the dairy industry, there are currently more complete genome sequences for phages of Streptococcus thermophilus or Lactococcus lactis than for lambdoid coliphages. Dairy phages are currently the best-investigated phage system with respect to genome data (Brüssow & Hendrix 2002). The complete genome sequence of bacteriophage lambda has been extraordinarily important for the study of fundamental biological questions. Phage genomics has the potential to be as important for fundamental problems in genomics, since phage genomics represents all problems that are currently discussed in bacterial genomics: unity or diversity of origin, vertical versus horizontal gene transfer, non-orthologous gene displacement, tree versus web-like phylogeny, synteny versus instability of gene order, gene-splitting versus domain accretion (Koonin et al. 2000). Bacteriophages have an important impact on bacterial genomes and their evolution. In fact, many sequenced bacterial genomes contain prophages, some of which contribute 3–10% of the chromosomal gene content of the host bacterium (Brüssow & Hendrix 2002). The scientific value of these prophage sequences is potentially great, but detailed analysis in the literature is scarce. Thorough analysis is needed to elucidate the role of prophages in the context of bacterial genomes. A major question is whether prophages contribute to the fitness of the bacterial host. Discussion of this subject can profit from the knowledge accumulated from pathogenic streptococci. Due to their clinical relevance, research regarding virulence in this group of organisms has been extensive and prophages have been investigated for their potential to carry and disseminate virulence genes (Weeks & Ferretti 1984; Goshorn & Schlievert 1989; McShan et al. 1997). In the current contribution, we review phage and prophage genomics for lactic acid bacteria.

1 Present address: Institute of Food Research, Norwich Research Park, Norwich, NR4 7UA, United Kingdom

Figure 1. Alignment of the genetic maps of four distinct L. lactis phage species represented by phages c2, sk1, BK5-T and r1t. Transcription maps are indicated with arrows; E: early; M: middle; L: late transcripts. To facilitate comparison, the genome maps from sk1, c2 and r1t are rearranged with respect to the GenBank entry. Corresponding genes are marked with the same colour.

Genomes of lactic acid bacteriophages

Lactococcus lactis phages

Braun et al. (1989) differentiated five types of isometric head Siphoviridae according to head size and tail structure and two types of prolate headed phages. Most phages had DNA genomes ranging from 20 to 55 kb and defined seven slightly overlapping DNA homology groups. Interlaboratory phage comparisons led to the definition of twelve lactococcal phage species (Jarvis et al. 1991). However, industrial phage ecology is dominated by just three phage species (Prevots et al. 1990). The most abundant isolates were virulent small isometric head Siphoviridae of the 936-species (‘sk1-like’ phages, Figure 1), representing about half of all isolates. Another quarter of the isolates were classified into the c2-species of virulent lactococcal phages, Siphoviridae with prolate heads. A further quarter of the isolates is represented by small isometric phages of the P335 species. The P335 species is heterogeneous and comprises virulent and temperate phages and phages with two different DNA packaging mechanisms, i.e., cos-site and pac-site phages. The definition of this species can be questioned (see below).

Representatives from each of the three abundant lactococcal phage species have been completely sequenced. The phages sk1 and bIL170, members of the 936 phage species, share >80% nt sequence identity and differ from each other mainly by point mutations and numerous small insertions/deletions (Figure 2). Phage sk1 has a 28.4-kb DNA genome with cohesive ends. The genome is subdivided into two segments (Chandry et al. 1997). The leftmost 16 kb encode the structural gene cluster followed by the lysis module. The structural genes showed a gene order very similar to that of phage BK5-T (Figure 1), a temperate cos-site phage and the type phage of a minor lactococcal phage species. However, no sequence similarity linked the two structural gene clusters. The early phage genes are encoded on the opposite strand, covering a major part of the right genome half of phage sk1. This head-to-head constellation of late and early genes is not found in any other dairy phage.

Figure 2. Dotplot matrix comparison calculated for the genome sequences of the lactococcal phages bIL285, bIL286, bIL309, Tuc2009, BK5-T, TP901-1, bIL67, c2, r1t, sk1 and bIL170. The left x-axis provides a scale in kb. The dotplot matrix was calculated using Dotter (Sonnhammer & Durbin, 1995).

Eight proteins from phage sk1 shared sequence similarity with proteins of phage c2, the type phage of the c2-species of lactococcal phages. Despite that similarity, phage c2 showed a clearly distinct genome organization and a shorter genome (22 kb) when compared to sk1 (Lubbers et al. 1995). The origin of replication, located 7 kb downstream of the left cos-site, divided the phage c2 genome into leftward transcribed early genes covering DNA replication and recombination functions and a larger rightward transcribed late gene cluster covering the lysis module and the structural genes. c2-like phages do not share nt sequence similarity with members of other lactococcal phage species. Phage bIL67 (Schouler et al. 1994), another member of the c2-species, could be aligned with c2 along the entire genome at the nt level (see also Figure 2). The c2-bIL67 comparison revealed gene replacements over the purported tail adsorption protein and three minor structural genes located at the right genome end (Lubbers et al. 1995). The two pac-site L. lactis phages, Tuc2009 (van Sinderen, van de Guchte, Seegers, and Fitzgerald, GenBank entry AF109874) and TP901-1 (Brondsted et al. 2001), are both members of the P335 species defined by phage taxonomists. They could be aligned over three quarters of their genomes (Figure 2), and the nt sequence identity frequently exceeded 90%. The similarity was especially marked over the late gene cluster. Here the alignment was only interrupted at a few gene positions (downstream of a minor head gene, an intron in the major head gene from Tuc2009, insertions/deletions of two likely tail fiber genes, domain replacement in a minor tail gene). In contrast, over the early gene cluster the two lactococcal phages could only be aligned over small genome regions.

Phage r1t, another member of the P335 lactococcal phage species, has a 33.3-kb genome with cohesive ends (van Sinderen et al., 1996) (Figure 1). The structural genes are located next to the cos-site and are followed by the lysis genes, the lysogeny module, the DNA replication modules and the transcriptional regulation region. The lysogeny genes are the only genes transcribed from the opposite strand. The structural gene cluster of r1t does not resemble in its organization any other group of dairy phages, while it shares significant protein sequence similarity with Siphoviridae from Mycobacteria (Lucchini et al. 1999a; Brüssow & Desiere 2001b). We propose that phage r1t be removed from the P335 species and that it be attributed to its own lactococcal phage species. As the current taxonomic phage system is primarily based on phage morphology, it seems logical to attribute a greater importance to the distinct structural gene module of phage r1t than to the fact that it shares early genes with lactococcal phages attributed to two distinct phage species (Tuc2009 and BK5-T).

Figure 3. Comparison of the DNA packaging, head, tail, tail fiber and lysis genes in the Lactobacillus johnsonii prophage Lj965 with the corresponding genes in pac-site phages from Streptococcus thermophilus (Sfi11), Lb. delbrueckii (LL-H), Lb. plantarum (phig1e) and Bacillus subtilis (SPP1). Listeria phage A118 is given as a further reference (Loessner et al. 2000). Open reading frames are indicated by arrows. The number below the arrow gives the length of the deduced protein in aa or the name attributed in the original publication. Corresponding genes in the five phages are indicated with the same colour code; unknown genes are indicated with gray arrows. Amino acid sequence identity between phage proteins is indicated with their percentage and BLAST E-value (expressed as decadic logarithmic exponent). Light blue shading links related proteins. Gray shading indicates similarities between proteins of bacteriophages that are not depicted next to each other. The phage Sfi11 genes are marked with putative functions deduced from previous analysis (Lucchini et al. 1999a).

The genetic diversity of the remaining lactococcal phage species is unexplored, but probably substantial. Phage φ111, for example, a member of the minor 949-species, is a Siphovirus with a large isometric head and a 110-kb genome (Prevots et al. 1990).

Lactobacillus phages

Lactobacillus phage sequence data are available from five distinct species of lactobacilli (Lb. delbrueckii, gasseri, plantarum, casei and johnsonii). No significant nucleotide sequence similarity was detected between Lactobacillus phages infecting distinct bacterial species (Desiere et al. 2000a,b). Apparently, relatively tight barriers prevent the transfer of phage genes across Lactobacillus species. The overall genomic organization of Lb. delbrueckii phage LL-H was identical to that of pac-site S. thermophilus phages belonging to the proposed Sfi11-like genus of Siphoviridae. Similarly organized pac-site phages were identified in other lactobacilli (Lb. plantarum phage phi-g1e, Lb. johnsonii prophage Lj965). Related phages were identified in Bacillus subtilis (phage SPP1) and Listeria monocytogenes (phage A118, Figure 3). Sequence similarity of the structural proteins defined two sublineages of pac-site Lactobacillus phages. One group is constituted by Lactobacillus phages LL-H and phi-g1e. The second group of pac-site Lactobacillus phages is constituted by Lb. johnsonii prophage Lj965, whose closest relative is S. thermophilus phage Sfi11. The pac-site B. subtilis phage SPP1 is a link between both subgroups since it shares small groups of adjacent genes with either group (Figure 3) (Desiere et al. 2000a,b).

Figure 4. Dotplot matrix calculated for the genome sequences of the S. thermophilus phages Sfi19, Sfi11, Sfi21, O1205, DT1, 7201. The comparison window was 50 bp, and the stringency was 30 bp.
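The dotplot comparisons referred to in this review can be illustrated with a minimal Python sketch. It assumes that a comparison window of 50 bp and a stringency of 30 bp mean that a dot is drawn wherever at least 30 of 50 aligned positions are identical. This reading, the function name dotplot and the toy sequences are assumptions made for illustration; the sketch is not the software used to produce the published figures.

```python
def dotplot(seq_a, seq_b, window=50, stringency=30):
    """Return (i, j) pairs where the windows starting at seq_a[i] and
    seq_b[j] share at least `stringency` identical positions.

    Assumption: 'stringency' is read as the minimum number of matching
    positions per window. Brute force, so only suitable for short inputs.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Toy example (hypothetical sequences): seq_b carries the first eight
# bases of seq_a shifted by two positions, which shows up as a diagonal.
seq_a = "ACGTTGCAAC"
seq_b = "TTACGTTGCA"
print(dotplot(seq_a, seq_b, window=4, stringency=4))
```

A run of hits along one diagonal corresponds to a colinear, similar region; gaps in the diagonal correspond to replaced or deleted segments. For whole phage genomes an indexed or heuristic method would be needed, since this version compares every window pair.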

Close relationships were demonstrated between the virulent and the temperate pac-site Lb. delbrueckii phages LL-H and mv4. Over the structural genes the two phages shared up to 92% nt sequence identity (Vasala et al. 1993). Interestingly, the virulent phage LL-H still contained a remnant phage integrase and phage attachment sites, demonstrating that it was recently derived from a temperate phage (Mikkonen et al. 1996a,b).

One complete and one partial genome sequence are currently available for the cos-site Lactobacillus phages adh (Altermann et al. 1999) and a2 (Garcia, unpublished). Numerous protein sequence similarities linked these Lactobacillus phages and cos-site Streptococcus phages. The similarity with streptococcal phages was especially marked for the DNA packaging, head and tail genes of Lb. gasseri phage adh (Altermann et al. 1999) and the DNA replication genes of Lb. casei phage a2 (Moscoso & Suarez 2000).

Further genetic diversity exists in Lactobacillus phages. For example, group c Lb. delbrueckii phages contain prolate head Siphoviridae with peculiar knob-like appendages along the tail (phage JCL1032) (Ravin, Raisanen, Alatossava unpublished). In addition, Myoviridae (phages with contractile tails) have been isolated from Lb. plantarum (Chennoufi, personal communication), one of which has been sequenced in our laboratory. Over its structural gene clusters it showed about 40% identity at the deduced protein sequence level with Myovirus A511 from Listeria monocytogenes, but no detectable DNA sequence similarity.

Figure 5. Genome comparison of the temperate pac-site S. thermophilus phage O1205, the virulent pac-site S. thermophilus phage Sfi11 and the virulent cos-site S. thermophilus phage Sfi19. For phage O1205, orfs are indicated according to the references (Stanley et al. 1997; Lucchini et al. 1999c). Probable gene functions are indicated; the phage genomes have been divided into functional units according to comparative analysis (Lucchini et al. 1999c). Genes belonging to the same module are indicated with the same colour. Areas of blue shading indicate regions of major sequence difference.

Streptococcus thermophilus phages

Streptococcus thermophilus phages are an unusually homogeneous group. All S. thermophilus phages possess the same basic morphology (B1 Siphoviridae) and share extensive DNA sequence similarity (Figure 4). Le Marrec et al. (1997) differentiated S. thermophilus phages on the basis of DNA packaging mechanisms (cos-site vs. pac-site phages) and protein composition into two phage species, which confirmed the classification done by DNA-hybridization and serology (Brüssow et al. 1994). Six S. thermophilus phage genomes have been completely sequenced. Based on a dotplot analysis the S. thermophilus phage genome can be subdivided into four large segments (Figure 4). One segment is the late gene cluster extending from the DNA packaging genes to the tail genes. This module is represented by two unrelated configurations: one is characteristic for cos-site phages (e.g., phage Sfi19), the other for pac-site phages (prototype: phage Sfi11) (Lucchini et al. 1999a) (Figure 5). The two structural gene clusters are not related to each other at the nucleotide or protein sequence level. Both clusters diversify by the accumulation of point mutations (Lucchini et al. 1999a,b). A second segment of S. thermophilus phage genomes covers the putative tail fiber, lysis and lysogeny genes. Diversity is created by insertion, deletion and replacement of DNA segments and to a lesser degree by point mutations (Lucchini et al. 1999c). In the lysogeny module recombination processes apparently underlie the acquisition of different types of superinfection immunity and repressor binding specificity in the genetic switch region (Neve et al. 1998) (Figure 6). When the lysogeny modules from two temperate S. thermophilus phages were aligned, an alternation of conserved and variable DNA segments was observed (Neve et al. 1998). Some transition zones were exactly at gene borders (integrase/superinfection immunity). Others were in the middle of genes, separating protein domains (repressor and antirepressor). Deletions in spontaneous phage mutants were located near these transition zones (Bruttin et al. 1997) (Figure 6).
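The alternation of conserved and variable segments described for the aligned lysogeny modules can be located with a sliding-window identity profile. The Python sketch below assumes two pre-aligned sequences of equal length; the window size, step, 80% threshold and function names are hypothetical choices for illustration and are not taken from the cited studies.

```python
def windowed_identity(aln_a, aln_b, window=100, step=50):
    """Percent identity in sliding windows along two pre-aligned,
    equal-length sequences. Gap columns ('-') count as mismatches."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for start in range(0, len(aln_a) - window + 1, step):
        columns = zip(aln_a[start:start + window], aln_b[start:start + window])
        same = sum(1 for x, y in columns if x == y and x != "-")
        profile.append((start, 100.0 * same / window))
    return profile


def transition_zones(profile, high=80.0):
    """Window start positions at which the identity profile crosses the
    `high` threshold (an illustrative cut-off) in either direction."""
    zones = []
    for (_, prev_pct), (start, pct) in zip(profile, profile[1:]):
        if (prev_pct >= high) != (pct >= high):
            zones.append(start)
    return zones


# Toy alignment (hypothetical): a conserved half followed by a variable half.
aln_a = "A" * 20 + "C" * 20
aln_b = "A" * 20 + "G" * 20
profile = windowed_identity(aln_a, aln_b, window=10, step=10)
print(profile)
print(transition_zones(profile))
```

Comparing the reported transition positions with annotated gene borders would then show whether a transition falls between genes or within a gene.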

Table 1. List of completely sequenced LAB bacteriophage genomes

| No. | Phage | Host | Genome size (bp) | Temperate/virulent | Phage type∗ | Reference |
|---|---|---|---|---|---|---|
| 1 | bIL67 | L. lactis | 22195 | V | B2; cos: c2-species | Schouler et al. (1994) |
| 2 | c2 | L. lactis | 22172 | V | B2; cos: c2-species | Lubbers et al. (1995) |
| 3 | sk1 | L. lactis | 28451 | V | B1; cos; 936-species | Chandry et al. (1997) |
| 4 | bIL170 | L. lactis | 31754 | V | B1; cos; 936-species | Crutz-Le Coq, A.M., Cesselin, B., Commissaire, J., Anba, J., Kyriakidis, S. and Chopin, M.C. (unpublished) |
| 5 | r1t | L. lactis | 33350 | T | B1; P335-species: cos | van Sinderen et al. (1996) |
| 6 | Tuc2009 | L. lactis | 38347 | T | B1; P335-species: pac | van Sinderen, D., van de Guchte, M., Seegers, J.F.M.L. and Fitzgerald, G.F. (unpublished); Desiere et al. (2001) |
| 7 | TP901-1 | L. lactis | 37667 | T | B1; P335-species: pac | Brondsted et al. (2001) |
| 8 | BK5-T | L. lactis | 40003 | T | B1; cos: BK5-T species | Mahanivong et al. (2001); Desiere et al. (2001) |
| 9 | LL-H | Lb. delbrueckii | 34659 | V | B1; pac group a | Mikkonen et al. (1996) |
| 11 | g1e | Lactobacillus plantarum | 42259 | T | B1; pac | Kodaira et al. (1997) |
| 12 | adh | Lactobacillus gasseri | 43785 | T | B1; cos | Altermann et al. (1999) |
| 13 | O1205 | S. thermophilus | 43075 | T | B1; pac: Sfi11-species | Stanley et al. (1997) |
| 14 | 7201 | S. thermophilus | 35466 | V | B1; cos: Sfi21-species | Stanley et al. (2000) |
| 15 | DT1 | S. thermophilus | 34820 | V | B1; cos: Sfi21-species | Tremblay & Moineau (1999) |
| 16 | Sfi21 | S. thermophilus | 40739 | T | B1; cos: Sfi21-species | Desiere et al. (1998) |
| 17 | Sfi19 | S. thermophilus | 37370 | V | B1; cos: Sfi21-species | Desiere et al. (1998) |
| 18 | Sfi11 | S. thermophilus | 39807 | V | B1; pac: Sfi11-species | Lucchini et al. (1999c) |
| 19 | Cp-1 | S. pneumoniae | 28451 | V | Podovirus | Martin et al. (1996) |
| 20 | MM1 | S. pneumoniae | 40264 | T | B1 | Obregon, V., Garcia, P., Garcia, E., Lopez, R. and Garcia, J.L. (unpublished) |

∗ B1: isometric head; B2, B3: prolate head Siphoviridae (phages with long non-contractile tail) / cos: cos-site, genome with cohesive ends; pac: pac-site, DNA packaging by the head-full mechanism / phage species designation where available. T: temperate phage; V: virulent phage.

Figure 6. Alignment of the lysogeny modules from the cos-site phage Sfi21 and the pac-site phage TP-J34 from S. thermophilus. The colored bars give the percentage of nt sequence identity between both sequences as defined in the colour code at the top of the figure. The predicted open reading frames are indicated with their codon length. The vertical lines indicate the transition zones from high to low sequence similarity. D1, D2: extent of two spontaneous deletions.

The third genome segment is the putative DNA replication module represented by two distinct gene constellations: the Sfi21-like and the 7201-like DNA replication module. The Sfi21-like DNA replication module is present in the vast majority of the isolated phages and is unusually conserved (Desiere et al. 1997). Even at the third codon position, < 1% sequence diversity was observed in independent phage isolates. The fourth S. thermophilus phage genome segment covers the rightmost 5 kb of the genome encoding early genes apparently involved in transcriptional regulation. Diversity is created by several insertions/deletions, a duplication and a few gene replacements, while the overall DNA sequence is highly conserved (Lucchini et al. 1999c).
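The conservation of the Sfi21-like DNA replication module at the third codon position can be made concrete with a small calculation. The Python sketch below assumes two coding sequences that are already aligned, in frame and free of gaps; the function name and the toy sequences are illustrative only and are not data from the cited study.

```python
def third_position_diversity(cds_a, cds_b):
    """Percent of third-codon positions that differ between two aligned,
    in-frame, gap-free coding sequences of equal length."""
    if not cds_a or len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("need equal-length, non-empty, in-frame sequences")
    third_positions = range(2, len(cds_a), 3)  # 0-based indices 2, 5, 8, ...
    differences = sum(1 for i in third_positions if cds_a[i] != cds_b[i])
    return 100.0 * differences / len(third_positions)


# Toy example (hypothetical sequences): codons ATG AAA GGT vs ATG AAG GGT
# differ at one of three third positions.
print(third_position_diversity("ATGAAAGGT", "ATGAAGGGT"))
```

Because third positions are the most likely to tolerate silent changes, a value below 1% across independent isolates indicates very recent common ancestry of the module.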

An unusual feature is the close genetic relationship between temperate and virulent S. thermophilus phages. Lytic phages dominate the S. thermophilus phage population. They are apparently derived from temperate phages by a combination of rearrangement and deletion events in the lysogeny module (Lucchini et al. 1999b). Recombination also plays a role in the creation of diversity in the putative tail fiber genes (Lucchini et al. 1998). Variable and conserved DNA segments were interspersed in the gene of the phage protein that probably interacts with the phage receptor on the bacterial cell (Lucchini et al. 1999c). Spontaneous deletions were observed that started and ended in DNA repeats of conserved regions encoding collagen-like protein motifs (Desiere et al. 1998). Phages that differed in host range showed completely unrelated variable regions, while phages with overlapping host ranges shared highly related variable regions. Swapping of the variable gene segments changed the host range of recombinant phages as predicted by the specificity of the variable region (Duplessis & Moineau 2001).

Genome comparison defines the Sfi21- and Sfi11-like genus of Siphoviridae

The comparison of cos-site temperate Siphoviridae infecting distinct genera of low GC content Gram-positive bacteria (Desiere et al. 2000) has yielded interesting insights into the evolution of the structural gene module. The compared sequences showed a gradient of relatedness, which approximately reflected the relatedness of their bacterial hosts, suggesting some co-evolution of phages with their host bacteria. The closest relatives of phage Sfi21 were other cos-site S. thermophilus phages, which shared > 80% DNA sequence identity with Sfi21, followed by Lactococcus phage BK5-T, which shared 60% DNA identity over the DNA packaging and head morphogenesis genes with Sfi21 (Desiere et al. 2001a,b). Lactobacillus phage adh showed up to 61% aa identity over the DNA packaging, head and tail morphogenesis gene products but no detectable DNA identity (Desiere et al. 2000). Bacillus phage phi-105 and Staphylococcus phage PVL shared 26 to 36% protein sequence identity and an identical modular genome organization with the Sfi21 DNA packaging-head-tail-tail fiber-lysis-lysogeny-DNA replication-transcriptional regulation modules (Figure 7). Similarly, an Sfi11-like genus of Siphoviridae was proposed, based on the extensive sequence similarities of phage Sfi11 with a number of pac-site phages from low GC content Gram-positive bacteria including Lactococcus phage TP901-1, Lactobacillus phage phig1e, Bacillus phage SPP1 and Listeria phage A118 (Desiere et al. 2000).

Figure 7. Alignment of the genetic maps of the Lactobacillus phage adh, Streptococcus phage Sfi21, Lactococcus phage BK5-T, Bacillus phage phi-105 and Staphylococcus aureus phage PVL. Genes encoding proteins that showed aa sequence similarity are linked by blue shading. Corresponding genes are marked with the same colour. The locations of the lysogeny and DNA replication modules are highlighted in yellow and green.
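Relatedness judged by genome topology rather than by sequence similarity can be approximated by comparing the order of functional modules along two genomes. The Python sketch below scores conservation of module order as the longest common subsequence of shared module labels. The Sfi21 module order is the one given in the text; the second order, the function names and the scoring scheme itself are hypothetical and are not the procedure used in the cited work. Each module label is assumed to occur once per genome.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def module_order_score(order_a, order_b):
    """Fraction of shared modules that appear in the same relative order
    (1.0 means the shared modules are perfectly colinear)."""
    shared = set(order_a) & set(order_b)
    if not shared:
        return 0.0
    a = [m for m in order_a if m in shared]
    b = [m for m in order_b if m in shared]
    return lcs_length(a, b) / len(shared)


# Module order of phage Sfi21 as described in the text.
SFI21 = ["DNA packaging", "head", "tail", "tail fiber", "lysis",
         "lysogeny", "DNA replication", "transcriptional regulation"]

# Hypothetical genome in which lysis and lysogeny modules are swapped.
REARRANGED = ["DNA packaging", "head", "tail", "tail fiber", "lysogeny",
              "lysis", "DNA replication", "transcriptional regulation"]

print(module_order_score(SFI21, SFI21))
print(module_order_score(SFI21, REARRANGED))
```

A score of this kind ignores strand orientation and sequence identity, so it complements rather than replaces the protein-level comparisons reported above.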

The Sfi11-like phages have been shown to differ from the Sfi21-like phages by the possession of two instead of one major head proteins, the presence of a scaffold protein and the lack of proteolytic processing of the major head protein.

Many of the established genera of Siphoviridae share with Sfi21- and Sfi11-like phages a similarly organized structural gene cluster, suggesting evolution from a common ancestor module (Figure 8). c2-like Siphoviridae (Jarvis et al. 1995) showed a distinct organization of their structural genes, but could still be aligned with the equivalent region of sk1-like phages if one allows for two rearrangement events (Desiere et al. 2001a,b). Sequence similarity was still detected between some phages from this super-group and allowed the distinction of at least five lineages of capsid genes represented by phages HK97/Sfi21/phiC31, λ/Sfi11, L5/TM4/r1t, sk1/c2 and ψM2. Notably, the archaeal virus ψM2 (Pfister et al. 1998) showed over its portal and major head proteins clear similarity with P2-like Myoviridae (contractile tail Caudovirales) from both Gram-negative and Gram-positive bacteria. P22-like Podoviridae (short tail Caudovirales) shared with the λ super-group of Siphoviridae a similarly organized capsid gene cluster (Eppler et al. 1991).

Siphoviridae genomes fall into two size classes: 121–134 kb (coliphage T5, Bacillus phage SPβ) and 22–56 kb (all other genera). The genome organization of phage SPβ (Lazarevic et al. 1999) shared no similarity with the smaller genomes of Siphoviridae, and the family may thus be polyphyletic. Most of the putative structural genes from SPβ are unrelated to the entries in the database, while part of the tail fiber genes shared similarity with defective prophages from Bacillus. Still other SPβ genes are present as a consequence of recent horizontal gene transfers from the Bacillus chromosome. This observation leads to the taxonomically unsatisfying situation that genomes of Siphoviridae from the λ super-group are more closely related to P22-like Podoviridae and P2-like Myoviridae (below) than to SPβ-like Siphoviridae. Apparently, the current taxonomic classification only partially reflects the natural relationships between tailed phages.
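The two genome size classes can be written down as a simple lookup, which is convenient when sorting a list of sequenced phages. The Python sketch below uses the class boundaries quoted in the text and three genome sizes listed in Table 1; the function name and the "unclassified" label for sizes outside both ranges are assumptions introduced for illustration.

```python
# Size classes in kb as quoted in the text.
SIZE_CLASSES = {"22-56 kb": (22.0, 56.0), "121-134 kb": (121.0, 134.0)}


def size_class(genome_bp):
    """Return the size class for a genome length in bp, or 'unclassified'
    (a hypothetical label) if it falls outside both quoted ranges."""
    kb = genome_bp / 1000.0
    for label, (low, high) in SIZE_CLASSES.items():
        if low <= kb <= high:
            return label
    return "unclassified"


# Genome sizes (bp) of three Siphoviridae as listed in Table 1.
for phage, bp in [("bIL67", 22195), ("Sfi21", 40739), ("adh", 43785)]:
    print(phage, size_class(bp))
```

All dairy phage genomes listed in Table 1 fall into the smaller class.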

Since current phage taxonomy is based on the morphology of the phage particle, it seems logical to build a genomics-based phage taxonomy on the organization of the structural gene cluster. For other phage modules, a distinct picture of phage relatedness has been obtained. For example, at least four different lysogeny modules were observed within the lambda supergroup (Brüssow & Desiere 2001a). Within dairy phages, a very conserved gene order in the lysogeny genes was detected (Lucchini et al. 1999b). The closest relatives of the lysogeny module were not found in Siphoviridae, but in P2-like Myoviridae (Figure 9).

Transcription maps confirm modular boundaries in bacteriophage genomes

Database searches with a new phage sequence are frequently of limited use and allow the attribution of a likely function to only a few genes (Lucchini et al. 1999a). This observation reflects the limited number of phage entries in our current database and the intrinsically great diversity of phage sequences. Alternative approaches that do not rely on sequence relatedness, like the comparison of phage gene maps with well-investigated model phages, make it possible to define the modular organization of dairy phage genomes. A ‘reality check’ for these in silico predictions was provided by transcription mapping of the temperate cos-site S. thermophilus phage Sfi21 (Ventura et al. 2002). Three classes of transcripts were observed based on the timing of their appearance (Figure 10). Early transcripts were identified in four different genome regions. Transcription mapping located early genes in the lysogeny module and between the DNA replication module and the cos-site (Desiere et al. 1997). Two middle transcripts were identified: one was initiated at the promoter of the cro-repressor gene, the other covered the DNA replication module (Foley et al. 1998). Four types of late transcripts were identified, which covered the likely DNA packaging genes, the head morphogenesis module plus the major tail gene, the remainder of the tail genes, and the putative tail fiber and lysis genes, respectively. The Sfi21 transcription map confirmed the prediction of the Sfi21 modular structure established by comparative genome analysis with only one notable exception. Transcription mapping showed that the DNA replication module ended downstream of the primase gene with the origin of phage replication. The lack of database matches for the following five genes and the observation of a duplicated origin of replication downstream of these genes in virulent phages had led to their erroneous attribution as DNA replication genes. This analysis showed that transcription mapping is a suitable tool for verifying in silico predictions about the modular genome organization of bacteriophages.

Amongst the temperate phages of dairy bacteria, a transcription map has otherwise been reported only for the Lactococcus lactis phage TP901-1 (Madsen & Hammer 1998). Its transcription map differs from that of Sfi21 in an important aspect: all early transcripts in TP901-1 were initiated in the predicted genetic switch region. A similar transcription pattern was observed for another pac-site temperate lactococcal phage, Tuc2009 (D. van Sinderen, personal communication). In phage Sfi21, however, the divergently oriented promoters of the putative genetic switch region are transcriptionally silent in the early infection phase. The transcriptional silence of both repressor genes during early infection is surprising, since the Sfi21 cro-like repressor gene has a consensus promoter and the cI-like repressor gene is transcribed from its own promoter when cloned on a plasmid. This indicates that both genes can be transcribed by an unmodified host RNA polymerase.

Prophages

About half of the sequenced bacterial genomes contain prophage sequences (Lawrence et al. 2001). The presence of such prophage DNA poses intriguing questions for evolutionary biologists. Prophages increase the metabolic burden of the host cell, which has to replicate up to 8% extra phage DNA. In addition, prophages may lyse their host after prophage induction. To compensate for these disadvantages, one has to presume that temperate phages encode functions that increase the fitness of the lysogen (Brüssow & Hendrix 2002). Depending on the selective value of these phage genes, the lysogenic cell will be maintained or even become over-represented in the bacterial population. An obvious selective advantage for the lysogenic host is the presence of immunity and superinfection exclusion genes of the prophage that protect the lysogen against phage infection. Further examples of phage-encoded genes that increase host fitness include those for diphtheria toxin and streptococcal erythrogenic toxin A, and the non-essential phage λ gene bor, which confers serum resistance on the E. coli lysogen. In these cases, the reproductive success of the lysogenic bacterium translates directly into an evolutionary success for the resident prophage. However, the host–parasite relationship is also a highly fragile and dynamic genetic equilibrium. Prophages can be considered dangerous molecular time bombs that can kill the lysogenic cell upon their eventual induction.


Figure 8. Comparison of the genomes from the established and proposed genera of Siphoviridae constituting the λ super-group of phages. Corresponding genes are indicated with the same colour.

Figure 9. Comparative genetic organization of the lysogeny-related genes in Sfi21- and Sfi11-like Siphoviridae (A), P2-like Myoviridae infecting Vibrio cholerae (phage K139) and Haemophilus influenzae (phages S2 and HP1) (B) and L5-like and λ-like Siphoviridae (C; note the changed scale). Corresponding genes are indicated with the same colour. Blue shading or identical letters link genes of which the products share sequence similarity.


Figure 10. Transcription map for the temperate S. thermophilus phage Sfi21. Top: Prediction of the open reading frames in the complete genome of phage Sfi21. The phage genome is divided into functional units. The modules are indicated at the top of the gene map. Genes predicted to belong to the same unit have the same colour. Orfs preceded by a potential RBS are marked with an R inside the arrow. Orfs starting with an unconventional initiation codon are indicated with an asterisk. Overlap of start and stop codon is indicated with a triangle. Middle: The approximate position of the PCR products used for probing of the Northern blots is provided with the scale in bp. Bottom: Summary of the transcription analyses. The Sfi21-specific transcripts are depicted as arrows which point to the 3’ end of the mRNA and which are coloured in green, blue or red to indicate early, middle and late transcripts, respectively. The length of the arrow is proportional to the length of the mRNA derived from the Northern blots. The estimated size of the mRNA is indicated in kb. The width of the arrows indicates the relative abundance of the mRNA species. The wavy lines indicate mRNAs that show up as smeared bands in hybridization. Positive (blue circles) and negative (red circles) results of the primer-extension experiments are noted next to the identification of the tested orf. Hairpins indicate possible rho-independent terminators; the two sizes of the hairpins refer to different energies calculated for the hairpin (Ventura et al. 2002).

One would therefore expect evolution to select lysogenic bacteria with mutations in the prophage DNA. Mutations that inactivate the prophage induction process avoid the loss of the lysogenic clone from the bacterial population. In a next step, one would expect selection to lead to large-scale deletion of prophage DNA in order to decrease the metabolic burden of extra DNA synthesis. One predicts furthermore that useful prophage genes (e.g. lysogenic conversion genes) are preferentially spared from this deletion process, since their loss would actually decrease the fitness of the cell. It was proposed that a high genomic deletion rate is instrumental in removing dangerous genetic parasites from the bacterial genome (Lawrence et al. 2001). Deletion processes could explain why bacterial genomes did not increase in size despite a constant bombardment with parasitic DNA over evolutionary time. The streamlined bacterial chromosome containing few pseudogenes might be the consequence of this deletion of parasitic DNA.

Non-dairy LAB

The completely sequenced genome of the S. pyogenes M1 strain SF370 (Ferretti et al. 2001), isolated from a patient with an invasive wound infection, was investigated for its prophage content. The analysis was done to test the theoretical predictions of prophage–host interaction (Desiere et al. 2001a,b). It was already known that many S. pyogenes strains contain multiple prophage elements (Yu & Ferretti 1991; Hynes et al. 1995). In addition, it was established that some streptococcal exotoxins were phage-encoded (Nida & Ferretti 1982; Weeks & Ferretti 1984; Goshorn & Schlievert 1989). S. pyogenes strains encode further virulence factors, like extracellular enzymes, that contribute to the pathology of streptococcal disease. Some of them (e.g., hyaluronidase, DNase) are of phage origin (Hynes & Ferretti 1989; Marciel et al. 1997). Phages might thus be common vehicles of genetic exchange between S. pyogenes strains, including the transfer of virulence factors.

The S. pyogenes SF370 genome contains eight prophage elements (Figure 11). Only prophage SF370.1 could be induced by mitomycin C treatment. SF370.1 possesses a 41-kb-long genome that closely resembles the genome of pac-site temperate Siphoviridae found in different lactic acid bacteria. Its closest relatives are prophage NIH1.1 from a Japanese S. pyogenes strain and, to a lesser extent, S. pneumoniae phage mm1. The possession of prophage NIH1.1 differentiated older from newly emerging Japanese S. pyogenes strains (Inagaki et al. 2000). Prophage acquisition might thus be a major mechanism of genetic variation in this highly dynamic and medically important bacterial species (Smoot et al. 2002).

Prophage SF370.3 has a 33-kb-long genome whose organization closely resembles that of the cos-site temperate L. lactis phage r1t. Analysis of the prophage genome revealed mutations in the replisome organizer gene that may prevent the induction of the prophage. SF370.3 encodes a hyaluronidase and a DNase possibly implicated in the spreading of S. pyogenes through tissue planes of its human host. Notably, antibodies against both phage proteins are found in poststreptococcal autoimmune complications, underlining a role for these phage proteins in the pathology of this human disease. Prophage SF370.2 has a 43-kb-long genome whose organization resembles that of pac-site temperate Siphoviridae infecting dairy bacteria and which shows extensive protein sequence identity with structural proteins from S. thermophilus and L. lactis phages. SF370.2 showed two probable inactivating mutations: one in the replisome organizer gene and another in the gene encoding the portal protein.

All three SF370 prophages and the NIH1.1 prophage possess likely lysogenic conversion genes between the phage lysin gene and the right attachment site (Figure 12). This region in the prophage SF370.1 genome encodes the pyrogenic exotoxin C (SpeC) and a mitogenic factor (mf2). Closely related proteins are encoded in prophages SF370.2 and SF370.3 at an identical phage genome position. The superantigens may contribute to the immune deregulation observed during invasive streptococcal infections. The lysogenic conversion genes in the prophages differ in their GC content from the surrounding prophage and bacterial DNA. Their location near the phage attachment site suggests a faulty phage excision process in an unusual bacterial host of a GC content lower than that of the genome from which the DNA originated. The horizontal transfer of these genes is also suggested by the presence of identical genes in a veterinary pathogen, S. equi (Harrington et al. 2002).
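The GC-content argument above can be illustrated with a minimal sketch. The code below computes the GC fraction in fixed windows along a sequence and flags windows that deviate from the overall composition. The sequence is a hypothetical toy construct, not prophage data, and the window size and deviation threshold are arbitrary assumptions chosen only for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start position, GC fraction) for each full window along seq."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: an AT-rich backbone flanking a GC-rich insert.
backbone = "ATTAATGATA" * 4  # 40 bp, low GC
insert = "GCGCCGGATC" * 2    # 20 bp, high GC
toy = backbone + insert + backbone

profile = windowed_gc(toy, window=20, step=20)
overall = gc_content(toy)

# Flag windows whose GC fraction differs from the overall value by more
# than an arbitrary, illustrative threshold of 0.2.
outliers = [start for start, gc in profile if abs(gc - overall) > 0.2]
print(outliers)
```

On real data one would compare the gene of interest against both the flanking prophage DNA and the host chromosome, since the text notes that the conversion genes differ from both.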

A clear trend towards prophage genome inactivation was seen for the remaining five prophage sequences in S. pyogenes SF370, all of which showed massive losses of prophage DNA. The largest prophage remnant, SF370.4, has a 13-kb-long genome consisting of lysogeny, DNA replication and transcriptional regulation genes flanked by attachment sites. The remaining four prophage remnants were small (0.4–2.2 kb long), but all contained an integrase gene.

The sequencing of a second S. pyogenes strain, Manfredo, is close to completion. This has allowed interesting interstrain comparisons of the prophage sequences. The comparison has revealed that prophages are hotspots for genetic recombination, which results not only in the reshuffling of genes between prophages, but apparently also in rearrangements of the bacterial genome (Brüssow & Hendrix 2002).

S. pyogenes infections are known for their variable clinical symptomatology, ranging from symptomless carriage and mild sore throat to scarlet fever and pyodermitis, and including autoimmune diseases (rheumatic fever, glomerulonephritis) and life-threatening diseases (fasciitis, toxic shock syndrome). It is tempting to interpret the clinical potential of this protean bacterial pathogen, at least partly, as a function of the specific prophage combination that a given strain harbors (Brüssow & Hendrix 2002). In fact, the possession of the prophage NIH1.1 has been found to be a genetic marker for newly emerging S. pyogenes isolates in Japan. Moreover, the recent publication of the genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever (Smoot et al. 2002) showed that prophages and prophage-like elements were responsible for most of the genetic variation in 36 serotype M18 strains from diverse localities. This observation underlines the role that prophages might play in the ecological success of a given bacterial strain.

Pathogenic bacteria represent only a small fraction of LAB. Do the principles observed with S. pyogenes prophages also apply to prophages from other LAB occupying dramatically different ecological niches? Do some prophages increase the ecological fitness of dairy strains or commensals? Are phages then major drivers of the evolution of bacteria, allowing the testing of many new genes and the continued maintenance of those genes that increase the competitiveness of the lysogen in its ecological niche?

Figure 11. Aligned genome maps of three SF370 prophages and a prophage from S. pyogenes strain NIH1. The phage modules are colour-coded (red, lysogeny; orange, DNA replication; yellow, transcriptional regulation; green, DNA packaging and head; brown, head-to-tail; blue, tail; mauve, tail fiber; dark pink, lysis; black, superantigen/mitogenic factor genes). Likely prophage inactivating mutations are indicated by large vertical arrows (replisome organizer, portal protein); an asterisk marks phage genes that potentially contribute to the virulence of the lysogenic host; the phage hyaluronidase is labeled by a triangle. Blue shading connects regions of DNA sequence similarity between the prophages.

Figure 12. Compilation of currently available partial genome maps of temperate Siphoviridae from low GC content Gram-positive bacteria. The depicted genome regions are flanked on the left side by the phage lysin (blue) and on the right side by the phage integrase gene (green). Candidates for lysogenic conversion genes are highlighted in red. The code name of the prophages or temperate phages is indicated at the left side of the gene map; the host of the specified phage is indicated at the right side of the map (Canchaya et al. unpublished).

Table 2. Prophages of lactic acid bacteria

Host                Prophage     Inducible   Size        Reference
S. thermophilus     Sfi16        N           3 kb        unpublished
S. mitis            SM1          Y           35 kb       Bensing 2001
L. lactis IL1403    bIL285       Y           35 538 bp   Chopin 2001
                    bIL286       Y           41 834 bp
                    bIL309       Y           36 949 bp
                    bIL310       Y           14 957 bp
                    bIL311       N           14 510 bp
                    bIL312       Y           15 179 bp
Lb. johnsonii       Lj965        N           39 kb       Desiere 2000
                    Lj928        N           39 kb       unpublished
                    Lj771        N           42 kb
Lb. plantarum       LP1, LP2a                            unpublished
                    LP2b, LP3
S. pyogenes         SF370.1                              Ferretti 2001
                    SF370.2                              Desiere 2001
                    SF370.3                              Canchaya 2002
                    SF370.4
E. faecalis V583    1            ?           37 kb       unpublished
                    2            ?           33 kb
E. faecium          1                                    unpublished
Oenococcus oeni     fOg44                                Parreira 1999

Comparative phage genomics has demonstrated that lysogenic conversion genes might also be encoded by temperate phages infecting dairy streptococci. ORFs of no known phage-related function are frequently encoded in temperate phages and prophages from dairy bacteria at a genome position where pathogenic streptococci and staphylococci encode lysogenic conversion genes. Figure 12 provides a compilation of the genome maps of temperate phages from LAB between the lysin and the integrase genes. With a single exception (S. pneumoniae phage mm1), all phages encode at least one gene in this region. In all cases, the attachment site is located directly downstream of the integrase gene or overlaps the 3’-end of the integrase gene. Owing to this configuration, the extra genes are located between the phage lysin gene and the right attachment site in the lysogen. Where investigated, these genes have been shown to be transcribed in the lysogen (Ventura et al. 2002). Frequently, their transcription was found to be more prominent than that of the phage repressor, for example. These observations suggest a physiological function for these prophage genes in the lysogen. The lack of biochemical or genetic evidence and of any bioinformatic leads for these phage genes makes it difficult, however, to speculate on possible lysogenic conversion phenotypes.
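The positional criterion described above (extra genes lying between the lysin gene and the integrase gene) can be expressed as a small sketch. The gene names and roles below are hypothetical placeholders and do not represent any annotated phage; the function only assumes that a gene map is an ordered list with the lysin gene to the left of the integrase gene, as in the orientation described for Figure 12.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str  # hypothetical label, for illustration only
    role: str  # e.g. "lysin", "integrase", "unknown"


def genes_between_lysin_and_integrase(gene_map: list[Gene]) -> list[Gene]:
    """Return the genes located after the lysin and before the integrase gene.

    Returns an empty list if either landmark gene is missing or if the
    integrase gene does not lie to the right of the lysin gene.
    """
    roles = [gene.role for gene in gene_map]
    if "lysin" not in roles or "integrase" not in roles:
        return []
    start = roles.index("lysin")
    end = roles.index("integrase")
    return gene_map[start + 1:end] if start < end else []


# Hypothetical gene maps: one with two extra ORFs, one with none.
phage_with_extra = [
    Gene("lys", "lysin"),
    Gene("orfX", "unknown"),
    Gene("orfY", "unknown"),
    Gene("int", "integrase"),
    Gene("rep", "repressor"),
]
phage_without_extra = [Gene("lys", "lysin"), Gene("int", "integrase")]

candidates = [g.name for g in genes_between_lysin_and_integrase(phage_with_extra)]
print(candidates)
```

Position alone only nominates candidates; as the text stresses, transcription data and biochemical evidence are needed before a lysogenic conversion phenotype can be proposed.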

The close genetic relationships between phages from dairy bacteria and those from evolutionarily related pathogenic bacteria have interesting scientific implications. Research in medical and dairy microbiology has been conducted with different priorities. For example, lysogenic conversion is a strong research area in medical bacteriology, while the basic aspects of phage replication have been investigated in more detail for dairy phages (Brüssow 2001). Medical microbiologists have used phages for killing bacteria (phage therapy) (Nelson et al. 2001), while dairy microbiologists have developed phage-resistant starter strains based on knowledge of the phage genome (e.g., Foley et al. 1998). Knowledge established in one field can be employed to complement research in the other.

Thus, the current knowledge on prophages in dairy species of LAB is briefly summarized below, and an overview of the currently known LAB prophages is given in Table 2.

Prophages of dairy LAB

Prophages of Lactobacillus johnsonii

Lactobacillus johnsonii strain La1 contains three prophage elements of 38–42 kb in length in its 1.8-Mb-long genome. Each is flanked by attachment sites, but no phage particles could be induced by mitomycin C treatment. They share the genome organization of a widely distributed group of temperate pac-site phages from LAB (Desiere et al. 2000). Twelve kb upstream of the integrase gene, all three prophages contain a cluster of tRNA genes. Only two regions of the prophage genome are transcribed in the lysogenic cell. In all three prophages the strongest transcripts derive from a single open reading frame located between the lysin gene and the right phage attachment site. One of these genes shares sequence similarity with a plasmid maintenance system. The second transcribed region is the lysogeny module. In prophage Lj771 this transcript covers a conventional lysogeny module for dairy phages (integrase, superinfection exclusion, repressor genes). However, in prophages Lj965 and Lj928 additional genes were apparently inserted into the lysogeny module. Bioinformatic analysis suggests that there is a further lysogenic conversion gene in this region in Lj965.

Prophages of Lactococcus lactis

Lactococcus lactis strain IL1403 is a major bacterial starter organism used in industrial cheese fermentation. Its 2.3-Mb genome contains six prophage elements; all were found to be flanked by phage attachment sites, and all but one prophage remnant (which contained insertion elements) could be excised from the bacterial genome (Chopin et al. 2001). However, only two prophages gave rise to infectious phage particles after induction. Three prophages, ranging in size from 36 to 42 kb, showed a genome organization typical of temperate dairy phages. Prophages bIL286 and bIL309 share DNA sequence identity over the structural gene cluster with the temperate cos-site L. lactis phage BK5-T and protein sequence similarity with S. thermophilus phage Sfi21. Prophage bIL285 did not match the DNA sequence of any previously described lactococcal phage, but it still shared detectable protein sequence similarity with the BK5-T group of phages. Three prophage sequences of approximately 15 kb in length represent likely prophage remnants. All possessed the conserved part of the lysogeny module (integrase/repressor) and, in variable numbers, DNA replication and a few structural genes.
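As a worked illustration of the metabolic-burden argument, the following sketch sums the six IL1403 prophage sizes listed in Table 2 and relates them to the genome size. It assumes that the Table 2 values for this strain are in base pairs and uses the rounded 2.3-Mb genome size quoted in the text, so the result is only approximate.

```python
# Prophage sizes for L. lactis IL1403 as listed in Table 2
# (assumption: the table values for this strain are in bp).
prophage_sizes_bp = {
    "bIL285": 35538,
    "bIL286": 41834,
    "bIL309": 36949,
    "bIL310": 14957,
    "bIL311": 14510,
    "bIL312": 15179,
}

# Rounded genome size quoted in the text ("2.3-Mb genome").
genome_size_bp = 2_300_000

total_prophage_bp = sum(prophage_sizes_bp.values())
fraction = total_prophage_bp / genome_size_bp

print(f"{total_prophage_bp} bp of prophage DNA = {fraction:.1%} of the genome")
```

Under these assumptions the six elements add up to 158 967 bp, or roughly 6.9% of the chromosome, which is of the same order as the "up to 8%" extra phage DNA mentioned in the Prophages section.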

Lactobacillus plantarum strain WCFS1 contains two prophage genomes of about 43 kb and several prophage remnants in its 3.3-Mb-long genome (R. Siezen, M. Kleerebezem, personal communication). The large prophage elements Lp1 and Lp2 resemble temperate pac-site dairy phages in their genome organization, the closest relative being Lb. plantarum phage phig1e. Both prophages contain genes unrelated to known phage functions near both attachment sites. In the case of prophage Lp1 these ‘extra’ genes at both genome ends yielded database matches suggesting lysogenic conversion functions (distant relatives of mitogenic factors from S. pyogenes prophages, and the lysogenic conversion gene fun from E. coli phage P1). The Lp2 prophage showed a point mutation in the large subunit terminase gene which was likely to cause gene inactivation. Further features make prophage Lp2 peculiar. It shares DNA sequence identity with prophage Lp1 over the entire DNA packaging/head/tail gene cluster and the lysis cassette. Directly downstream of the Lp2 integrase gene, a 12-kb-long prophage remnant containing residual lysogeny, DNA replication, DNA packaging and head genes was found. Another 8-kb-long prophage remnant carries lysogeny- and DNA replication-related phage genes.

In the future, prophage genomics is likely to play an important role in bacterial genomics. Prophage DNA is apparently enriched for genes that play a crucial role in the phenotype of the host cell. This role is underlined not only by virulence factors encoded on the prophage genome: DNA microarrays have also demonstrated that prophage genes are prominent within the group of host genes that experience a significantly changed transcription level after alterations in growth conditions (for example, temperature changes in S. pyogenes (Smoot et al. 2001) and planktonic vs. biofilm growth in Pseudomonas (Whiteley et al. 2001)). Focusing on prophage sequences in bacterial genomes might thus lead to genes influencing important phenotypic differences between otherwise closely related bacterial strains.

Outlook

Genomics started with the sequencing of phage genomes in the late 1970s. However, in the last decade only a few laboratories were involved in phage genomics. More recently, the pace has quickened again, so that over 100 complete phage genomes are now available, and it can be expected that many more will be published in the near future. These sequences have been marvelously informative with respect to the biology of the individual phages. The real excitement for phage biology starts now, with the advent of high-volume sequencing technology, as this allows the sequences to be analyzed together, for the first time at whole-genome resolution, and thereby to address a set of fundamental biological questions related to populations, e.g., what is the structure of the global phage population, what are its dynamics, and how do phages evolve? This is Comparative Genomics with a capital ‘C’ (Brüssow & Hendrix 2002).

There is now interest not only in sequencing new phage genomes but also in doing so from an array of phages as diverse as possible. Phage biologists are also beginning to explore methods of sampling phage sequences from environmental sources without introducing the severe bias of asking them to grow on culturable bacteria. Combining such an approach with techniques like DNA microarray analysis promises to give a better picture of the diversity of the phage population and also to provide tools to answer questions as to how that diversity changes over space and time. The ease and low cost of phage sequencing, combined with the extensive knowledge of model phages, could give phage genomics a lead role in the study of population genetics, the evolution of simple DNA genomes and their genetic diversity (Brüssow & Hendrix 2002).

Acknowledgements

We thank our colleagues at Nestlé for their support of the project and the Swiss National Science Foundation for financial support in its Biotechnology Module (grant 5002-044545/1). We thank Michiel Kleerebezem, Roland Siezen and Willem de Vos, Wageningen Center for Food Sciences, the Netherlands, for sharing data on Lactobacillus plantarum strain WCFS1 prophages.

References

Altermann E, Klein JR & Henrich B (1999) Primary structure and features of the genome of the Lactobacillus gasseri temperate bacteriophage phi adh. Gene 236: 333–346.

Botstein D (1980) A theory of modular evolution for bacteriophages. Ann. NY Acad. Sci. 354: 484–490.

Braun V, Hertwig S, Neve H, Geis A & Teuber M (1989) Taxonomic differentiation of bacteriophages of Lactococcus lactis by electron microscopy, DNA-DNA hybridization, and protein profiles. J. Gen. Microbiol. 135: 2551–2560.

Brondsted L, Ostergaard S, Pedersen M, Hammer K & Vogensen FK (2001) Analysis of the complete DNA sequence of the temperate bacteriophage TP901-1: evolution, structure, and genome organization of lactococcal bacteriophages. Virology 283: 93–109.

Bruttin A, Desiere F, Lucchini S, Foley S & Brüssow H (1997) Characterization of the lysogeny DNA module from the temperate Streptococcus thermophilus bacteriophage Sfi21. Virology 233: 136–148.

Brüssow H (2001) Phages of dairy bacteria. Annu. Rev. Microbiol. 55: 283–303.

Brüssow H & Desiere F (2001a) Comparative phage genomics and the evolution of Siphoviridae: insights from dairy phages. Mol. Microbiol. 39: 213–223.

Brüssow H & Desiere F (2001b) Comparative phage genomics and the evolution of Siphoviridae: insights from dairy phages. Mol. Microbiol. 39: 213–222.

Brüssow H & Hendrix R (2002) Phage genomics: Small is beautiful. Cell 108: 13–16.

Brüssow H, Fremont M, Bruttin A, Sidoti J, Constable A & Fryder V (1994) Detection and classification of Streptococcus thermophilus bacteriophages isolated from industrial milk fermentation. Appl. Environ. Microbiol. 60: 4537–4543.

Casjens S, Hatfull GF & Hendrix R (1992) Evolution of dsDNA tailed-bacteriophage genomes. Sem. Virol. 3: 383–397.

Chandry PS, Moore SC, Boyce JD, Davidson BE & Hillier AJ (1997) Analysis of the DNA sequence, gene expression, origin of replication and modular structure of the Lactococcus lactis lytic bacteriophage sk1. Mol. Microbiol. 26: 49–64.

Chopin A, Bolotin A, Sorokin A, Ehrlich SD & Chopin M (2001) Analysis of six prophages in Lactococcus lactis IL1403: different genetic structure of temperate and virulent phage populations. Nucleic Acids Res. 29: 644–651.

Desiere F, Lucchini S, Bruttin A, Zwahlen MC & Brüssow H (1997) A highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages. Virology 234: 372–382.

Desiere F, Lucchini S & Brüssow H (1998) Evolution of Streptococcus thermophilus bacteriophage genomes by modular exchanges followed by point mutations and small deletions and insertions. Virology 241: 345–356.

Desiere F, Pridmore RD & Brüssow H (2000) Comparative genomics of the late gene cluster from Lactobacillus phages. Virology 275: 294–305.

Desiere F, Mahanivong C, Hillier AJ, Chandry PS, Davidson BE & Brüssow H (2001a) Comparative genomics of lactococcal phages: insight from the complete genome sequence of Lactococcus lactis phage BK5-T. Virology 283: 240–252.

Desiere F, McShan WM, van Sinderen D, Ferretti JJ & Brüssow H (2001b) Comparative genomics reveals close genetic relationships between phages from dairy bacteria and pathogenic streptococci: Evolutionary implications for prophage-host interactions. Virology 288: 325–341.

Duplessis M & Moineau S (2001) Identification of a genetic determinant responsible for host specificity in Streptococcus thermophilus bacteriophages. Mol. Microbiol. 41: 325–336.

Eppler K, Wyckoff E, Goates J, Parr R & Casjens S (1991) Nucleotide sequence of the bacteriophage P22 genes required for DNA packaging. Virology 183: 519–538.

Ferretti JJ, McShan WM, Ajdic D, Savic DJ, Savic G, Lyon K, Primeaux C, Sezate S, Suvorov AN, Kenton S, Lai HS, Lin SP, Qian Y, Jia HG, Najar FZ, Ren Q, Zhu H, Song L, White J, Yuan X, Clifton SW, Roe BA & McLaughlin R (2001) Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc. Natl. Acad. Sci. U.S.A. 98: 4658–4663.


Foley S, Lucchini S, Zwahlen MC & Brüssow H (1998) A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus. Virology 250: 377–387.

Goshorn SC & Schlievert PM (1989) Bacteriophage association of streptococcal pyrogenic exotoxin type C. J. Bacteriol. 171: 3068–3073.

Harrington DJ, Sutcliffe IC & Chanter N (2002) The molecular basis of Streptococcus equi infection and disease. Microbes Infect. 4(4): 501–105.

Hendrix RW, Smith MC, Burns RN, Ford ME & Hatfull GF (1999) Evolutionary relationships among diverse bacteriophages and prophages: all the world’s a phage. Proc. Natl. Acad. Sci. U.S.A. 96: 2192–2197.

Hynes WL & Ferretti JJ (1989) Sequence analysis and expression in Escherichia coli of the hyaluronidase gene of Streptococcus pyogenes bacteriophage H4489A. Infect. Immun. 57: 533–539.

Hynes WL, Hancock L & Ferretti JJ (1995) Analysis of a second bacteriophage hyaluronidase gene from Streptococcus pyogenes: evidence for a third hyaluronidase involved in extracellular enzymatic activity. Infect. Immun. 63: 3015–3020.

Inagaki Y, Myouga F, Kawabata H, Yamai S & Watanabe H (2000) Genomic differences in Streptococcus pyogenes serotype M3 between recent isolates associated with toxic shock-like syndrome and past clinical isolates. J. Infect. Dis. 181: 975–983.

Jarvis AW, Fitzgerald GF, Mata M, Mercenier A, Neve H, Powell IB, Ronda C, Saxelin M & Teuber M (1991) Species and type phages of lactococcal bacteriophages. Intervirology 32: 2–9.

Jarvis AW, Lubbers MW, Beresford TP, Ward LJ, Waterfield NR, Collins LJ & Jarvis BD (1995) Molecular biology of lactococcal bacteriophage c2. Dev. Biol. Stand. 85: 561–567.

Kodaira KI, Oki M, Kakikawa M, Watanabe N, Hirakawa M, Yamada K & Taketo A (1997) Genome structure of the Lactobacillus temperate phage phi g1e: the whole genome sequence and the putative promoter/repressor system. Gene 187: 45–53.

Koonin EV, Aravind L & Kondrashov AS (2000) The impact of comparative genomics on our understanding of evolution. Cell 101: 573–576.

Lawrence JG, Hendrix R & Casjens S (2001) Where are the pseudogenes in bacterial genomes? Trends Microbiol. 9(11): 535–540.

Lazarevic V, Dusterhoft A, Soldo B, Hilbert H, Mauel C & Karamata D (1999) Nucleotide sequence of the Bacillus subtilis temperate bacteriophage SPbetac2. Microbiology 145 (Pt 5): 1055–1067.

Le Marrec C, van Sinderen D, Walsh L, Stanley E, Vlegels E, Moineau S, Heinze P, Fitzgerald G & Fayard B (1997) Two groups of bacteriophages infecting Streptococcus thermophilus can be distinguished on the basis of mode of packaging and genetic determinants for major structural proteins. Appl. Environ. Microbiol. 63: 3246–3253.

Loessner MJ, Inman RB, Lauer P & Calendar R (2000) Complete nucleotide sequence, molecular analysis and genome structure of bacteriophage A118 of Listeria monocytogenes: implications for phage evolution. Mol. Microbiol. 35: 324–340.

Lubbers MW, Waterfield NR, Beresford TP, Le PR & Jarvis AW (1995) Sequencing and analysis of the prolate-headed lactococcal bacteriophage c2 genome and identification of the structural genes. Appl. Environ. Microbiol. 61: 4348–4356.

Lucchini S, Desiere F & Brüssow H (1998) The structural gene module in Streptococcus thermophilus bacteriophage phi Sfi11 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts. Virology 246: 63–73.

Lucchini S, Desiere F & Brüssow H (1999a) Comparative genomics of Streptococcus thermophilus phage species supports a modular evolution theory. J. Virol. 73: 8647–8656.

Lucchini S, Desiere F & Brüssow H (1999b) Similarly organized lysogeny modules in temperate Siphoviridae from low GC content Gram-positive bacteria. Virology 263: 427–435.

Lucchini S, Desiere F & Brüssow H (1999c) The genetic relationship between virulent and temperate Streptococcus thermophilus bacteriophages: Whole genome comparison of cos-site phages Sfi19 and Sfi21. Virology 260: 232–243.

Madsen PL & Hammer K (1998) Temporal transcription of thelactococcal temperate phage TP901-1 and DNA sequence of theearly promoter region. Microbiology 144: 2203–2215.

Mahanivong C, Boyce JD, Davidson BE & Hillier AJ (2001)Sequence analysis and molecular characterization of the Lacto-coccus lactis temperate bacteriophage BK5-T. Appl EnvironMicrobiol. 67: 3564–3576.

Marciel AM, Kapur V & Musser JM (1997) Molecular populationgenetic analysis of a Streptococcus pyogenes bacteriophage-encoded hyaluronidase gene: recombination contributes to allelicvariation. Microb. Pathog. 22: 209–217.

Martin AC, Lopez R & Garcia P (1996) Analysis of the completenucleotide sequence and functional organization of the genomeof Streptococcus pneumoniae bacteriophage Cp- 1. J. Virol. 70:3678–3687.

McShan WM, Tang YF & Ferretti JJ (1997) Bacteriophage T12of Streptococcus pyogenes integrates into the gene encoding aserine tRNA. Mol. Microbiol. 23: 719–728.

Mikkonen M, Dupont L, Alatossava T & Ritzenthaler P (1996)Defective site-specific integration elements are present in thegenome of virulent bacteriophage LL-H of Lactobacillus del-brueckii. Appl. Environ. Microbiol. 62: 1847–1851.

Mikkonen M, Raisanen L & Alatossava T (1996) The early generegion completes the nucleotide sequence of Lactobacillus del-brueckii subsp. lactis phage LL-H. Gene 175: 49–57.

Moscoso M & Suarez JE (2000) Characterization of the DNA rep-lication module of bacteriophage A2 and use of its origin of rep-lication as a defense against infection during milk fermentationby Lactobacillus casei. Virology 273: 101–111.

Nelson D, Loomis L & Fischetti VA (2001) Prevention and elimination of upper respiratory colonization of mice by group A streptococci by using a bacteriophage lytic enzyme. Proc. Natl. Acad. Sci. U.S.A. 98: 4107–4112.

Neve H, Zenz KI, Desiere F, Koch A, Heller KJ & Brüssow H (1998) Comparison of the lysogeny modules from the temperate Streptococcus thermophilus bacteriophages TP-J34 and Sfi21: Implications for the modular theory of phage evolution. Virology 241: 61–72.

Nida SK & Ferretti JJ (1982) Phage influence on the synthesis of extracellular toxins in group A streptococci. Infect. Immun. 36: 745–750.

Prevots F, Mata M & Ritzenthaler P (1990) Taxonomic differentiation of 101 lactococcal bacteriophages and characterization of bacteriophages with unusually large genomes. Appl. Environ. Microbiol. 56: 2180–2185.

Schouler C, Ehrlich SD & Chopin MC (1994) Sequence and organization of the lactococcal prolate-headed bIL67 phage genome. Microbiology 140: 3061–3069.

Smoot LM, Smoot JC, Graham MR, Somerville GA, Sturdevant DE, Migliaccio CA, Sylva GL & Musser JM (2001) Global differential gene expression in response to growth temperature alteration in group A Streptococcus. Proc. Natl. Acad. Sci. U.S.A. 98: 10416–10421.

Smoot JC, Barbian KD, Van Gompel JJ, Smoot LM, Chaussee MS, Sylva GL, Sturdevant DE, Ricklefs SM, Porcella SF, Parkins LD, Beres SB, Campbell DS, Smith TM, Zhang Q, Kapur V, Daly JA, Veasy LG & Musser JM (2002) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc. Natl. Acad. Sci. U.S.A. 99: 4668–4673.

Sonnhammer EL & Durbin R (1995) A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167: GC1–10.

Stanley E, Fitzgerald GF, Le Marrec C, Fayard B & van Sinderen D (1997) Sequence analysis and characterization of phi O1205, a temperate bacteriophage infecting Streptococcus thermophilus CNRZ1205. Microbiology 143: 3417–3429.

Stanley E, Walsh L, van der ZA, Fitzgerald GF & van Sinderen D (2000) Identification of four loci isolated from two Streptococcus thermophilus phage genomes responsible for mediating bacteriophage resistance. FEMS Microbiol. Lett. 182: 271–277.

Tremblay DM & Moineau S (1999) Complete genomic sequence of the lytic bacteriophage DT1 of Streptococcus thermophilus. Virology 255: 63–76.

van Sinderen D, Karsens H, Kok J, Terpstra P, Ruiters MH, Venema G & Nauta A (1996) Sequence analysis and molecular characterization of the temperate lactococcal bacteriophage r1t. Mol. Microbiol. 19: 1343–1355.

Vasala A, Dupont L, Baumann M, Ritzenthaler P & Alatossava T (1993) Molecular comparison of the structural proteins encoding gene clusters of two related Lactobacillus delbrueckii bacteriophages. J. Virol. 67: 3061–3068.

Ventura M, Bruttin A, Canchaya C & Brüssow H (2002) Transcription analysis of Streptococcus thermophilus phages in the lysogenic state. Virology (in press).

Weeks CR & Ferretti JJ (1984) The gene for type A streptococcal exotoxin (erythrogenic toxin) is located in bacteriophage T12. Infect. Immun. 46: 531–536.

Whiteley M, Bangera MG, Bumgarner RE, Parsek MR, Teitzel GM, Lory S & Greenberg EP (2001) Gene expression in Pseudomonas aeruginosa biofilms. Nature 413: 860–864.

Yu CE & Ferretti JJ (1991) Molecular characterization of new group A streptococcal bacteriophages containing the gene for streptococcal erythrogenic toxin A (speA). Mol. Gen. Genet. 231: 161–168.