
Phenotypic and genomic diversity of Lactobacillus plantarum strains isolated from various environmental niches

Roland J. Siezen,1,2,3*† Vesela A. Tzeneva,1,2† Anna Castioni,1,4 Michiel Wels,1,2,3 Hoa T. K. Phan,5 Jan L. W. Rademaker,2 Marjo J. C. Starrenburg,2 Michiel Kleerebezem,1,2,6 Douwe Molenaar1,2 and Johan E. T. van Hylckama Vlieg1,2

1 TI Food and Nutrition, PO Box 557, 6700 AN Wageningen, The Netherlands.
2 NIZO food research, Kernhemseweg 2, PO Box 20, 6710 BA Ede, The Netherlands.
3 Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500HB Nijmegen, The Netherlands.
4 Dipartimento Scientifico e Tecnologico, Università degli Studi di Verona, Strada le Grazie, 15, I-37134 Verona, Italy.
5 Food Industry Research Institute, Km 8 Nguyen Trai, Thanh Xuan, Hanoi, Vietnam.
6 Laboratory of Microbiology, Wageningen University, Dreijenplein 10, 6703 HB Wageningen, The Netherlands.

Summary

Lactobacillus plantarum is a ubiquitous microorganism that is able to colonize several ecological niches, including vegetables, meat, dairy substrates and the gastro-intestinal tract. An extensive phenotypic and genomic diversity analysis was conducted to elucidate the molecular basis of the high flexibility and versatility of this species. First, 185 isolates from diverse environments were phenotypically characterized by evaluating their fermentation and growth characteristics. Strains clustered largely together within their particular food niche, but human fecal isolates were scattered throughout the food clusters, suggesting that they originate from the food eaten by the individuals. Based on distinct phenotypic profiles, 24 strains were selected and, together with a further 18 strains from an earlier low-resolution study, their genomic diversity was evaluated by comparative genome hybridization against the reference genome of L. plantarum WCFS1. Over 2000 genes were identified that constitute the core genome of the L. plantarum species, including 121 unique L. plantarum-marker genes that have not been found in other lactic acid bacteria. Over 50 genes unique for the reference strain WCFS1 were identified that were absent in the other L. plantarum strains. Strains of the L. plantarum subspecies argentoratensis were found to lack a common set of 24 genes, organized in seven gene clusters/operons, supporting their classification as a separate subspecies. The results provide a detailed view on phenotypic and genomic diversity of L. plantarum and lead to a better comprehension of niche adaptation and functionality of the organism.

Received 6 August, 2009; accepted 21 October, 2009. *For correspondence. E-mail [email protected]; Tel. (+31) 318659511; Fax (+31) 318650400. †These authors contributed equally to the manuscript.

Environmental Microbiology (2010) 12(3), 758–773 doi:10.1111/j.1462-2920.2009.02119.x

© 2009 Society for Applied Microbiology and Blackwell Publishing Ltd

Introduction

Lactobacillus is a diverse genus (Stiles and Holzapfel, 1997) including more than 150 species (http://www.dsmz.de/microorganisms/). Some of the species are highly specialized and adapted to a specific ecological niche such as dairy substrates (Bolotin et al., 2004; van de Guchte et al., 2006), whereas others are typically found in the mammalian gastro-intestinal tract (Russell and Klaenhammer, 2001). The species Lactobacillus plantarum constitutes versatile lactic acid bacteria (LAB), which are found in many different ecological niches such as vegetables, meat and fish, and dairy products (Baleiras Couto et al., 1996; Gardner et al., 2001; Ercolini et al., 2003) as well as the gastro-intestinal tract (Ahrne et al., 1998; Bringel et al., 2005). Lactobacillus plantarum is a highly heterogeneous species (Dellaglio et al., 1975; Bringel et al., 1996) widely employed in many food and health applications, including starter cultures in fermentation processes (Gardner et al., 2001; Antara et al., 2004; Filya et al., 2004; Noonpakdee et al., 2004; Kostinek et al., 2005). Several studies demonstrated that certain strains may have beneficial effects on mammalian gut functionality and some strains have been developed as probiotic cultures (Kalliomaki et al., 2001; Galdeano and Perdigon, 2006; Gueimonde et al., 2006). The complete genome sequence (3.3 Mb) of L. plantarum strain WCFS1 (a human saliva isolate) is known to be one of the largest of LAB (Kleerebezem et al., 2003). This larger genome size is probably related to the ability of this bacterium to inhabit diverse environmental niches, allowing it to ferment a broad range of carbohydrates (Bringel et al., 2001). Recently, genome sequences have become available for L. plantarum strains JDM1 (Zhang et al., 2009) and ATCC14917 (NCBI Accession Code NZ_ACGZ00000000).

Several studies have explored diversity within the L. plantarum species using different approaches such as RAPD-PCR, AFLP and specific gene probes (Bringel et al., 1996; 2001; Curk et al., 1996; Torriani et al., 2001a,b; Tanganurat et al., 2009). Moreover, Molenaar and colleagues compared the gene content of 20 L. plantarum strains using a clone-based microarray (Molenaar et al., 2005) representing the genome of strain WCFS1, and demonstrated that the most variable genomic islands included genes involved in sugar metabolism, plantaricin and exopolysaccharide biosynthesis, as well as prophage components, suggesting the presence of ‘lifestyle’ adaptation regions in the L. plantarum genome. A distinct group was classified as a new subspecies, named L. plantarum ssp. argentoratensis (Bringel et al., 2005). These strains were distinguished based on their differential fermentation profile of dulcitol, melezitose, methyl-α-D-mannoside, L-arabinose and D-turanose, as well as RAPD-PCR fingerprinting and recA gene sequence analysis (Torriani et al., 2001a,b; Bringel et al., 2005).

In the present study, we performed a phenotypic diversity analysis of 185 L. plantarum strains isolated from very different sources (fermented foods from different geographical regions and human isolates). Twenty-four L. plantarum strains were selected based on different phenotypic profiles and, together with 18 strains studied earlier (Molenaar et al., 2005), their genomic diversity was investigated using comparative genome hybridization (CGH) with DNA microarrays carrying specific probes for individual genes of strain WCFS1. The generated high-resolution data set on the genetic content of the strains allows a detailed analysis of L. plantarum diversity.

Results

L. plantarum strain collection

The diversity in a collection of 185 strains was analysed, including four reference strains: L. plantarum WCFS1, L. plantarum type strains ATCC 14917T and LMG 6907T and L. plantarum ssp. argentoratensis type strain DKO22T. The collection comprised strains isolated from diverse fermented foods and different geographic locations (Japan, Vietnam, Thailand, France, Italy, Spain and Belgium). Moreover, the collection contained several strains isolated from the human gastro-intestinal tract (Fig. S1, Supporting information).

Exploring phenotypic diversity within the L. plantarum strains

Initially a subset of 54 L. plantarum isolates was subjected to extensive phenotypic analysis by performing 83 tests aimed at identifying traits that show strain-to-strain variation. No strains were able to grow in milk, nor at 4°C, nor in the presence of 10% NaCl. A limited number of strains were able to grow at 17°C, or in the presence of 6% salt, or to resist high concentrations of nisin (1000 and 2000 ng ml⁻¹). Almost all strains were able to grow on MRS containing additional 4% salt, and tolerated nisin concentrations lower than 125 ng ml⁻¹. Carbohydrates like D-arabinose, ribitol, ferric citrate, starch, glycerol, glycogen, inulin, lyxose, tagatose, L-fucose, arabitol, xylitol, Lactobacillus sp. levan, arabinogalactan and β-cyclodextrin were fermented by a limited number of strains.

Twenty-four tests with the highest discriminative power were selected and used to explore the physiological diversity of 185 L. plantarum strains. The tests included the degradation of selected carbohydrates (xylose, amygdalin, glycerol, sucrose, α-methyl-D-glucopyranoside, α-methyl-D-mannopyranoside, melibiose, dulcitol, melezitose, raffinose, lactose, L-rhamnose, starch, D-sorbitol, turanose, L-arabinose, potassium-gluconate, D-trehalose, lactitol monohydrate, Bacillus subtilis levan and chicory fructan (Fructafit IQ)), as well as growth at 45°C and tolerance of NaCl (6%) or nisin (1000 ng ml⁻¹) in the growth medium.

The 185 L. plantarum strains were clustered based on their phenotypic profiles, and a part of this clustering tree is shown in Fig. 1 (for the complete tree see Fig. S1, Supporting information). A large phenotypic diversity was found, and several clear correlations were observed between strain clustering and source of isolation and/or geographical origin. For instance, clusters were seen for fermented meat isolates from Vietnam (groups I and IV), kimchi isolates from Japan (cluster II), sourdough isolates from Italy (clusters III and V), eggplant isolates from Spain (cluster VI) and cheese isolates from France (group VII). For strains with an ethnic food origin, the food source usually corresponds with a geographical origin. In contrast, the human isolates (generally from faeces) were mostly scattered throughout this tree, suggesting that they originate from the food eaten by the individuals.

Differential substrate utilization or stress response can be observed among some clusters (Fig. 1, and Fig. S1). For example, all strains grow well on lactitol except the strains from meat cluster I, while strains from eggplant (cluster VI, Fig. S1) do not grow on melezitose, turanose or L-arabinose, in contrast to most other isolates. Most sourdough strains (clusters III and V) do not grow at 45°C. Many cheese strains (cluster VII) and sourdough strains (cluster V) are sensitive to nisin.


[Fig. 1 image: dendrogram with a similarity scale (60 to 100) next to a colour-coded grid of growth tests (glucose, xylose, amygdalin, lactitol, B. subtilis levan, chicory fructan, glycerol, sucrose, M-D-Gluc., M-a-D-Mann., melibiose, dulcitol, melezitose, raffinose, lactose, L-rhamnose, starch, D-sorbitol, turanose, L-arabinose, K-gluconate, D-trehalose, MRS at 45°C, NaCl 6%, nisin 1000 ng/ml). Each row is annotated with strain number, received-as name, source, geographical origin and reference; cluster labels I, II, III and IV are marked, with main sources given as meat, vegetables, sourdough and mixed.]

Fig. 1. Neighbor-joining cluster analysis based on phenotypic properties of the L. plantarum strains. Only part of the dendrogram is shown; the complete tree with Experimental procedures and strain references is shown in Fig. S1, Supporting information. The 24 phenotype tests are listed at the top. Subgroups of strains are separated by dotted lines. Strains selected for gene content evaluation are highlighted with a blue square. Colour legend: optical density relative to the final OD600 reached in MRS broth at 30°C: dark blue, < 10%; light blue, 10% < x < 30%; dark green, 30% < x < 50%; light green, 50% < x < 70%; orange, 70% < x < 90%; red, x > 90%.
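The colour legend of Fig. 1 amounts to a simple binning of relative growth. A minimal sketch of that binning follows; the function name is hypothetical, and because the caption uses strict inequalities, assigning exact boundary values (10%, 30%, ...) to the higher class is an assumption made here.

```python
def od_colour_class(relative_od_percent: float) -> str:
    """Map growth, as % of the final OD600 reached in MRS broth at 30 °C,
    to the colour classes named in the Fig. 1 legend."""
    # (upper bound in %, colour); boundary values go to the higher class,
    # which is an assumption: the caption leaves them unassigned.
    bins = [
        (10, "dark blue"),
        (30, "light blue"),
        (50, "dark green"),
        (70, "light green"),
        (90, "orange"),
    ]
    for upper, colour in bins:
        if relative_od_percent < upper:
            return colour
    return "red"


print(od_colour_class(42.0))  # dark green
```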


A subset of 24 strains was selected to reflect the overall diversity observed, but also some phenotypically highly similar cultures were chosen. These strains were subjected to comparative genome analyses, together with 18 strains previously analysed by clone-based microarrays (Molenaar et al., 2005).

Exploring genomic diversity of L. plantarum strains

To explore the genomic diversity within L. plantarum, the 42 selected strains were analysed by CGH typing using an oligomer-based DNA-microarray of L. plantarum WCFS1. These 42 strains were representative for distinct phenotypic clusters and were isolated from different fermented foods (meat, vegetable and dairy products).

DNA cohybridization of the tested L. plantarum strains with the control WCFS1 allowed detecting differences in fluorescence signals (high, low and normal) reflecting hybridization to different probes. Normal hybridization values (-1.0 < M < +1.0) indicate the presence of the probe target, demonstrating that this sequence occurs in both the query strain and reference strain WCFS1. Very low fluorescence values (M < -2.0) indicated the absence in the query strain of the region recognized by this probe. Higher hybridization signals (M > +2.0) indicated probe targets present in multiple copies in the query strain compared with strain WCFS1, which could be due either to more copies of that gene on the query strain's chromosome, or to the gene of interest being encoded by a high-copy-number plasmid in the query strain, while on the chromosome in the reference strain. Several regions for which higher hybridization intensities were observed encode IS-type elements, which may be present in greater numbers in the query strain (data not shown). Finally, a small number of probes gave intermediate low M-values (-2.0 < M < -1.0), which were, however, clearly above background levels. These low values indicate either a lower copy number of the probe target in the query strain, or poor hybridization due to sequence variation in the target sequence, rather than the complete absence of the target DNA. The presence of the lower signals was particularly apparent for strains NCIMB12120 and DKO22T, which can be anticipated as these strains belong to the ssp. argentoratensis (see below) (Fig. 2). Diagnostic sequencing of three genes (lp_0575, lp_0576, lp_1823) with intermediate hybridization signals from various query strains revealed that they are indeed present in these strains, but diverge by 15–25% in nucleotide sequence leading to poor hybridization (described in detail in Supporting information).
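The M-value ranges given above can be written as a small decision rule. The sketch below is illustrative only and is not the authors' scoring code: the function and label names are hypothetical, the handling of exact boundary values is an assumption, and the range +1.0 to +2.0, which the text does not describe, is returned as "unclassified".

```python
def classify_m_value(m: float) -> str:
    """Interpret a CGH hybridization value M for one probe, using the
    thresholds quoted in the text (boundary handling is an assumption)."""
    if m < -2.0:
        return "absent"                 # M < -2.0
    if m < -1.0:
        return "divergent_or_low_copy"  # -2.0 < M < -1.0
    if m < 1.0:
        return "present"                # -1.0 < M < +1.0
    if m <= 2.0:
        return "unclassified"           # +1.0 to +2.0: not described in the text
    return "multi_copy"                 # M > +2.0


# Invented M-values, for illustration only.
for value in (-2.5, -1.5, 0.0, 2.5):
    print(value, classify_m_value(value))
```

Keeping the intermediate class separate from "absent" mirrors the caution in the text that weak signals may reflect sequence divergence rather than gene loss.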


Fig. 2. Clustering of 42 L. plantarum strains based on the presence/absence of genes. The genes are ordered based on their location in the genome sequence of L. plantarum WCFS1 (horizontal axis). Black bars indicate ‘absent’ genes/regions. Subgroups of strains are separated by dotted lines.


Based on their hybridization profiles, the 42 L. plantarum strains were genotypically grouped into seven clusters (Fig. 2), of which group two represents four strains of the proposed subspecies argentoratensis (Bringel et al., 2005). Some pairs of strains demonstrated an extremely high similarity in terms of presence/absence of genes, in agreement with their close phenotypic clustering, i.e. the pair of sourdough strains H14 and H4, the pair of sausage strains NCTH19-1 and NCTH19-2, and the pair of plant strains LD2 and LD3. Considering the origin of these strains (similar food source and country of isolation), this may not be surprising. However, this is not always the case, as for example the sausage strains NCTH27 and CHEO3 appear to be genotypically highly similar (Fig. 2), but they do not cluster together phenotypically (Fig. S1, Supporting information). In general, the genotypic grouping into seven clusters (Fig. 2) shows limited correspondence to the phenotypic clustering (Fig. 1). This possibly relates to the fact that: (i) phenotypes were not measured for all of the 42 strains and (ii) genotyping does not take into account the strain-specific genes that are present in query strains, but are absent in the reference strain WCFS1.
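One simple way to quantify how similar two strains are in gene presence/absence is the fraction of reference genes on which their calls agree. The sketch below uses that simple-matching measure purely as an illustration; the source does not state which similarity measure underlies Fig. 2, and the strain names and profiles are invented toy data.

```python
from itertools import combinations


def matching_fraction(a, b):
    """Fraction of genes with the same presence (1) / absence (0) call."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def most_similar_pairs(profiles, top=3):
    """Return the `top` strain pairs, ranked by decreasing agreement."""
    scored = [
        (matching_fraction(profiles[s], profiles[t]), s, t)
        for s, t in combinations(sorted(profiles), 2)
    ]
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scored[:top]


# Invented toy profiles over six reference genes (not data from the study).
profiles = {
    "strain_A": [1, 1, 0, 1, 1, 0],
    "strain_B": [1, 1, 0, 1, 1, 0],
    "strain_C": [1, 0, 0, 1, 0, 1],
}
print(most_similar_pairs(profiles, top=1))
```

As point (ii) above notes, any such score only sees genes of the reference strain, so two strains may look identical here and still differ in genes that WCFS1 lacks.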

A chromosome wheel plot displays the variability in neighbouring genomic regions of the tested L. plantarum strains relative to L. plantarum WCFS1 (Fig. 3, ring 3). Six large variable regions were identified (ring 3, regions with yellow to red colour). These regions included genes involved in plantaricin biosynthesis (region i), prophages (ii and v), restriction-modification (iii), exopolysaccharide biosynthesis (iv) and oxidoreductases and PTS systems (vi). Therefore, the presence/absence of neighbouring genes within these regions appears to be quite variable, suggesting that they are hot spots for mosaic evolution (Omelchenko et al., 2003). The diversity of genomic make-up was largely in agreement with that observed previously (Molenaar et al., 2005) for a smaller subset of strains using less discriminative clone-based arrays (ring 1), but the present data are much less noisy (ring 2).

Conserved L. plantarum-specific genes

The CGH data allowed us to identify which genes are conserved or variable between strains (Fig. 4). Individual query strains are found to lack 267–597 (9–20%) of the genes that occur in the chromosome of reference strain WCFS1 (Table 1). On the other hand, a set of 2049 core genes (almost 70% of total genes analysed) was found to be conserved in all 42 selected L. plantarum strains, despite their heterogeneity in terms of physiological properties and isolation origin. Another 149 genes (5%) appeared to be lacking in only one strain, of which 95 genes were lacking only in strain NCIMB12120 (a subspecies argentoratensis strain), possibly due to lower sequence similarity and hence lower hybridization efficiency to the DNA array. Out of ~2800 genes of L. plantarum WCFS1 spotted on the microarray, ~2500 encode proteins with homologues in other LAB, as determined by LaCOG (Makarova et al., 2006; Makarova and Koonin, 2007) and BLASTP analysis. Of the remaining ~300 genes with no homologues in sequenced LAB, 121 genes were found to be present in all 42 L. plantarum strains (Fig. 4), suggesting that these 121 genes not only belong to the core genome of L. plantarum, but also distinguish it from other LAB. The large majority of these L. plantarum-specific genes encode non-conserved hypothetical proteins, or ORFans, meaning that no significant hit was found in any protein database. For a full list and details of these 121 genes, including predictions of horizontal gene transfer (Cortez et al., 2009a), see Table S1, Supporting information. Although many unique genes are smaller than average, meta-analysis of L. plantarum WCFS1 gene expression microarrays showed that at least 60% of these genes, including many small genes, are expressed clearly above background, which indicates that most represent functional genes (Table S1).

Fig. 3. Chromosome wheel of L. plantarum WCFS1 displaying absence of regions of the chromosome in other L. plantarum strains. The chromosome wheel was constructed using the Microbial Genome Viewer (Kerkhoven et al., 2004). Outer ring (ring 1): Relative conservation among 20 L. plantarum strains as determined by Molenaar and colleagues (2005). Peak height represents the number of strains in which specific chromosomal regions of strain WCFS1 were scored as absent in other strains of L. plantarum. Ring 2: Relative conservation as described for ring 1 based on the present study of 42 L. plantarum strains. Ring 3: Chromosome flexibility region for L. plantarum WCFS1 as determined by scoring the difference of absence/presence of neighbouring genes between different strains (moving average of 20 adjacent genes). Red indicates a high variance between adjacent genes in different strains, green indicates a low variance. Ring 4: G + C content centred around the median G + C content. Ring 5: Codon Adaptation Index (Molenaar et al., 2005). The six most variable regions i–vi are boxed (brown).
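The flexibility track described for ring 3 of Fig. 3 (differences in presence/absence between neighbouring genes, smoothed over 20 adjacent genes) can be sketched as follows. Only the window of 20 comes from the caption; the exact scoring is not given in the source, so the per-pair score used here (fraction of strains in which two adjacent genes differ in status) is an assumption, and the matrix is invented toy data.

```python
def neighbour_difference(matrix):
    """matrix[s][g] is 1 if gene g (in chromosomal order) is present in
    strain s. For each adjacent gene pair, return the fraction of strains
    in which the two genes differ in presence/absence."""
    n_strains = len(matrix)
    n_genes = len(matrix[0])
    return [
        sum(row[g] != row[g + 1] for row in matrix) / n_strains
        for g in range(n_genes - 1)
    ]


def moving_average(values, window=20):
    """Running mean over `window` consecutive values (empty if too short)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(values) < window:
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Invented toy matrix: 3 strains x 5 genes; a window of 2 keeps it readable.
toy = [
    [1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1],
]
print(moving_average(neighbour_difference(toy), window=2))
```

High values of the smoothed score would correspond to the yellow-to-red regions of ring 3, low values to the green ones.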

Table 2 lists a subset of these L. plantarum-specific core proteins, either predicted to be in a gene cluster, or with predicted extracellular location or with a predicted function. Three gene clusters have a reasonably defined function. The first cluster (genes lp_1087-lp_1092) encodes a tartrate transporter, two subunits of a tartrate dehydratase (EC 4.2.1.32), and a regulator of these genes. Metabolism of the organic acid tartrate by L. plantarum appears to be rather unique among LAB (Hegazi and Abo-Elnaga, 1980), and this phenotype and corresponding genes could be used as markers for identifying L. plantarum strains. L. plantarum-unique genes in the second cluster (lp_1377-lp_1385) are all involved in sulfate metabolism and transport. The third gene cluster encodes a unique ABC transporter that possibly exports a small protein (encoded in the same cluster by lp_1333) of unknown function. Five other conserved L. plantarum-specific gene clusters encode only proteins of unknown or general function (Table 2). Interestingly, 12 genes encode proteins of unknown function predicted to be located extracellularly in the cell envelope (Zhou et al., 2008); these proteins may function in L. plantarum-specific interactions with its environment.

Distinguishing between L. plantarum ssp. argentoratensis and ssp. plantarum

In our set of 42 strains, four were previously classified using phenotypic and molecular typing methods as belonging to the subspecies argentoratensis, i.e. strains DKO22T (the type strain), NCIMB12120, LP85-2 and SF2A35B (Bringel et al., 2005). This subspecies was earlier referred to as group GLp2 or GB (Bringel et al., 2001; Molenaar et al., 2005). These four argentoratensis strains appear to lack between 11% and 20% of the genes of strain WCFS1 (Table 1). This conclusion is tentative, as some genes may only appear to be absent due to poor hybridization. For example, the recA gene has only 93% nucleotide sequence identity between subspecies argentoratensis and subspecies plantarum strains (Bringel et al., 2001), and our sequencing of three genes from strain DKO22 showed more than 15% sequence diversity with strain WCFS1 (see Supporting information).

Fig. 4. Distribution of genes in 42 L. plantarum strains, determined by CGH, relative to 2956 genes of the reference genome of strain WCFS1. In a clockwise fashion, the number of genes is indicated that are present in 42 (all) strains, 41 strains, 40 strains, 39 strains, etc. until only in one strain (the reference strain WCFS1). The 2049 core genes, present in all 42 strains, are divided into those also found in other LAB (1928 genes, purple) and those with no homologues in other LAB species (121 genes, light blue).

[Fig. 4 chart labels: sectors for 42 strains, 41 strains, 40, 39, ... down to 1 strain (only in WCFS1); gene counts shown: 1928 genes (in 42 strains, also in other LAB), 121 genes (in 42 strains, L. plantarum specific), 149 genes, 54 genes, 59 genes.]
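The distribution summarised in Fig. 4 can be derived from per-strain presence calls by counting, for every reference gene, the number of strains that carry it. The sketch below is a minimal illustration with invented strain and gene names; it is not the authors' analysis code.

```python
from collections import Counter


def strains_per_gene(genes_by_strain, reference_genes):
    """For each reference gene, count the strains in which it is present."""
    return {
        gene: sum(gene in present for present in genes_by_strain.values())
        for gene in reference_genes
    }


def core_genes(genes_by_strain, reference_genes):
    """Genes present in every strain analysed."""
    counts = strains_per_gene(genes_by_strain, reference_genes)
    return {gene for gene, n in counts.items() if n == len(genes_by_strain)}


# Invented toy data: the reference strain carries every reference gene.
reference_genes = ["g1", "g2", "g3", "g4"]
genes_by_strain = {
    "reference": {"g1", "g2", "g3", "g4"},
    "query_1": {"g1", "g2", "g3"},
    "query_2": {"g1", "g2"},
}

counts = strains_per_gene(genes_by_strain, reference_genes)
print(sorted(core_genes(genes_by_strain, reference_genes)))  # ['g1', 'g2']
print(Counter(counts.values()))  # number of genes found in 3, 2 and 1 strains
```

Genes with a count of 1 are those seen only in the reference strain, corresponding to the "only in WCFS1" sector of the figure.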


However, based on our analysis of differences in total gene content, these four ssp. argentoratensis strains are grouped together in the dendrogram of Fig. 2 because they commonly lack seven gene clusters/operons that are present in all other 38 strains tested (Table 3). These missing clusters encode various transporters and two putative extracellular enzyme complexes (Boekhorst et al., 2006; Siezen et al., 2006) that could be involved in carbohydrate utilization. The absence of these genes could be used as an additional genetic marker for the subspecies argentoratensis, although we realize that using the absence of markers is not very reliable.
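The criterion used above (genes lacking in every argentoratensis strain but present in all other strains) is a straightforward set comparison. A minimal sketch with invented strain and gene names:

```python
def subgroup_absent_markers(genes_by_strain, subgroup, reference_genes):
    """Reference genes absent from every strain in `subgroup` and present
    in every strain outside it."""
    others = set(genes_by_strain) - set(subgroup)
    markers = set()
    for gene in reference_genes:
        absent_in_subgroup = all(gene not in genes_by_strain[s] for s in subgroup)
        present_in_others = all(gene in genes_by_strain[s] for s in others)
        if absent_in_subgroup and present_in_others:
            markers.add(gene)
    return markers


# Invented toy data (not from the study).
toy_strains = {
    "arg_1": {"gene_a"},
    "arg_2": {"gene_a", "gene_c"},
    "other_1": {"gene_a", "gene_b", "gene_c"},
    "other_2": {"gene_a", "gene_b"},
}
print(subgroup_absent_markers(toy_strains, ["arg_1", "arg_2"],
                              ["gene_a", "gene_b", "gene_c"]))  # {'gene_b'}
```

In line with the caveat in the text, a gene flagged this way may still be present but too divergent to hybridize, so such markers would need confirmation by sequencing.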

Genes found only in reference strain L. plantarum WCFS1

More than 50 genes were found to be unique for L. plantarum strain WCFS1, as they were absent in all other 41 strains analysed by CGH. Ten are putative phage proteins, while 40 are organized in large, continuous gene clusters that appear to have been acquired recently, as their GC% deviates strongly from the 44.5% average GC content of strain WCFS1 (full details can be found in Table S2, Supporting information). Three large clusters are functionally most interesting, and these will be described in more detail (Table 4). One is a cluster of

Table 1. L. plantarum strains selected for genotyping, sorted by their differences in number of ORFs absent relative to strain WCFS1.

| Strain # | Received as | Isolation source | Geographical origin | Number of ORFs absent (a) | % of ORFs absent |
| --- | --- | --- | --- | --- | --- |
| NIZO1836 | WCFS1 | Human saliva | England | 0 | 0 |
| NIZO2263 | LP80 | Silage | n.a. | 267 | 9.0 |
| NIZO2814 | Lp95 | Wine red grapes | Italy | 283 | 9.6 |
| CIP102359 | CIP102359 | Human spinal fluid | France | 287 | 9.7 |
| NIZO2726 | ATCC8014 | Maize ensilage | n.a. | 288 | 9.8 |
| NIZO2891 | LD3 | Radish pickled | Vietnam | 288 | 9.7 |
| NIZO2457 | CHEO3 | Pork pickled sour sausage | Vietnam | 289 | 9.8 |
| NIZO2535 | LD2 | Orange fermented | Vietnam | 289 | 9.8 |
| NIZO2830 | BLL(EI31) | n.a. | n.a. | 289 | 9.8 |
| NIZO2259 | CIP104452 | Human tooth abscess | France | 290 | 9.8 |
| NIZO2831 | CECT221(24Ab04) | Grass silage | United States | 290 | 9.8 |
| NIZO2262 | LM3 | Silage | n.a. | 293 | 9.9 |
| NIZO2494 | NCTH27 | Pork pickled sour sausage | Vietnam | 293 | 9.9 |
| NCDO1193 | NCDO1193 | Vegetables | n.a. | 293 | 9.9 |
| NIZO2806 | LMG9208 | Sauerkraut | United Kingdom | 296 | 10.0 |
| NIZO2896 | ATCC14917 (b) | Cabbage pickled | Denmark | 298 | 10.1 |
| NIZO2741 | NOS140 | Cabbage kimchi | Japan | 299 | 10.1 |
| NIZO1837 | 299 | Human colon | United Kingdom | 300 | 10.1 |
| NIZO2855 | N58 | Pork pickled sour sausage | Vietnam | 301 | 10.2 |
| NIZO2877 | X17 | Hot dog | Vietnam | 302 | 10.2 |
| NIZO2260 | 299v/DSM9843 | Human intestine | United Kingdom | 305 | 10.3 |
| NIZO2029 | MLC43 | Raw cheese with rennet | Italy | 310 | 10.5 |
| NIZO2889 | LAC7 | Banana fermented | Vietnam | 313 | 10.6 |
| NIZO2264 | LP85-2 (c) | Silage | France | 318 | 10.8 |
| NIZO2484 | NCTH19-1 | Pork pickled sour sausage | Vietnam | 323 | 10.9 |
| NIZO2485 | NCTH19-2 | Pork pickled sour sausage | Vietnam | 324 | 11.0 |
| NIZO2261 | NC8 | Grass silage | Sweden | 327 | 11.1 |
| NIZO2802 | KOG24 | Cheese | Japan | 332 | 11.2 |
| NIZO2801 | KOG18 | Turnip pickled | Japan | 348 | 11.8 |
| NIZO3400 | LMG18021 | Milk | Senegal | 393 | 13.3 |
| NIZO2753 | Q2 | Sourdough fermented | Italy | 407 | 13.8 |
| NIZO1839 | SF2A35B (c) | Sour cassava | South America | 412 | 13.9 |
| NIZO2258 | CIP104451 | Human urine | France | 435 | 14.7 |
| NIZO2257 | CIP104450 | Human stool | France | 436 | 14.7 |
| CIP104448 | CIP104448 | Human stool | France | 436 | 14.7 |
| NIZO2897 | DKO22 (c) | Sour cassava | Nigeria | 440 | 14.9 |
| NIZO2766 | H14 | Sourdough fermented | Italy | 466 | 15.8 |
| NIZO2757 | H4 | Sourdough fermented | Italy | 470 | 15.9 |
| NIZO2776 | CECT4645 | Cheese | n.a. | 491 | 16.6 |
| NIZO2256 | CIP104441 | Human stool | France | 541 | 18.3 |
| NIZO1838 | CIP104440 | Human stool | France | 542 | 18.3 |
| NIZO1840 | NCIMB12120 (c) | Cereal fermented (Ogi) | Nigeria | 597 | 20.2 |

a. ORFs absent relative to 2956 analysed ORFs of strain WCFS1.
b. Draft genome sequence available April 2009 (NZ_ACGZ00000000.1).
c. Putative subspecies argentoratensis.
Strains in bold were also compared in Molenaar and colleagues (2005). The other strains were new in this study.
n.a., not available.
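The last column of Table 1 follows from the ORF counts and the 2956 analysed ORFs given in footnote a. As a small worked check, the sketch below recomputes the percentage for three rows of the table; the function name is hypothetical, and rounding to one decimal is an assumption inferred from the tabulated values.

```python
TOTAL_ORFS = 2956  # analysed ORFs of strain WCFS1 (Table 1, footnote a)


def percent_absent(n_absent, total=TOTAL_ORFS):
    """Percentage of reference ORFs scored absent, rounded to one decimal."""
    return round(100.0 * n_absent / total, 1)


# Numbers of absent ORFs copied from Table 1.
examples = {"NIZO2263": 267, "NIZO2806": 296, "NIZO1840": 597}
percentages = {strain: percent_absent(n) for strain, n in examples.items()}
```

For these three strains the recomputed values agree with the tabulated 9.0, 10.0 and 20.2%.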

Table 2. Selected L. plantarum-specific genes and gene clusters, conserved in all tested L. plantarum strains, but not present in other LAB.

| ORF (a) | Product | Length | Best BLASTP hit (b) | Hit product | E-score (c) |
| --- | --- | --- | --- | --- | --- |
| Gene clusters | | | | | |
| lp_0154 | Transcription regulator, PadR family | 109 | Dethiobacter alkaliphilus AHT 1 | Transcription regulator, PadR family | 4.00E-13 |
| lp_0155 | Hypothetical protein | 261 | No hit | | |
| lp_0156 | Hypothetical protein | 275 | No hit | | |
| lp_0554 | Hypothetical protein | 152 | No hit | | |
| lp_0555 | Hypothetical protein | 174 | Streptococcus mutans UA159 | Putative acetate kinase | 2.00E-11 |
| lp_0556 | Hypothetical protein | 111 | Streptococcus pyogenes MGAS6180 | Hypothetical protein | 1.00E-31 |
| lp_1087 | Tartrate transport protein | 473 | Lactobacillus salivarius ATCC11741 (d) | DASS family divalent anion:sodium (Na+) symporter | 7.00E-173 |
| lp_1089 | L(+)-tartrate dehydratase, subunit B | 203 | Clostridium nexile DSM 1787 | Hypothetical protein | 1.00E-82 |
| lp_1090 | L(+)-tartrate dehydratase, subunit A | 307 | Eubacterium hallii DSM 3353 | Hypothetical protein | 5.00E-113 |
| lp_1092 | Transcription regulator, AraC family | 340 | Veillonella parvula DSM 2008 | DNA-binding domain-containing protein, AraC-type | 5.00E-35 |
| lp_1333 | Hypothetical protein | 55 | Oceanobacillus iheyensis HTE831 | Hypothetical protein | 1.00E-04 |
| lp_1334 | ABC transporter, permease protein | 245 | Oceanobacillus iheyensis HTE831 | Hypothetical protein | 6.00E-44 |
| lp_1335 | ABC transporter, ATP-binding protein | 285 | Oceanobacillus iheyensis HTE831 | ABC transporter ATP-binding protein | 7.00E-95 |
| lp_1336 | Bifunctional protein: ABC transporter, ATP-binding protein; LytR family transcriptional regulator | 327 | Oceanobacillus iheyensis HTE831 | ABC transporter ATP-binding protein | 2.00E-69 |
| lp_1369 | Lipoprotein precursor | 141 | No hit | | |
| lp_1370 | Hypothetical protein | 39 | No hit | | |
| lp_1371 | Extracellular protein, membrane-anchored | 356 | No hit | | |
| lp_1377 | Hypothetical protein | 545 | Alphaproteobacterium BAL199 | Glycosyl transferase, group 2 family protein | 5.00E-26 |
| lp_1378 | Sulfate adenylyltransferase | 391 | Bacillus coagulans 36D1 | Sulfate adenylyltransferase | 2.00E-158 |
| lp_1379 | Adenylylsulfate kinase | 207 | Bacillus coagulans 36D1 | Adenylylsulfate kinase | 4.00E-73 |
| lp_1385 | Sodium/sulfate symporter | 591 | Oceanobacillus iheyensis HTE831 | Sulfur deprivation response regulator | 4.00E-108 |
| lp_2864 | Integral membrane protein | 155 | No hit | | |
| lp_2865 | Acetyltransferase, GNAT family | 165 | Bacillus pumilus SAFR-032 | GNAT family acetyltransferase | 1.00E-29 |
| lp_2866 | Integral membrane protein | 83 | No hit | | |
| lp_2867 | Unknown | 101 | No hit | | |
| lp_3350 | Hypothetical protein | 121 | No hit | | |
| lp_3351 | Hypothetical protein | 168 | Streptococcus gordonii str. Challis substr. CH1 | Hypothetical protein | 5.00E-08 |
| lp_3353 | Hypothetical protein | 118 | No hit | | |
| Cell envelope/extracellular | | | | | |
| lp_0141 | Extracellular protein | 270 | No hit | | |
| lp_0473 | Lipoprotein precursor | 328 | No hit | | |
| lp_1357 | Extracellular protein, membrane-anchored | 112 | No hit | | |
| lp_1447 | Cell surface protein precursor | 123 | No hit | | |
| lp_2101 | Polysaccharide polymerase | 424 | No hit | | |
| lp_2636 | Extracellular protein | 223 | No hit | | |
| lp_2934 | Lipoprotein precursor | 157 | No hit | | |
| lp_2976 | Cell surface protein precursor | 125 | No hit | | |
| lp_3275 | Extracellular protein, membrane-anchored | 160 | No hit | | |
| lp_3454 | Cell surface protein | 92 | No hit | | |
| Other | | | | | |
| lp_0502 | Serine transporter | 426 | Proteus penneri ATCC 35198 | Hypothetical protein | 1.00E-136 |
| lp_0507 | DNA-binding protein | 242 | Corynebacterium glucuronalyticum ATCC 51867 | Transcriptional regulator | 2.00E-43 |
| lp_1354a | Peptide pheromone precursor | 58 | No hit | | |
| lp_1682 | Phosphopantetheinyltransferase | 183 | Bacillus cereus AH1134 | 4′-Phosphopantetheinyl transferase | 6.00E-14 |
| lp_3458 | Transcription regulator, TetR family | 195 | Acidothermus cellulolyticus 11B | TetR family transcriptional regulator | 7.00E-19 |

a. Gene numbering as in reference strain WCFS1.
b. Excluding hits to L. plantarum.
c. E-score cut-off 0.05.
d. Best hits are in LAB, but these are malate transporters, and not in same operon with tartrate dehydratase. Therefore, these are not orthologues, but homologues.
A full list of genes can be found in Table S1.

genes (lp_1177-lp_1187) for biosynthesis of an unknown extracellular polysaccharide. Most of these genes have homologues in other LAB, but the gene order and gene cluster composition in L. plantarum WCFS1 are quite different from those in other bacteria, suggesting that this gene cluster is subject to mosaic evolution (Omelchenko et al., 2003) as also observed for polysaccharide gene clusters in streptococci (Bourgoin et al., 1999; Broadbent et al., 2003; Rasmussen et al., 2008). The gene cluster lp_0578-lp_0584 for non-ribosomal peptide synthesis is also unique for strain WCFS1; it has best homologues in bacilli, but again the gene size, order and content are quite different from those in bacilli. Finally, the largest unique gene cluster (17 genes) of L. plantarum WCFS1 appears to be a hybrid set, encoding proteins for alternative lysine biosynthesis via the amino-adipate pathway, not found in any other LAB, and also proteins for deoxyribose metabolism. We hypothesize that together they may be responsible for biosynthesis of a novel macrolide antibiotic, as there is also a macrolide efflux protein encoded in the gene cluster. Moreover, seven genes of this ‘macrolide’ cluster are best homologues and syntenous (Table 4) with a secondary metabolite biosynthesis gene cluster from Streptomyces pristinaespiralis ATCC 25486 (Fischbach et al., 2008), a known producer of antibiotics (Sezonov et al., 1997). Macrolides are a group of antibiotics, belonging to the polyketide class of natural products, whose activity stems from the presence of a large macrocyclic lactone ring to which one or more deoxy sugars, usually cladinose and desosamine, may be attached.

Discussion

In this study, we evaluated L. plantarum diversity using both phenotypic and genotypic approaches. An extensive collection of strains was studied, isolated mainly from different fermented foods but also from human feces, and obtained from different culture collections and different geographical origins. This heterogeneous set of strains allowed us to obtain a broader view of the species and its natural biodiversity. We found a remarkable strain-to-strain diversity in phenotypes, which may be a result of the high adaptation capacity of the species to diverse natural environments (Bringel et al., 2005; Molenaar et al., 2005). This phenotypic diversity allowed us to select 24 phenotypic tests that could be used to discriminate between the L. plantarum strains used in this study.

The clusters defined by phenotypic profiles were the source for selection of a set of 42 diverse candidates for further genomic characterization with CGH. In the current study, an open-reading frame (ORF)-based oligonucleotide microarray was used for CGH instead of the clone-based microarray used previously (Molenaar et al., 2005). This allowed us to obtain a much higher resolution view of the presence/absence in different strains of each probe target and ORF annotated within the WCFS1 genome (Mukherjee et al., 2006). However, it is important to note that the applied approach could not evaluate the complete genome content of the different L. plantarum strains, as hybridization profiles allowed monitoring of the presence of only the corresponding genes found in WCFS1. Thus, the estimated diversity of these strains relates to the genes detected by the microarray rather than to their complete genomes.

The core genome of L. plantarum, being present in 42 or 41 strains (excluding strain NCIMB12120), was found to comprise about 2050 or 2200 genes (69–74% of total) respectively. As expected, genes with crucial cellular roles, such as those involved in DNA replication, and in central metabolism (i.e. glycolytic enzymes), appeared to be highly conserved in all analysed strains. About 120 fully conserved genes were found to be unique for L. plantarum, as they have no homologues in other sequenced LAB genomes. Hence, these could be used as marker genes for the species L. plantarum. The fact that nearly all these L. plantarum-unique genes have a GC content very similar to the average of the whole genome (44.5% GC) (see Table S1) indicates that these genes were not acquired recently but have been part of the L. plantarum core genome from far back in evolution. Many L. plantarum-unique genes encode hypothetical proteins of unknown

Table 3. Genes absent in all four L. plantarum ssp. argentoratensis strains DKO22, NCIMB12120, LP85-2, SF2A35B relative to reference strain WCFS1.

| Absent gene (a) | Function |
| --- | --- |
| lp_1341 | Transcription regulator, MerR family |
| lp_1342 | Unknown |
| lp_1343 | Transport protein |
| lp_1668 | Short-chain dehydrogenase/oxidoreductase |
| lp_1669 | Transcription regulator, AraC family |
| lp_1919 | Cadmium-/manganese-transporting P-type ATPase |
| lp_1920 | Enolase (phosphopyruvate hydratase) |
| lp_1954 | ABC transporter, ATP-binding protein |
| lp_1955 | ABC transporter, permease protein |
| lp_1956 | ABC transporter, permease protein |
| lp_1957 | ABC transporter, permease protein |
| lp_1958 | Acetoin ABC transporter, ATP-binding protein |
| lp_1959 | Acetoin transport repressor, GntR family |
| lp_3193 | Peptidylprolyl isomerase |
| lp_3219 | Sucrose PTS, EIIBCA |
| lp_3220 | Maltase-sucrase, probably sucrose-6-P hydrolase |
| lp_3221 | Transcription regulator, LacI family, sucrose-related |
| lp_3412 | Extracellular protein |
| lp_3413 | Cell surface protein precursor, DUF916 family |
| lp_3414 | Extracellular protein |
| lp_3676 | Extracellular protein |
| lp_3677 | Cell surface protein precursor |
| lp_3678 | Cell surface protein precursor, DUF916 family |
| lp_3679 | Extracellular protein |

a. Absent relative to the listed gene in reference strain L. plantarum WCFS1.

Table 4. Genes found only in L. plantarum strain WCFS1 and not in other tested L. plantarum strains.

| ORF | Gene | Size | Product | GC% | Best BLASTP hit | E-score |
| --- | --- | --- | --- | --- | --- | --- |
| Macrolide biosynthesis (lysine biosynthesis and deoxyribose metabolism) | | | | | | |
| lp_0482 | | 289 | Transcription regulator | 32.3 | Enterococcus faecium TX1330 | 4.00E-31 |
| lp_0483 | | 56 | Metal-binding protein | 40.5 | Streptomyces pristinaespiralis ATCC 25486 (a) | 7.00E-03 |
| lp_0484 | | 289 | ATP-dependent carboxylate-amine/thiol ligase family protein | 35.1 | Streptomyces pristinaespiralis ATCC 25486 (a) | 3.00E-42 |
| lp_0485 | | 209 | Hypothetical protein | 37.1 | Streptomyces pristinaespiralis ATCC 25486 (a) | 1.00E-41 |
| lp_0486 | | 439 | Pyridoxal-phosphate-dependent aminotransferase | 38.7 | Streptomyces pristinaespiralis ATCC 25486 (a) | 3.00E-124 |
| lp_0487 | argC1 | 343 | N-acetyl-gamma-glutamyl-phosphate reductase | 36.1 | Streptomyces pristinaespiralis ATCC 25486 (a) | 4.00E-68 |
| lp_0488 | | 612 | Bifunctional protein: acetyl-glutamate kinase; acetyl-ornithine deacetylase (putative) | 36.4 | Streptomyces pristinaespiralis ATCC 25486 (a); Thermus thermophilus HB8 | 3.00E-70; 4.00E-14 |
| lp_0489 | tkt1A | 275 | Transketolase, thiamine diphosphate binding domain | 37.1 | Streptomyces pristinaespiralis ATCC 25486 (a) | 5.00E-60 |
| lp_0490 | | 184 | Hypothetical protein | 33.9 | Bacillus coagulans 36D1 | 5.00E-27 |
| lp_0491 | tkt1B | 305 | Transketolase, pyrimidine-binding domain | 38.1 | Geobacter uraniireducens Rf4 | 2.00E-29 |
| lp_0492 | | 403 | Transport protein, MFS superfamily | 36.6 | Clostridium botulinum E3 str. Alaska E43 | 1.00E-84 |
| lp_0493 | | 413 | Hypothetical protein | 37.0 | Clostridium botulinum E3 str. Alaska E43 | 1.00E-108 |
| lp_0495 | deoR | 316 | Deoxyribonucleoside regulator | 43.0 | Lactobacillus fermentum IFO 3956 | 4.00E-82 |
| lp_0497 | deoC | 215 | Deoxyribose-phosphate aldolase | 46.2 | Lactobacillus fermentum IFO 3956 | 1.00E-75 |
| lp_0498 | deoP | 454 | Deoxyribose transporter, MFS superfamily | 44.0 | Actinomyces coleocanis DSM 15436 | 3.00E-142 |
| lp_0499 | deoM | 338 | Deoxyribose mutarotase | 42.6 | Lactobacillus fermentum IFO 3956 | 5.00E-129 |
| lp_0500 | deoK | 307 | Deoxyribokinase | 44.2 | Lactobacillus fermentum IFO 3956 | 6.00E-99 |
| NRPS gene cluster | | | | | | |
| lp_0578 | npsA | 5289 | Non-ribosomal peptide synthetase NpsA | 35.9 | Bacillus thuringiensis BGSC 4BD1 | 0 |
| lp_0579 | panD | 130 | Aspartate 1-decarboxylase | 41.0 | Coprococcus comes ATCC 27758 | 2.00E-47 |
| lp_0580 | | 330 | Metal-dependent phosphohydrolase, HD family | 37.5 | Bacillus cereus Rock4-2 | 2.00E-32 |
| lp_0581 | npsB | 805 | Non-ribosomal peptide synthetase NpsB | 36.1 | Elusimicrobium minutum Pei191 | 2.00E-88 |
| lp_0582 | npsC | 214 | 4′-phosphopantetheinyl transferase | 33.6 | Geobacillus sp. Y412MC10 | 2.00E-09 |
| lp_0583 | | 127 | Hypothetical protein | 35.7 | Bacillus coagulans 36D1 | 8.00E-49 |
| lp_0584 | | 413 | Transport protein | 35.7 | Lactobacillus rhamnosus HN001 | 1.00E-86 |
| Polysaccharide biosynthesis gene cluster | | | | | | |
| lp_1177 | cps1A | 330 | Polysaccharide biosynthesis protein | 33.1 | Bacteroides capillosus ATCC 29799 | 6.00E-48 |
| lp_1178 | cps1B | 353 | Glycosyltransferase | 36.0 | Leuconostoc citreum KM20 | 6.00E-63 |
| lp_1179 | cps1C | 475 | Repeat unit transporter | 33.8 | Streptococcus thermophilus | 2.00E-128 |
| lp_1180 | cps1D | 258 | Glycosyltransferase | 34.2 | Lactobacillus fermentum IFO 3956 | 1.00E-78 |
| lp_1181 | cps1E | 334 | Acyltransferase/acetyltransferase | 34.4 | Vibrio vulnificus CMCP6 | 3.00E-33 |
| lp_1182 | cps1F | 207 | Polysaccharide biosynthesis, chain length regulator | 35.9 | Lactobacillus fermentum ATCC 14931 | 1.00E-06 |
| lp_1183 | cps1G | 376 | Glycosyltransferase | 36.0 | Lactobacillus rhamnosus | 9.00E-59 |
| lp_1184 | cps1H | 308 | Glycosyltransferase (rhamnosyltransferase) | 34.4 | Lactobacillus reuteri 100-23 | 1.00E-54 |
| lp_1185 | cps1I | 388 | Polysaccharide polymerase | 35.0 | Streptococcus thermophilus | 2.00E-14 |
| lp_1186 | rfbA | 287 | Glucose-1-phosphate thymidylyltransferase | 37.5 | Enterococcus faecium DO | 4.00E-118 |
| lp_1187 | | 429 | Glycosylhydrolase | 36.2 | Thermobaculum terrenum ATCC BAA-798 | 2.00E-13 |

a. Located on Streptomyces pristinaespiralis ATCC 25486 supercontig 1.37, NCBI GenBank accession DS570818 ABJI01000000.
NRPS, non-ribosomal peptide synthesis.
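The argument that the WCFS1-specific clusters were acquired recently rests on their GC% deviating from the 44.5% genome average. The sketch below illustrates that comparison for a few GC% values copied from Table 4. The 5 percentage-point threshold and the function names are assumptions made for illustration only; the text does not state a numerical cut-off for "deviates strongly".

```python
GENOME_GC = 44.5  # average GC% of strain WCFS1, as stated in the text


def gc_percent(seq):
    """GC content (%) of a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def deviates(gc, genome_gc=GENOME_GC, threshold=5.0):
    """True if gc differs from the genome average by more than threshold.

    The threshold (in percentage points) is a hypothetical choice.
    """
    return abs(gc - genome_gc) > threshold


# GC% values taken from Table 4.
table4_gc = {"lp_0482": 32.3, "lp_0497": 46.2, "lp_1177": 33.1}
flagged = [gene for gene, gc in table4_gc.items() if deviates(gc)]
```

With this illustrative threshold, lp_0482 and lp_1177 are flagged, whereas lp_0497 from the deoxyribose part of the cluster lies close to the genome average.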

function, but a few encode known functions that could be used for phenotypic typing. The main candidate is a fully conserved gene cluster for tartrate uptake and metabolism, including a tartrate dehydratase that catalyses the conversion of tartrate to oxaloacetate. In an earlier study, L. plantarum was found to be the only lactic acid bacterium able to utilize tartrate and produce CO2 gas (Hegazi and Abo-Elnaga, 1980). This unique ability to metabolize tartrate may be related to the high concentration of tartrate in some plants, notably grapes. Another gene cluster unique for L. plantarum appears to be involved in sulfur transport and metabolism: two genes (lp_1378, lp_1379) make up two-thirds of the pathway converting sulfate into sulfite. In Bacillus, the third enzyme of this pathway, a phosphoadenosine phosphosulfate reductase, is encoded in the same gene cluster with sulfate adenylyltransferase and adenylylsulfate kinase, but no orthologue for this gene was found in the L. plantarum genome. We speculate that these sulfate-converting enzymes, together with a phospho/sulfo-esterase (encoded by lp_1380) and an extracellular arylsulfate sulfo-transferase (lp_1381), may be involved in utilization of sulfate from sulfated polysaccharides found in plants (Hooper et al., 1996; Honke and Taniguchi, 2002).

The recent description of an interspecies taxonomic rank, L. plantarum ssp. argentoratensis, of which DKO22T is the type strain (Bringel et al., 2005), is supported by our whole genome content analysis. Four strains of this subspecies were found to lack the same subset of 24 genes, organized in seven gene clusters/operons, encoding various transporters and putative extracellular enzyme complexes (Boekhorst et al., 2006; Siezen et al., 2006). Quite possibly, these missing genes could be related to the discriminative phenotypes such as lack of growth on melezitose or methyl α-D-mannoside (Bringel et al., 2001; 2005).

The reference genome itself, L. plantarum WCFS1, was found to harbour over 50 genes that were not found in any of the selected 41 strains isolated from diverse environments. More specifically, three gene clusters are unique to this strain and were presumably acquired recently in evolution as their GC content deviates considerably from the genome average. All three clusters encode functions dealing with interactions with the environment, i.e. biosynthesis of an exopolysaccharide, a putative macrolide and a non-ribosomal synthesized hybrid peptide-polyketide, suggesting that these are recent adaptations, possibly for survival in a specific niche. Moreover, each of the 41 strains analysed lacked many more genes relative to WCFS1, up to 20% of the genome (Table 1). Many of these non-conserved genes are located in discrete regions of the WCFS1 genome, the ‘life-style adaptation’ regions as previously described (Molenaar et al., 2005). These observations raise the question: how many unique or shared gene clusters will be present in each of the other 41 strains that are not present in our WCFS1 reference genome? It appears that only full-genome sequencing and comparative genomics of many L. plantarum strains with distant genotypes and phenotypes will give us insight into the size and complexity of its pan-genome. Such studies are currently ongoing (R.J. Siezen, B. Renckens, L. Axelson, M. Kleerebezem and S.A.F.T. van Hijum, unpubl. results), and preliminary comparison of four sequenced genomes suggests that each strain has 250–350 genes that are not present in strain WCFS1, of which many are shared by some strains. These genes represent, e.g. prophages, polysaccharide biosynthesis clusters, sugar metabolism clusters and plasmids.

Our observation that some strains showing similar genotype profiles also possessed quite comparable physiological properties suggests that it may be possible to match phenotypic traits to specific genes, i.e. find correlations between gene content and functional properties. However, gene-trait matching is not straightforward as, for instance, many genes encode proteins of yet unknown function, genes can be inactivated or differentially expressed, and phenotypic test results can often be ambiguous. On the other hand, our extensive data set is the obvious starting point for further research to investigate gene-trait matching in L. plantarum strains and to move further in the genome annotation procedure. In this sense, the genes need to be seen in their genome and biological context and, in particular, the context of the cellular metabolic pathways (Teusink et al., 2005). Therefore, novel software and innovative bioinformatics tools, such as Random Forest methods, are currently being used to investigate gene-trait matching and to evaluate these data from a functional perspective (J. Bayjanov, D. Molenaar, S.A.F.T. van Hijum and R.J. Siezen, unpubl. results).

Experimental procedures

Strain collection

The 185 L. plantarum strains analysed in this study were isolated from fermented European and non-European foods and the human body. Their isolation source, geographical origin and references are listed in the legend to Fig. S1 (Supporting information). The strains can be divided in two groups: (i) 127 L. plantarum strains previously described and obtained from different culture collections and (ii) 58 strains from different Vietnamese fermented foods, isolated and identified in this study. The strains selected for genotyping are listed in Table 1. L. plantarum strain WCFS1 was included as the reference strain (Kleerebezem et al., 2003).

Phenotypic characterization

Strains were grown in De Man, Rogosa and Sharpe (MRS) broth (OXOID, the Netherlands) at 30°C using 300 µl 96-well microtiter plates (Cellstar, Germany). Overnight cultures were 10-fold diluted in peptone-saline solution (8.5 g l⁻¹ NaCl and 1.0 g l⁻¹ peptone) and 25 µl of these diluted cultures was inoculated in 225 µl glucose-free MRS broth, prepared as described below for each conducted test, and incubated at 30°C for 48 h. In each plate, L. plantarum WCFS1 was included as reference strain and non-inoculated wells were included as negative controls. All growth tests were carried out in quadruplicate, inoculating each strain in different positions of the microtiter plate in order to avoid false-positive or false-negative results due to cross-contamination and/or positional effects.

Phenotypic biodiversity was first investigated on a limited set (54) of isolates by performing 83 different growth tests. These assays included utilization of carbohydrates, tolerance to NaCl and nisin, growth at different temperatures, growth in milk and in milk supplemented with yeast extract (for details see the legend of Fig. S1). Carbohydrate fermentation ability was evaluated in glucose-free MRS broth supplemented with different carbon sources at final concentration of 1% (w/v) and then sterilized by filtering. In total 46 monosaccharides and 18 di-, tri- and poly-saccharides were tested as carbon sources. The diversity of the strains and their relationship to the analysed physiological properties was evaluated using Principal Component Analysis with the BioNumerics package version 4.50 (Applied Maths, Sint-Martens-Latem, Belgium). The correlation between phenotypic properties and source of isolation and/or geographical origin was verified by multivariate ANOVA using BioNumerics software.

Integrating these two analyses, the 83 physiological assays were ordered based on their discriminative power. Twenty-four tests were selected as most discriminative for further phenotypic characterization of all 185 strains. The selected assays included fermentation ability on 18 different carbohydrates (for details see the legend of Fig. S1), growth at 45°C in MRS and their ability to tolerate the presence of NaCl (6%) or nisin (1000 ng ml⁻¹). In particular, the catabolism of dulcitol (= galactitol), melezitose, α-methyl-D-glucopyranoside and α-methyl-D-mannopyranoside were previously described to be informative assays for distinguishing heterogeneous groups of L. plantarum strains (Bringel et al., 2001; 2005; Molenaar et al., 2005). The strain-to-strain variation of nisin or salt tolerance and polysaccharides fermentation has not been previously considered for distinguishing L. plantarum strains. Strain growth in MRS with or without glucose, at 30°C, was also verified as positive and negative controls respectively.

After 48 h of incubation, bacterial growth in each medium was scored by measuring optical density at 600 nm (OD600) using the Microplate Spectrophotometer SPECTRAmax Plus (Molecular Devices Corporation, California, USA). The optical density was determined for each strain (calculated as the average of four replicates) grown in each condition and expressed as a value (%) relative to the final OD600 reached in MRS broth at 30°C, which is considered the optimal growth condition (100%). The percentage value (x) was then converted into integer values, from 0 to 5, as follows: if x < 10% then 0, if 10% < x < 30% then 1, if 30% < x < 50% then 2, if 50% < x < 70% then 3, if 70% < x < 90% then 4, if x > 90% then 5.
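The scoring scheme above can be sketched as a short function. This is only an illustration under stated assumptions: the function names are hypothetical, and because the text leaves values falling exactly on a class boundary (10%, 30%, 50%, 70%, 90%) unassigned, the sketch assigns them to the higher class.

```python
from statistics import mean

# Class boundaries (% of the final OD600 reached in MRS broth at 30 °C).
THRESHOLDS = (10, 30, 50, 70, 90)


def relative_growth(od_replicates, od_reference):
    """Mean OD600 of the replicates as a percentage of the reference OD600."""
    if od_reference <= 0:
        raise ValueError("reference OD600 must be positive")
    return 100.0 * mean(od_replicates) / od_reference


def growth_score(x):
    """Convert a relative growth percentage x into an integer score 0-5.

    Assumption: a value exactly on a boundary goes to the higher class,
    since the text does not specify this case.
    """
    return sum(x >= t for t in THRESHOLDS)


# Hypothetical OD600 readings of four replicates and a reference culture.
x = relative_growth([0.58, 0.62, 0.60, 0.60], od_reference=1.0)
score = growth_score(x)
```

Here the four replicates average about 60% of the reference growth, which falls in the 50–70% class and yields a score of 3.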

Genomic DNA preparation

For sequencing and PCR experiments, genomic DNA was extracted from 2 ml overnight bacterial culture using the commercial kit InstaGene Matrix (Bio-Rad, California, USA) according to the supplier’s instructions.

Chromosomal DNA for microarray hybridization experiments was extracted as previously described (Molenaar et al., 2005) and then fragmented by shearing as follows: about 50 ng of DNA was mixed with 900 µl of shearing buffer (TE, pH = 8.0 and glycerol 10%) and loaded in a nebulizer (Invitrogen, the Netherlands). Compressed air was applied for 2 min at 1.7 bar in order to obtain DNA fragments with an average size of approximately 1 kb. The DNA integrity and extraction efficiency were verified by electrophoresis in a 1% agarose gel in 1× TAE buffer, stained with ethidium bromide. DNA purity and concentration were evaluated using a Nanodrop ND-1000 Spectrophotometer (NanoDrop Technologies, Germany).

Primers, PCR conditions and sequencing

The primers used in this study (Table 5) were from MWG Biotech AG (Germany). The amplification reactions were carried out in a Thermocycler PE9600 (Applied Biosystems, California, USA).

Rep-PCR reactions were performed using the commercial kit Ready-To-Go RAPD Analysis Beads (Amersham Biosciences, UK). The amplification was carried out in 25 µl, adding the primer (GTG)5 (0.2 mM) to the mixture supplied with the kit. After an initial denaturation step of 5 min at 95°C, 32 cycles of 15 s at 95°C, 30 s at 50°C and 80 s at 72°C, and a final extension of 8 min at 72°C were performed. Amplification products were resolved by electrophoresis in 1.5% agarose gel in 1× TAE buffer, for 6.45 h at 100 V, at 4°C and visualized using ethidium bromide staining (5 pg ml⁻¹ final concentration). The fingerprints were analysed by the BioNumerics software package, version 4.50 (Applied Maths, Belgium).

Initially, to identify true L. plantarum strains, new isolates were characterized by fingerprinting based on amplification of repetitive bacterial DNA elements using the (GTG)5 primer (data not shown). The applicability of this method for rapid identification of lactobacilli was previously demonstrated (Gevers et al., 2001). Isolates tentatively identified as L. plantarum on the basis of (GTG)5-PCR profiles were further identified by 16S rRNA gene sequencing, allowing the identification of isolates that clustered to the group of lactobacilli that includes the species L. plantarum, L. pentosus and L. paraplantarum. The 16S rRNA genes of strains isolated in this study and characterized by different rep-PCR profiles were partially sequenced. A 16S rRNA gene portion was amplified using

Table 5. Sequences of the PCR primers used in this study.

| Primer name | Primer sequence (5′-3′) | Reference |
| --- | --- | --- |
| (GTG)5 | GTGGTGGTGGTGGTG | Gevers et al. (2001) |
| V1.1-F | GCGGCGTGCCTAATACATGC | Klijn et al. (1991) |
| V3.2-R | ATCTACGCATTTCACCGCTAC | Klijn et al. (1991) |
| pREV | TCGGGATTACCAAACATCAC | Torriani et al. (2001b) |
| paraF | GTCACAGGCATTACGAAAAC | Torriani et al. (2001b) |
| pentF | CAGTGGCGCGGTTGATATC | Torriani et al. (2001b) |
| planF | CCGTTTATGCGGAACACCTA | Torriani et al. (2001b) |

V1.1-F and V3.2-R primers as previously reported (Klijn et al., 1991) and sequenced (BaseClear, Leiden, the Netherlands). A multiplex-PCR assay based on the recA gene sequence was applied to identify L. plantarum strains in the collection (Torriani et al., 2001a).

Partial 16S rRNA gene sequences

The partial 16S rRNA gene sequences obtained during the identification of L. plantarum isolates were submitted to the GenBank database with accession numbers EF426247–EF426287.

Analysis of the gene content using DNA microarrays

Microarrays with 60-mer oligonucleotide probes were supplied by Agilent Technologies (Stanford, USA). Two (Platform GPL4318; http://www.ncbi.nlm.nih.gov/geo/) or eight (Platform GPL5874) identical arrays were present on each glass slide with 11-K features per array, designed on the putative ORFs annotated in the complete genome sequence of L. plantarum strain WCFS1, including plasmids (Kleerebezem et al., 2003). Usually three different probes per gene, evenly distributed over the gene sequence, were present on the array (in duplicate). The 25 ORFs not represented on the microarray were mainly prophage- and transposon-related, or too small. They were not printed on the microarray because no unique probe could be designed to target these genes.

For CGH microarray analysis, 42 representative strains were selected (Table 1; of which 24 strains are marked by a rectangle in Fig. S1, Supporting information), including 18 strains previously analysed using clone-based arrays (Molenaar et al., 2005).

The DNA (1 µg) isolated from the selected L. plantarum strains was sheared and labelled by incorporation of Cy3-dUTP or Cy5-dUTP (Amersham Biosciences, UK) using the commercial kit BioPrime DNA Labelling System (Invitrogen). The L. plantarum WCFS1 DNA applied as reference for each hybridization experiment was labelled with Cy3-dUTP dye, and the chromosomal DNA of the 41 other selected L. plantarum strains was labelled with Cy5-dUTP. Purification of labelled DNA was performed using CyScribe GFX column (Amersham Biosciences) according to the manufacturer’s instructions. Dye incorporation efficiency was measured using a Nanodrop ND-1000 Spectrophotometer.

Cohybridization of the two differently (Cy3 and Cy5) labelled DNA samples was carried out overnight at 60°C in 1× hybridization buffer following the manufacturer’s instructions, with rotation in a hybridization oven (Agilent Technologies). After hybridization, the slides were washed for 10 min at room temperature in wash solution 1 (6× SSC, 0.005% Triton X-102), and then transferred to wash solution 2 (0.1× SSC, 0.005% Triton X-102) for 5 min on ice. The slides were quickly dried using compressed nitrogen and stored in the dark.

The hybridization image was acquired using a ScanArray 4000 scanner (Perkin Elmer, USA), setting the resolution at 10 µm. The photo-multiplier value was adjusted in order to balance the signals obtained for both channels (Cy5 and Cy3 signal). Quantification of the signal for each spot was performed with ImaGene software, version 4.2 (BioDiscovery, Canada). Subsequently, the quantification data were imported into BASE (Saal et al., 2002). The control probes were removed and the background was subtracted from the signal for a preliminary data analysis. The differences in Cy3- and Cy5-dUTP hybridization efficiency were evaluated by Cy3 and Cy5 labelling of L. plantarum WCFS1 genomic DNA. In this case, signal normalization of the two channels resulted in the formation of a straight line (M-values around 0), confirming that the gene content is evaluated identically irrespective of the labelling dye (Figs S2 and S3A, Supporting information).

Statistical analysis

Hybridization data were normalized by local fitting of an M-A plot, implementing a LOWESS algorithm in the R-2.2.1 program (http://cran.r-project.org/bin/windows/base/) as previously described (Molenaar et al., 2005). The M-value was defined as log2(Ch1/Ch2) and the A-value was calculated with the following formula: [log2(Ch1) + log2(Ch2)]/2, considering the Cy3 signal (reference DNA of the WCFS1 strain) as channel 2 (Ch2) and the Cy5 signal (tested DNA) as channel 1 (Ch1). Presence of a probe target was scored if the M-value was above -1.5. Details of the calculation of presence or absence of genes are described in the Supporting information.
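
The M- and A-value definitions and the presence cut-off given above can be illustrated with a minimal Python sketch. The function names and the example intensities are hypothetical and were not part of the original analysis, which was carried out in R; the LOWESS normalization step is deliberately omitted here, so the sketch assumes already normalized channel intensities.

```python
import math

PRESENCE_CUTOFF = -1.5  # M-value threshold stated in the text


def m_value(ch1: float, ch2: float) -> float:
    """M = log2(Ch1/Ch2); Ch1 = Cy5 (tested strain), Ch2 = Cy3 (WCFS1 reference)."""
    return math.log2(ch1 / ch2)


def a_value(ch1: float, ch2: float) -> float:
    """A = [log2(Ch1) + log2(Ch2)] / 2, i.e. the mean log2 intensity of a probe."""
    return (math.log2(ch1) + math.log2(ch2)) / 2


def probe_target_present(ch1: float, ch2: float) -> bool:
    """Score a probe target as present when M is above the cut-off.

    Assumption: the intensities passed in are already background-corrected
    and normalized; no LOWESS fit is applied in this sketch.
    """
    return m_value(ch1, ch2) > PRESENCE_CUTOFF


# Hypothetical intensities for two probes (tested strain, reference).
equal_signal = (1024.0, 1024.0)   # same signal in both channels
weak_signal = (128.0, 1024.0)     # eightfold lower signal in the tested strain
```

With these illustrative values, the first probe has M = 0 and would be scored as present, whereas the second has M = -3 and would be scored as absent.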

Cluster analysis of all L. plantarum strains based on their phenotypic characterization was carried out with the BioNumerics program, using the product-moment coefficient and UPGMA (unweighted pair group method using arithmetic averages). For cluster analysis of CGH data, the M-values were converted to two discrete classes: 0 and 1, corresponding to absent and present. Hierarchical clustering of strains was performed using average linkage agglomeration and binary distance as a distance measure with the hclust function in R. The plasmid-specific probes were not included in the construction of the dendrograms, as these highly mobile and variable elements would perturb phylogenetic relations between strains.
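
As a rough illustration of the clustering of binarized CGH data, the following pure-Python sketch combines a binary distance (the proportion of genes present in only one of two strains among the genes present in at least one of them) with average linkage agglomeration. It is not the R code used in the study; the strain names, the presence/absence profiles and the function names are hypothetical, and the handling of two all-absent profiles (distance 0) is an assumption.

```python
from itertools import combinations


def binary_distance(x, y):
    """Binary distance between two 0/1 profiles.

    Proportion of positions in which only one profile is 1, among positions
    in which at least one is 1. Assumption: two all-zero profiles get 0.0.
    """
    either = sum(1 for a, b in zip(x, y) if a or b)
    differ = sum(1 for a, b in zip(x, y) if bool(a) != bool(b))
    return differ / either if either else 0.0


def average_linkage(profiles):
    """Agglomerate strains by average linkage (UPGMA).

    Returns the merge history as a list of (cluster_a, cluster_b, height).
    """
    clusters = {name: (name,) for name in profiles}  # label -> member strains
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            pairs = [(m, n) for m in clusters[a] for n in clusters[b]]
            # Average of all pairwise distances between the two clusters.
            d = sum(binary_distance(profiles[m], profiles[n]) for m, n in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters["(" + a + "," + b + ")"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
    return merges


# Hypothetical presence (1) / absence (0) calls for five genes in three strains.
profiles = {
    "strainA": [1, 1, 1, 0, 1],
    "strainB": [1, 1, 1, 0, 0],
    "strainC": [0, 1, 0, 1, 1],
}
merges = average_linkage(profiles)
```

In this toy example, strainA and strainB differ in one of the four genes present in at least one of them and are therefore merged first; strainC joins the resulting cluster at a larger height.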

Chromosome variability

Gene-pair variability was scored for all pairs of neighbouring genes by calculating the absence/presence pattern for the 42 tested strains. If a gene was found to be present while the neighbouring gene was absent, a value of 1 was added to the gene-pair variability count for the specific locus. Hence, gene-pair variability could reach a minimum of 0 (the two genes are present or absent together in all strains) and a maximum of 41 (the two genes were found differentially present in all strains). The average of this total number of gene-pair variability was used for establishing the gene-pair variability. To determine the region variability, the gene-pair variability was added for a span of 20 genes and this was shifted one gene downstream at a time. These data were plotted as statistical data on a genome wheel using the Microbial Genome Viewer (Kerkhoven et al., 2004).
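
The counting scheme described above can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the original implementation: the function names and the toy presence/absence matrix are hypothetical, and the sketch assumes that the sliding window sums a fixed number of consecutive gene-pair scores on a linear gene order (window edges and chromosome circularity are not handled).

```python
def gene_pair_variability(presence):
    """Count, for each neighbouring gene pair, the strains in which exactly
    one gene of the pair is present.

    presence[s][g] is 1 if gene g is present in strain s, else 0, with genes
    listed in chromosomal order.
    """
    n_genes = len(presence[0])
    return [
        sum(1 for strain in presence if strain[g] != strain[g + 1])
        for g in range(n_genes - 1)
    ]


def region_variability(pair_scores, span=20):
    """Sum gene-pair variability over a window of `span` consecutive scores,
    shifting the window one position downstream at a time.

    Assumption: the window is counted in gene-pair scores, not genes.
    """
    return [
        sum(pair_scores[i:i + span])
        for i in range(len(pair_scores) - span + 1)
    ]


# Hypothetical calls for four genes (columns) in three strains (rows).
presence = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
]
pair_scores = gene_pair_variability(presence)
window_sums = region_variability(pair_scores, span=2)
```

For the toy matrix, the three neighbouring pairs score 1, 1 and 2, and a window of two scores gives region sums of 2 and 3.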

Conserved and unique gene analysis

LaCOG predictions were obtained from the supplementary data of Makarova and colleagues (Makarova et al., 2006; Makarova and Koonin, 2007). All genes on the genome of L. plantarum WCFS1, as annotated by Kleerebezem and colleagues (2003), were checked for their presence in a LaCOG. Genes that were not classified into a LaCOG were subjected to further analysis. BLASTP analysis (Altschul et al., 1990) was performed to identify closest homologues in other bacteria; genes with homologues in other LAB [i.e. new genome sequences published since (Makarova et al., 2006; Makarova and Koonin, 2007)] were classified into new LaCOGs. ORFans (open-reading frames without matches in current databases) and genes putatively acquired by horizontal gene transfer were obtained from Cortez et al. (2009a,b).

Gene expression analysis

Gene expression data have been deposited in the GEO database under Accession Numbers GSE5882, GSE8348, GSE8672, GSE8743, GSE8744, GSE8956, GSE9392, GSE9961, GSE11383, GSE17634, GSE17847, GSE18339, GSE18340, GSE18354, GSE18432 and GSE18435.

Acknowledgements

We thank Joris Hafkamp for the construction of the genomic wheel, and Jeroen Hugenholtz and Erwin G. Zoetendal for useful discussions.

References

Ahrne, S., Nobaek, S., Jeppsson, B., Adlerberth, I., Wold, A.E., and Molin, G. (1998) The normal Lactobacillus flora of healthy human rectal and oral mucosa. J Appl Microbiol 85: 88–94.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.

Antara, N.S., Sujaya, I.N., Yokota, A., Asano, K., and Tomita, F. (2004) Effects of indigenous starter cultures on the microbial and physicochemical characteristics of Urutan, a Balinese fermented sausage. J Biosci Bioeng 98: 92–98.

Baleiras Couto, M.M., Eijsma, B., Hofstra, H., Huis in’t Veld, J.H., and van der Vossen, J.M. (1996) Evaluation of molecular typing techniques to assign genetic diversity among Saccharomyces cerevisiae strains. Appl Environ Microbiol 62: 41–46.

Boekhorst, J., Wels, M., Kleerebezem, M., and Siezen, R.J. (2006) The predicted secretome of Lactobacillus plantarum WCFS1 sheds light on interactions with its environment. Microbiology 152: 3175–3183.

Bolotin, A., Quinquis, B., Renault, P., Sorokin, A., Ehrlich, S.D., Kulakauskas, S., et al. (2004) Complete sequence and comparative genome analysis of the dairy bacterium Streptococcus thermophilus. Nat Biotechnol 22: 1554–1558.

Bourgoin, F., Pluvinet, A., Gintz, B., Decaris, B., and Guedon, G. (1999) Are horizontal transfers involved in the evolution of the Streptococcus thermophilus exopolysaccharide synthesis loci? Gene 233: 151–161.

Bringel, F., Curk, M.C., and Hubert, J.C. (1996) Characterization of lactobacilli by Southern-type hybridization with a Lactobacillus plantarum pyrDFE probe. Int J Syst Bacteriol 46: 588–594.

Bringel, F., Quenee, P., and Tailliez, P. (2001) Polyphasic investigation of the diversity within Lactobacillus plantarum related strains revealed two L. plantarum subgroups. Syst Appl Microbiol 24: 561–571.

Bringel, F., Castioni, A., Olukoya, D.K., Felis, G.E., Torriani, S., and Dellaglio, F. (2005) Lactobacillus plantarum subsp. argentoratensis subsp. nov., isolated from vegetable matrices. Int J Syst Evol Microbiol 55: 1629–1634.

Broadbent, J.R., McMahon, D.J., Welker, D.L., Oberg, C.J., and Moineau, S. (2003) Biochemistry, genetics, and applications of exopolysaccharide production in Streptococcus thermophilus: a review. J Dairy Sci 86: 407–423.

Cortez, D., Forterre, P., and Gribaldo, S. (2009a) A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes. Genome Biol 10: R65.

Cortez, D., Delaye, L., Lazcano, A., and Becerra, A. (2009b) Composition-based methods to identify horizontal gene transfer. Methods Mol Biol 532: 215–225.

Curk, M.C., Hubert, J.C., and Bringel, F. (1996) Lactobacillus paraplantarum sp. nov., a new species related to Lactobacillus plantarum. Int J Syst Bacteriol 46: 595–598.

Dellaglio, F., Bottazzi, V., and Vescovo, M. (1975) Deoxyribonucleic acid homology among Lactobacillus species of the subgenus Streptobacterium Orla-Jensen. Int J Syst Bacteriol 25: 160–172.

Ercolini, D., Hill, P.J., and Dodd, C.E.R. (2003) Bacterial community structure and location in Stilton cheese. Appl Environ Microbiol 69: 3540–3548.

Filya, I., Sucu, E., and Karabulut, A. (2004) The effect of Propionibacterium acidipropionici, with or without Lactobacillus plantarum, on the fermentation and aerobic stability of wheat, sorghum and maize silages. J Appl Microbiol 97: 818–826.

Fischbach, M., Ward, D., Young, S., Jaffe, D., Gnerre, S., Berlin, A., et al. (2008) Annotation of Streptomyces pristinaespiralis ATCC 25486. In NCBI Genbank.

Galdeano, C.M., and Perdigon, G. (2006) The probiotic bacterium Lactobacillus casei induces activation of the gut mucosal immune system through innate immunity. Clin Vaccine Immunol 13: 219–226.

Gardner, N.J., Savard, T., Obermeier, P., Caldwell, G., and Champagne, C.P. (2001) Selection and characterization of mixed starter cultures for lactic acid fermentation of carrot, cabbage, beet and onion vegetable mixtures. Int J Food Microbiol 64: 261–275.

Gevers, D., Huys, G., and Swings, J. (2001) Applicability of rep-PCR fingerprinting for differentiation of Lactobacillus species. FEMS Microbiol Lett 205: 31–36.

van de Guchte, M., Penaud, S., Grimaldi, C., Barbe, V., Bryson, K., Nicolas, P., et al. (2006) The complete genome sequence of Lactobacillus bulgaricus reveals extensive and ongoing reductive evolution. Proc Natl Acad Sci USA 103: 9274–9279.

Gueimonde, M., Sakata, S., Kalliomaki, M., Isolauri, E., Benno, Y., and Salminen, S. (2006) Effect of maternal consumption of Lactobacillus GG on transfer and establishment of fecal bifidobacterial microbiota in neonates. J Pediatr Gastroenterol Nutr 42: 166–170.

Hegazi, F.Z., and Abo-Elnaga, I.G. (1980) Degradation of organic acids by dairy lactic acid bacteria. Schriftenr Zentralbl Arbeitsmed Arbeitsschutz Prophyl 135: 212–222.

Honke, K., and Taniguchi, N. (2002) Sulfotransferases and sulfated oligosaccharides. Med Res Rev 22: 637–654.

Hooper, L.V., Manzella, S.M., and Baenziger, J.U. (1996) From legumes to leukocytes: biological roles for sulfated carbohydrates. FASEB J 10: 1137–1146.

Kalliomaki, M., Salminen, S., Arvilommi, H., Kero, P., Koskinen, P., and Isolauri, E. (2001) Probiotics in primary prevention of atopic disease: a randomised placebo-controlled trial. Lancet 357: 1076–1079.

Kerkhoven, R., van Enckevort, F.H., Boekhorst, J., Molenaar, D., and Siezen, R.J. (2004) Visualization for genomics: the Microbial Genome Viewer. Bioinformatics 20: 1812–1814.

Kleerebezem, M., Boekhorst, J., van Kranenburg, R., Molenaar, D., Kuipers, O.P., Leer, R., et al. (2003) Complete genome sequence of Lactobacillus plantarum WCFS1. Proc Natl Acad Sci USA 100: 1990–1995.

Klijn, N., Weerkamp, A.H., and de Vos, W.M. (1991) Identification of mesophilic lactic acid bacteria using PCR-amplified variable regions of 16S rRNA and specific probes. Appl Environ Microbiol 57: 3390–3393.

Kostinek, M., Specht, I., Edward, V.A., Schillinger, U., Hertel, C., Holzapfel, W.H., and Franz, C.M. (2005) Diversity and technological properties of predominant lactic acid bacteria from fermented cassava used for the preparation of Gari, a traditional African food. Syst Appl Microbiol 28: 527–540.

Makarova, K.S., and Koonin, E.V. (2007) Evolutionary genomics of lactic acid bacteria. J Bacteriol 189: 1199–1208.

Makarova, K., Slesarev, A., Wolf, Y., Sorokin, A., Mirkin, B., Koonin, E., et al. (2006) Comparative genomics of the lactic acid bacteria. Proc Natl Acad Sci USA 103: 15611–15616.

Molenaar, D., Bringel, F., Schuren, F.H., de Vos, W.M., Siezen, R.J., and Kleerebezem, M. (2005) Exploring Lactobacillus plantarum genome diversity by using microarrays. J Bacteriol 187: 6119–6127.

Mukherjee, A., Jackson, S.A., LeClerc, J.E., and Cebula, T.A. (2006) Exploring genotypic and phenotypic diversity of microbes using microarray approaches. Toxicol Mech Methods 16: 121–128.

Noonpakdee, W., Sitthimonchai, S., Panyim, S., and Lertsiri, S. (2004) Expression of the catalase gene catA in starter culture Lactobacillus plantarum TISTR850 tolerates oxidative stress and reduces lipid oxidation in fermented meat product. Int J Food Microbiol 95: 127–135.

Omelchenko, M.V., Makarova, K.S., Wolf, Y.I., Rogozin, I.B., and Koonin, E.V. (2003) Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ. Genome Biol 4: R55.

Rasmussen, T.B., Danielsen, M., Valina, O., Garrigues, C., Johansen, E., and Pedersen, M.B. (2008) Streptococcus thermophilus core genome: comparative genome hybridization study of 47 strains. Appl Environ Microbiol 74: 4703–4710.

Russell, W.M., and Klaenhammer, T.R. (2001) Identification and cloning of gusA, encoding a new β-glucuronidase from Lactobacillus gasseri ADH. Appl Environ Microbiol 67: 1253–1261.

Saal, L.H., Troein, C., Vallon-Christersson, J., Gruvberger, S., Borg, Å., and Peterson, C. (2002) BioArray software environment: a platform for comprehensive management and analysis of microarray data. Genome Biol 3: software0003.1–0003.6.

Sezonov, G., Blanc, V., Bamas-Jacques, N., Friedmann, A., Pernodet, J.L., and Guerineau, M. (1997) Complete conversion of antibiotic precursor to pristinamycin IIA by overexpression of Streptomyces pristinaespiralis biosynthetic genes. Nat Biotechnol 15: 349–353.

Siezen, R., Boekhorst, J., Muscariello, L., Molenaar, D., Renckens, B., and Kleerebezem, M. (2006) Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria. BMC Genomics 7: 126.

Stiles, M.E., and Holzapfel, W.H. (1997) Lactic acid bacteria of foods and their current taxonomy. Int J Food Microbiol 36: 1–29.

Tanganurat, W., Quinquis, B., Leelawatcharamas, V., and Bolotin, A. (2009) Genotypic and phenotypic characterization of Lactobacillus plantarum strains isolated from Thai fermented fruits and vegetables. J Basic Microbiol 49: 377–385.

Teusink, B., van Enckevort, F.H.J., Francke, C., Wiersma, A., Wegkamp, A., Smid, E.J., and Siezen, R.J. (2005) In silico reconstruction of the metabolic pathways of Lactobacillus plantarum: comparing predictions of nutrient requirements with those from growth experiments. Appl Environ Microbiol 71: 7253–7262.

Torriani, S., Felis, G.E., and Dellaglio, F. (2001a) Differentiation of Lactobacillus plantarum, L. pentosus, and L. paraplantarum by recA gene sequence analysis and multiplex PCR assay with recA gene-derived primers. Appl Environ Microbiol 67: 3450–3454.

Torriani, S., Clementi, F., Vancanneyt, M., Hoste, B., Dellaglio, F., and Kersters, K. (2001b) Differentiation of Lactobacillus plantarum, L. pentosus and L. paraplantarum species by RAPD-PCR and AFLP. Syst Appl Microbiol 24: 554–560.

Zhang, Z.Y., Liu, C., Zhu, Y.Z., Zhong, Y., Zhu, Y.Q., Zheng, H.J., et al. (2009) Complete genome sequence of Lactobacillus plantarum JDM1. J Bacteriol 191: 5020–5021.

Zhou, M., Boekhorst, J., Francke, C., and Siezen, R.J. (2008) LocateP: genome-scale subcellular-location predictor for bacterial proteins. BMC Bioinformatics 9: 173.

Supporting information

Additional Supporting Information may be found in the online version of this article:

Supplemental text: Statistical analysis of presence/absence of genes.

Fig. S1. Neighbor-joining cluster analysis based on phenotypic properties of the 185 L. plantarum strains using the product-moment coefficient and UPGMA (unweighted pair group method using arithmetic averages). Strains selected for gene content evaluation are highlighted with a blue square.

Fig. S2. Self-hybridization of WCFS1.

Fig. S3. A. M-A plot for WCFS1-WCFS1 (self) hybridization. B. M-A plot for WCFS1-ATCC14917 hybridization. C. M-A plot for WCFS1-DKO22 hybridization.

Fig. S4. Density distribution of probes on all microarrays. M-values of -1 and -2 are indicated with dashed lines.

Table S1. Genes present in all 42 L. plantarum strains, but not found in other lactic acid bacteria.

Table S2. Genes found only in the reference strain L. plantarum WCFS1, and not in the other 41 strains analysed.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
