Structural Properties and Evolutionary Relationships of PspA

Vol. 174, No. 2

Structural Properties and Evolutionary Relationships of PspA, a Surface Protein of Streptococcus pneumoniae, as Revealed by Sequence Analysis

JANET YOTHER1* AND DAVID E. BRILES1,2,3

Departments of Microbiology,1 Pediatrics,2 and Comparative Medicine,3 University of Alabama at Birmingham, Birmingham, Alabama 35294

Received 16 July 1991/Accepted 6 November 1991

Analysis of the sequence for the gene encoding PspA (pneumococcal surface protein A) of Streptococcus pneumoniae revealed the presence of four distinct domains in the mature protein. The structure of the N-terminal half of PspA was highly consistent with that of an α-helical coiled-coil protein. The α-helical domain was followed by a proline-rich domain (with two regions in which 18 of 43 and 5 of 11 of the residues are prolines) and a repeat domain consisting of 10 highly conserved 20-amino-acid repeats. A fourth domain consisting of a hydrophobic region too short to serve as a membrane anchor and a poorly charged region followed the repeats and preceded the translation stop codon. The C-terminal region of PspA did not possess features conserved among numerous other surface proteins, suggesting that PspA is attached to the cell by a mechanism unique among known surface proteins of gram-positive bacteria. The repeat domain of PspA was found to have significant homology with C-terminal repeat regions of proteins from Streptococcus mutans, Streptococcus downei, Clostridium difficile, and S. pneumoniae. Comparisons of these regions with respect to functions and homologies suggested that, through evolution, the repeat regions may have lost or gained a mechanism for attachment to the bacterial cell.

Among gram-positive bacteria, a number of surface proteins have been identified and characterized with respect to their structural, virulence, or immunogenic features (19, 22, 25, 31, 62). The amino-terminal regions of these molecules often share the properties of environmental exposure and variability, although their structural similarities may be limited. In contrast, the carboxy termini of many of these proteins show extensive similarities in their cell-associated regions, suggesting that a common mechanism for attaching surface proteins to cells may exist (19, 21, 23, 25, 37, 62).

Pneumococcal surface protein A (PspA) is the only surface protein of Streptococcus pneumoniae known to exhibit significant immunogenic and virulence properties. Furthermore, as shown in these studies, it appears to have unique structural properties. PspA has been identified on all pneumococci examined (13). It is variable both serologically and with respect to molecular weight (13, 43, 65). However, conserved epitopes are present on PspA molecules (13), and antibodies to PspA can protect mice against lethal challenge even if the immunizing PspA is of a different serotype than the PspA of the challenge pneumococci (8, 42, 44, 45). PspA is required for full virulence in mice (45), and antibody to PspA in human sera has been detected (12).

The observations regarding the serologic, protective, and virulence properties of PspA indicate its importance for S. pneumoniae and suggest a role for antibody to PspA in immunity to pneumococcal infections. To further characterize PspA with respect to structure and, ultimately, function, we have used pspA insertion mutants as a means to clone and express pspA in Escherichia coli. From these clones we have determined the complete nucleotide sequence of pspA. In the accompanying paper, we describe the cloning and mutant analysis and correlate those findings with the sequence analysis (69). Here, we present the pspA sequence and describe its implications regarding the structural properties and evolutionary relationships of PspA.

* Corresponding author.

MATERIALS AND METHODS

Bacterial strain and plasmids. The pspA gene was isolated from S. pneumoniae Rx1 as described in the accompanying article (69). The sequence was obtained from overlapping subclones of plasmids pJY4173 and pJY4244. These plasmids contain the PspA C- and N-terminal-encoding regions, respectively. The region sequenced and shown in Fig. 1 and 2 is the HindIII-KpnI fragment that contains the complete pspA gene (69).

DNA sequencing and analysis. Sequencing was performed with the Sequenase kit (U.S. Biochemical, Cleveland, Ohio) by using protocols recommended by the manufacturer. Plasmid DNA (5) for sequencing was purified either by CsCl centrifugation (52) or by GeneClean purification (Bio 101, La Jolla, Calif.) of mini-prep DNA. Plasmids were denatured with NaOH as described in the Sequenase protocols. Both DNA strands were sequenced by subcloning them into pUC18 and pUC19 (67) to obtain opposite orientations for use with the universal or -40 primer or by sequencing a given clone by using both the forward and reverse primers (U.S. Biochemical). An additional primer for sequencing from within pspA (used with pJY4244 and pJY4284 [69]) was obtained from Oligos Etc. Inc. (Guilford, Conn.). Approximately 97% of the sequence was obtained for each strand. Sequence analysis was performed with DNA Strider (41) and University of Wisconsin Genetics Computer Group Sequence Analysis Package Version 6.2 (16).

Nucleotide sequence accession number. The pspA sequence reported here has been assigned GenBank accession number M74122.


JOURNAL OF BACTERIOLOGY, Jan. 1992, p. 601-609
0021-9193/92/020601-09$02.00/0
Copyright © 1992, American Society for Microbiology


[Figure 1: restriction map of the HindIII-KpnI fragment, drawn on a 0.5 to 2.0 kb scale, with arrows for the individual sequencing reactions.]

FIG. 1. Restriction map of pspA and sequencing strategy. The HindIII-KpnI fragment (2,086 bp) contains the complete pspA gene. Fragments were sequenced from overlapping subclones. Arrows indicate the direction and length obtained for each sequencing reaction. Boxes indicate the regions for which sequence was not obtained for one of the strands. The leftmost box represents 60 unsequenced nucleotides (2.9% of strand), and the rightmost box represents 73 unsequenced nucleotides (3.5% of strand). Restriction sites are BstNI (BN), BstUI (BU), BstXI (BX), DraI (D), HincII (Hc), HindIII (H), KpnI (K), PvuII (P), SacI (S), Sau3AI (S3), and TaqI (T). B31 is a site created by Bal 31 digestion (69).
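The per-strand coverage figures quoted in the Fig. 1 legend follow directly from the fragment length and the sizes of the two unsequenced boxes. The short Python sketch below repeats that arithmetic; the constant and function names are illustrative choices, not anything defined in the paper.

```python
# Figures taken from the Fig. 1 legend; the names are illustrative only.
FRAGMENT_BP = 2086
GAPS_BP = {"leftmost box": 60, "rightmost box": 73}


def percent_of_strand(gap_bp, total_bp=FRAGMENT_BP):
    """Return gap_bp expressed as a percentage of the fragment length."""
    return 100.0 * gap_bp / total_bp


for name, gap in GAPS_BP.items():
    missing = percent_of_strand(gap)
    print(f"{name}: {missing:.1f}% unsequenced, {100.0 - missing:.1f}% covered")
```

Both strands come out at roughly 97% coverage, in line with the figure given in Materials and Methods.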

RESULTS AND DISCUSSION

The nucleotide sequence of pspA was obtained from overlapping subclones spanning a HindIII-KpnI fragment that contains the complete pspA gene (Fig. 1). The cloning of this fragment and the resulting production of PspA in E. coli are described in the accompanying paper (69). The sequence of the HindIII-KpnI fragment, shown in Fig. 2, revealed an open reading frame of 1,857 nucleotides that, as shown below and in the accompanying paper, is pspA. The pspA sequence has a GC content of 41.7%, in contrast to the ~38.5% GC content of the S. pneumoniae genome (15). The higher number is accounted for by three regions of the molecule (described in detail below) which have unusually high GC contents. Two proline-alanine-rich regions have GC contents of 55.8 and 52.9%, respectively. A repeat region, consisting of 10 nearly identical 20-amino-acid repeats, has a GC content of 46.9%. Combined, these regions represent ~39% of the molecule and have a GC content of 48.9%. As has been observed with other genes from gram-positive organisms of low GC content, codon usage is biased towards codons with adenines and thymines in the third nucleotide position (37), with such codons representing 75% of pspA. In addition to the PspA open reading frame, two other small open reading frames were found. One, of 447 nucleotides, occurs from nucleotide positions 1376 to 1822. The second, of 341 nucleotides, reads in the opposite direction, from positions 1417 to 1076. It is not known whether either encodes a product.
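The claim that three GC-rich regions account for the elevated GC content of pspA can be checked as a simple mixture calculation. The sketch below is a minimal illustration, assuming the rounded figures quoted above (41.7% overall, 48.9% for regions covering about 39% of the gene); the function names are hypothetical, and `gc_percent` is included only to show how such a value would be computed from a sequence string.

```python
def gc_percent(seq):
    """Percentage of G and C in a nucleotide string (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def remainder_gc(total_gc, region_gc, region_fraction):
    """GC% of the rest of a gene, given the overall GC% and the GC%
    and length fraction of a subregion (a weighted-average rearrangement)."""
    return (total_gc - region_fraction * region_gc) / (1.0 - region_fraction)


# Values quoted in the text; 0.39 is the rounded "~39%" figure.
rest = remainder_gc(total_gc=41.7, region_gc=48.9, region_fraction=0.39)
print(f"GC content outside the three GC-rich regions: {rest:.1f}%")
```

With these inputs the remainder of the gene comes out near 37%, which is close to the ~38.5% quoted for the S. pneumoniae genome and consistent with the explanation given in the text.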

Transcription and translation signals. Putative transcription and translation signals are indicated in Fig. 2. The promoter region contains a consensus E. coli -35 sequence and a nearly consensus -10 sequence (C replaces A in position 5). The two regions are separated by 19 bp, compared with the optimum of 17 bp (34). A potential ribosome-binding site begins at nucleotide 113 with a Shine-Dalgarno sequence (56) that is separated by 8 bp from the putative ATG translation start codon at nucleotide 127. A second ATG is located at nucleotide 142, but, on the basis of the signal sequence characteristics (see below), it is less likely to serve as the start. An open reading frame of 1,857 nucleotides follows the initiator ATG. A translation stop codon occurs at nucleotide 1984. Two possible transcription termination sites occur. The first, from nucleotides 1994 to 2031, has a potentially long stem structure but is not followed by a run of T's. A similar lack of T's was noted in the sequence of the M protein of the group A streptococcus (37). A second site for a potential stem-loop structure occurs within the first site, from nucleotides 2004 to 2023. Although this sequence would have a shorter stem structure, it is followed by several T's and may thus function as an efficient transcription terminator (6).

Leader peptide. The PspA coding region most likely begins with the Met designated -31 in Fig. 2. The sequence following this codon has the characteristics required of a bacterial signal peptide (50, 63). Like most leader sequences of proteins from gram-positive bacteria, the length is greater than that observed with signal sequences of proteins from gram-negative bacteria (19, 37, 57). It contains 31 amino acids, with 3 of the first 4 being charged and the remainder being predominantly hydrophobic. A signal peptidase cleavage site occurs at the Ala designated -1 in Fig. 2.

Mature PspA. The first 45 amino acids of the deduced sequence beyond the Ala-Glu cleavage site agree completely with amino-terminal amino acid sequencing of isolated PspA (59). The predicted size of the mature PspA is 65,380 Da, compared with the estimation of 84,000 Da based on migration in sodium dodecyl sulfate-polyacrylamide gels (43). Analysis of the deduced amino acid sequence revealed four distinct domains in mature PspA (Fig. 2). Each of the domains was predicted to have a high degree of potential antigenicity on the basis of Jameson and Wolf analysis (38).
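The coordinates given above are mutually consistent, and the bookkeeping is easy to reproduce. The following Python sketch uses only numbers quoted in the text; the variable names are illustrative, and the mean residue mass is included purely as a plausibility check on the predicted size.

```python
# Values quoted in the text; variable names are illustrative only.
ATG_START = 127          # first nucleotide of the putative start codon
ORF_NT = 1857            # length of the open reading frame
STOP_CODON_NT = 1984     # reported position of the translation stop codon
LEADER_AA = 31           # length of the signal peptide
MATURE_MASS_DA = 65380   # predicted size of mature PspA

codons, remainder = divmod(ORF_NT, 3)
first_nt_after_orf = ATG_START + ORF_NT      # where the stop codon should begin
mature_residues = codons - LEADER_AA
mean_residue_mass = MATURE_MASS_DA / mature_residues

print(f"codons in ORF: {codons} (remainder {remainder})")
print(f"stop codon expected at nucleotide {first_nt_after_orf}")
print(f"mature protein: {mature_residues} residues, "
      f"{mean_residue_mass:.1f} Da per residue on average")
```

The 619 codons split into a 31-residue leader and a 588-residue mature protein, which matches the residue numbering used for the C terminus later in the paper. The average of about 111 Da per residue is close to the roughly 110 Da often used as a rule of thumb for proteins.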

(i) Charged-helical domain. The N-terminal half of PspA (residues 1 through 288) is highly charged, and both the Garnier-Osguthorpe-Robson (29) and Chou-Fasman (10) analyses indicated that the secondary structure is predominantly α-helical. The only nonhelical structure, which extends from positions 115 to 120, is predicted by Garnier et al. to be a β-sheet. Chou-Fasman analysis provided the same prediction for the region and further predicted only four other nonhelical regions, each consisting of one to three amino acids.

The amino acid sequence of the α-helical N-terminal portion of PspA is strongly predictive of a coiled-coil structure. Coiled-coil α-helical proteins exhibit a seven-amino-acid motif, with predominantly hydrophobic residues at positions a and d and primarily hydrophilic helix-promoting residues at positions b, c, e, f, and g (14, 46, 58). The nonpolar amino acids at positions a and d allow noncovalent attachment of individual α-helical chains to each other. The hydrophilic amino acids at positions b, c, e, f, and g ensure that the coiled-coil structure has few other hydrophobic interactions and that it exists as a fibrous rather than as a globular protein (11, 14, 46, 58).

As shown in Fig. 3, eight distinct regions of PspA are consistent with a coiled-coil structure. These regions represent 70 α-helical turns. At only 2 of these 70 turns are highly hydrophilic amino acids found at position a or d (residues 104 and 121). The general features of the PspA heptad repeats are very similar to those of myosin, tropomyosin, and other fibrous eukaryotic proteins. As shown in Fig. 4, the amino acids most commonly found at positions a through g of the N-terminal 288 residues of PspA are generally the most common residues observed at the same positions for known coiled-coil regions from fibrous proteins. Similar comparisons have been used by others in the prediction of coiled-coil regions (40). The data from Fig. 4 suggest that PspA may exhibit a coiled-coil structure. In the α-helical portion of PspA, only 13% of the amino acids in the a and d positions are hydrophilic. If the calculation is restricted to regions of predicted coiled-coil structure, this number drops to only 3%. These results compare very favorably with those of eukaryotic coiled-coil proteins such as tropomyosin and myosin in which, on average, 17% of the a and d residues are hydrophilic (11). By this measure, PspA resembles the typical coiled-coil sequence of known fibrous proteins significantly more than do the sequences of the streptococcal M proteins M5 (47), M6 (24), M12 (53), M24 (48), and M49 (33). In the case of M6, whose sequence has a strong heptad repeat that has been shown to be consistent with a predicted coiled-coil structure (24, 47), the amino acids in 30 to 40% of the a and d positions are hydrophilic rather than hydrophobic. Even so, the structure of the M6 molecule has been shown in X-ray diffraction studies to be an α-helical coiled coil (51).

[FIG. 2. Nucleotide sequence of the HindIII-KpnI fragment containing pspA, with the deduced amino acid sequence; figure not reproduced.]

There are seven breaks in the coiled-coil motif of PspA. Five of these are frameshifts of one or two amino acids. The others are 8- and 23-amino-acid sequences (indicated by italics in Fig. 3) that by themselves would not be highly predictive of coiled-coil structure. These interruptions in the ideal coiled-coil sequence, which are similar to breaks observed with other coiled-coil proteins, may provide regions of flexibility in what would otherwise be a rather rigid structure (11). As with M proteins, the coiled-coil structure does not extend to the N terminus of the protein, thus providing a short (in the case of PspA, 9-amino-acid) flexible N terminus.

A stable coiled-coil structure is thought to require 10 contiguous turns with an α-helical coiled-coil motif (35). Three coiled-coil regions of PspA may meet this requirement. One, from residues 93 to 168, contains 22 turns with only two hydrophilic amino acids at the 22 a and d positions. Another, from residues 10 to 92, includes the first and second coiled-coil motifs with nine helical turns each. These motifs are separated by six helical turns with a weaker coiled-coil motif that is in phase with the other two. Altogether, this region represents 24 helical turns with nonhydrophilic residues at all but 3 of the 24 a and d residues. Another region of likely coiled-coil structure consists of the last four coiled-coil motifs, a total of 24 helical turns interrupted by three single-residue frameshifts.
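The measure used above, the fraction of hydrophilic residues at heptad positions a and d, is straightforward to compute once a register has been assigned. The Python sketch below is a minimal illustration under two stated assumptions: the hydrophilic set is taken to be the residues listed in uppercase in Fig. 4 (E, K, D, R, H, N, Q), and the register is fixed by hand rather than detected. The example sequence is a hypothetical toy heptad pattern, not PspA.

```python
HEPTAD = "abcdefg"
# Assumption: residues treated as hydrophilic are those shown in uppercase in Fig. 4.
HYDROPHILIC = frozenset("EKDRHNQ")


def assign_heptad(seq, first_position="a"):
    """Pair each residue with its heptad position, starting at first_position."""
    offset = HEPTAD.index(first_position)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(seq.upper())]


def core_hydrophilic_fraction(seq, first_position="a"):
    """Fraction of residues at positions a and d that are hydrophilic."""
    core = [res for res, pos in assign_heptad(seq, first_position) if pos in "ad"]
    if not core:
        raise ValueError("sequence has no a or d positions")
    return sum(res in HYDROPHILIC for res in core) / len(core)


# Hypothetical toy sequence: three ideal heptads plus one with Glu at position a.
toy = "LKELEDK" * 3 + "EKALEDK"
print(f"toy sequence: {100 * core_hydrophilic_fraction(toy):.1f}% hydrophilic at a/d")

# The text's own figure for the predicted coiled-coil regions: 2 of 70 turns.
print(f"PspA coiled-coil regions: {100 * 2 / 70:.1f}% hydrophilic at a/d")
```

Two hydrophilic residues in 70 a and d positions is about 3%, the figure quoted for the predicted coiled-coil regions. A real analysis would also have to handle the frameshifts described above, which this fixed-register sketch does not attempt.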

[Figure 3: heptad alignment of PspA residues 1 to 288 in columns a through g, showing the eight coiled-coil regions; the alignment itself is not reproduced.]

FIG. 3. Coiled-coil potential of PspA N-terminal region. Residues 1 to 288 are aligned to show the heptad repeats of hydrophobic and hydrophilic amino acids that correspond to α-helical coiled-coil structure. a to g are positions within the heptad repeat. The eight regions exhibiting coiled-coil structure are separated from each other and from non-coiled-coil regions by spacing. Regions in italics exhibit a lack of fit with coiled-coil structure. Numbers to the left are residue numbers. To the right are the number of α-helical turns within a given coiled-coil region (3.6 residues per turn). Amino acids are indicated as follows: boldface uppercase, hydrophilic; lowercase, nonhydrophilic, internal or external; lowercase underline, hydrophobic, internal.

Amino acid   a          b          c          d          e          f          g
E (-)        9 (1)      23 (21)    38 (19)    5 (6)      27 (32)    15 (15)    7 (20)
K (+)        2 (8)      23 (15)    20 (12)    0 (1)      32 (9)     29 (11)    26 (15)
D (-)        0 (0)      15 (13)    5 (13)     0 (1)      12 (4)     10 (10)    14 (8)
R (+)        0 (6)      0 (6)      3 (8)      0 (1)      2 (6)      2 (13)     2 (10)
H (+)        0 (1)      3 (3)      0 (2)      0 (1)      0 (1)      0 (3)      0 (1)
N            0 (4)      5 (4)      5 (5)      0 (1)      0 (6)      5 (5)      2 (3)
Q            7 (1)      0 (9)      3 (8)      0 (4)      7 (14)     12 (6)     7 (13)

a            16 (10)    13 (12)    18 (8)     41 (22)    2 (4)      17 (11)    24 (9)
t            5 (1)      3 (4)      3 (4)      7 (2)      2 (5)      2 (4)      0 (4)
s            5 (2)      0 (4)      3 (8)      5 (2)      2 (5)      0 (8)      7 (5)
y            5 (4)      0 (1)      0 (0)      5 (6)      0 (0)      0 (0)      5 (0)
p            5 (0)      3 (0)      3 (0)      2 (0)      2 (0)      0 (0)      0 (0)
c            0 (1)      0 (0)      0 (0)      0 (0)      0 (0)      0 (1)      0 (0)
g            0 (1)      0 (2)      0 (4)      5 (1)      0 (1)      0 (4)      0 (1)
w            0 (0)      0 (0)      0 (0)      0 (1)      0 (0)      0 (0)      0 (0)

l            21 (32)    5 (2)      3 (4)      15 (35)    2 (6)      5 (4)      0 (6)
i            9 (13)     3 (1)      0 (2)      2 (6)      0 (2)      0 (2)      0 (2)
v            12 (9)     3 (2)      0 (2)      7 (6)      5 (2)      0 (3)      0 (3)
f            5 (2)      0 (0)      0 (1)      0 (2)      2 (0)      0 (0)      0 (0)
m            0 (5)      0 (1)      0 (1)      2 (2)      0 (1)      0 (1)      5 (0)

FIG. 4. Comparison of PspA coiled-coil structure with coiled-coil structures of known fibrous proteins. Numbers represent the frequency (percentage) of each amino acid at the given position, a through g, in the heptad repeat. The number in the left of each column is the frequency for the α-helical N-terminal region of PspA (residues 1 to 288); a number in parentheses is an average frequency calculated from coiled-coil regions of myosin, tropomyosin, paramyosin, and the intermediate filaments. The numbers in parentheses are from reference 11. Amino acids are indicated as in Fig. 3.

In addition to the expected pattern of hydrophobic and hydrophilic residues, the third coiled-coil region of PspA (from residues 93 to 168) exhibits a very regular charge motif. Of the 21 b and c residues, 14 have net negative charges and none have net positive charges. Twelve of these residues are glutamic acids. Of the 33 e, f, and g residues, 20 have net positive charges and only three have net negative charges. Eighteen of these 33 residues are lysines. Thus, in these 22 turns of the helix, the hydrophilic residues of consecutive turns alternate from positive to negative charge. This is a feature not commonly observed with coiled-coil structures. The alternating charge may tend to stabilize the individual helices and could have some as yet unknown biologic function.

(ii) Proline-rich domain. Two distinct proline regions are evident between residues 289 and 370 (Fig. 2). In the first, 18 of 43 residues are prolines, and in the second, 5 of 11 are prolines. The two regions are separated by a hydrophilic stretch of 27 amino acids. The pentamer Pro-Ala-Pro-Ala-Pro appears four times in these two regions.

(iii) Repeat domain. The proline-rich domain is followed immediately by the first of 10 tandem, direct repeats (residues 371 through 571). Each repeat consists of 20 amino acids, and, as can be seen in Fig. 2, is highly conserved at both the amino acid and nucleotide levels. The middle repeats (3 through 8) appear to be alternating; i.e., repeats 3, 5, and 7 are identical except for four silent changes in the nucleotide sequence and repeats 4, 6, and 8 are identical except for one silent change. Divergence is observed mainly at the outside repeats. Repeats 1 and 2 most closely resemble repeats 3, 5, and 7, whereas repeats 9 and 10 are more similar to repeats 4, 6, and 8. The only exception to the periodicity of 20 amino acids occurs in repeat 9, where a lysine has been inserted at residue 550. A Kyte-Doolittle (39) hydrophobicity plot of the repeat region revealed a recurring peak of weak hydrophobicity extending from about the last four amino acids of each repeat through the first four amino acids of the next repeat (data not shown).

(iv) C terminus. Within the last repeat begins the only large hydrophobic region of the mature molecule (residues 564 through 576). The Kyte-Doolittle (39) hydrophobicity value for this region is 1.1. Twelve amino acids (residues 577 through 588), three of which are charged, follow this region and precede the stop codon.

Model for the attachment of PspA to S. pneumoniae. Examinations of the sequences of numerous surface proteins from gram-positive bacteria have revealed several conserved features, suggesting the existence of a common mechanism for the attachment of surface proteins to the bacterial cell (19, 21, 23, 25, 32, 37, 62, 64). In the C-terminal regions of these proteins, the translation stop codon is preceded by four to seven charged amino acids (the charged stop-transfer tail), a highly hydrophobic (approximately 16 of 20 amino acids) membrane-spanning anchor region, the hexamer LPXTGE, and a proline-rich region. The proline regions of the staphylococcal protein A and the group A streptococcal M6 protein have been localized within the cell wall and have been implicated as potentially important regions in attachment of the respective molecules to the bacterial cell (32, 49).

The C-terminal region of PspA is clearly different from that just described, in that (i) the region immediately preceding the stop codon does not contain a highly charged tail; (ii) the hydrophobic region is not as hydrophobic or as long as expected for a membrane-spanning region and may not be sufficiently long to traverse the membrane (Kyte-Doolittle value of 1.1 versus minimum expected value of 1.6, 13 amino acids versus an expected 20 amino acids [18, 39]); (iii) the hydrophobic region is preceded not by prolines but rather by an uncharged 20-amino-acid sequence repeated 10 times; (iv) the proline region that is present in PspA is more proline rich and is significantly farther from the stop codon than the proline regions of these other proteins; and (v) the LPXTGE sequence does not appear. Despite these discrepancies, experiments described in the accompanying paper (69) show that the C-terminal region of PspA is essential for proper attachment of PspA to the cell. Thus, we anticipate that PspA attaches by a mechanism unlike that of these other surface proteins. Given the present data, there are several possible mechanisms of attachment. First, the hydrophobic region, possibly together with the poorly charged region, may suffice to anchor the protein in the membrane. Alternatively, anchoring may be aided by interaction of the repeat domain with the membrane via the nine weakly hydrophobic regions located within the repeats. Such multiple weak interactions may serve to stabilize the anchoring, obviating the need for a single highly hydrophobic region. A third possibility is that PspA differs from the other surface proteins in that it is not a membrane-anchored protein. Thus, attachment may occur via interactions with cell wall- or cell membrane-associated components. The repeats of the pneumococcal lysins, which are similar to those of PspA (see below), appear to bind to membrane-attached lipoteichoic acid via interaction with choline residues rather than interacting directly with the membrane (7).
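The hydropathy comparison in point (ii) above (a Kyte-Doolittle value of 1.1 over 13 residues, against an expected minimum of 1.6 over about 20) can be sketched as follows. This is a minimal illustration, assuming the quoted value is a mean hydropathy over the segment; the scale values are those of the published Kyte-Doolittle index, while the function names and the example segment are hypothetical and the segment is not taken from PspA.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle value over a peptide segment."""
    segment = segment.upper()
    if not segment:
        raise ValueError("empty segment")
    return sum(KD[res] for res in segment) / len(segment)


def meets_anchor_criteria(mean_value, length, min_mean=1.6, min_length=20):
    """Apply the two thresholds quoted in the text (mean >= 1.6, length >= 20)."""
    return mean_value >= min_mean and length >= min_length


# Values quoted for the PspA hydrophobic region (residues 564 to 576).
print("PspA region passes:", meets_anchor_criteria(1.1, 13))

# Hypothetical 20-residue hydrophobic segment, for contrast only.
toy = "LLIVALLAGIVLLVAALLIV"
toy_mean = mean_hydropathy(toy)
print(f"toy segment mean {toy_mean:.2f}, passes:",
      meets_anchor_criteria(toy_mean, len(toy)))
```

On these criteria the PspA region fails on both hydropathy and length, which is the basis for the authors' argument that it may not be a conventional membrane anchor.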

Interaction of the repeat domain of PspA with the mem-brane or membrane-associated components could place theproline domain within the cell wall where, as has beenpostulated for other surface proteins of gram-positive bacte-ria, it too could be involved in attachment. The generallyhydrophilic nature of the proline domain makes it unlikelythat this portion of the protein is globular. Thus, the 81amino acids in this region could easily span the approxi-mately 16-nm width of the relatively thin pneumococcal cellwall (9, 60). The 288 residues of the N-terminal a-helix, at0.15 nm per residue (36), would be expected to be almost 43nm long and could extend at least that far from the pneumo-coccal surface. Genetic evidence for such a model, includinga surface-exposed location for the oa-helical domain and anonexposed location for the proline and other C-terminaldomains, is presented in the accompanying paper (69).Homology of PspA with other proteins and evolutionary

implications. Searches for sequences homologous to that ofPspA showed that the oa-helical domain has homology withseveral proteins whose sequences are consistent with at-he-lical coiled-coil structures, including myosin, tropomyosin,and the M proteins of group A streptococci (approximately22% identity plus 50% similarity). The most, and only other,significant homology occurred in the repeat domain, which

FIG. 5. Comparison of the PspA repeat domain with similarregions in other proteins. Amino acids were aligned with the PspArepeat on the basis of visual inspection. Gaps were introduced intothe PspA sequence in order to maximize the fit with Cpl and LytA;thus, the position numbers here differ from those in Fig. 2. Upper-case letters indicate amino acids present in PspA. Lowercase lettersrepresent amino acids present in the lysins but not in PspA. Dashesindicate amino acids not found in PspA or the lysins. Percentages ofsimilarity were calculated by dividing the number of PspA or lysinresidues present in the comparison sequence by the total number ofresidues in the comparison sequence. Proteins and regions shown

are as follows: PspA, residues 371 to 571; Cpl, the lysin of S.pneumoniae phage Cp-1, residues 200 to 323 (26); LytA, theautolysin of S. pneumoniae, residues 176 to 302 (27); ToxA, toxin Aof C. difficile, residues 2077 to 2271 (17); Gbp, glucan-bindingprotein of S. mutans, residues 160 to 559 (2); and GtfC, glucosyl-transferase C of S. mutans, residues 1094 to 1330 (61). The repeatsfrom ToxA represent approximately one-fourth of the repeat region.They are contiguous and include both class I and class II repeats(17). All of the other repeat regions are complete.

J. BACTERIOL.

PspA(aa 371-57)

Cpl(m 200-360% PspA

LytA(aa 176-302)60% PspA

n

ToxA(ma 2077-2271)44% PspA63% lysins

_IVl-GKp(aa 160-5539% Psp52% lysi

S. PNEUMONIAE pspA SEQUENCE 607

was found to be highly homologous to repeat regions found in the C termini of the S. pneumoniae autolysin (lytA [27]); the lysins of pneumococcal phages Cp-1 (cpl-1 [26]), Cp-9 (cpl-9 [28]), and HB3 (hblA [54]); glucosyltransferases from Streptococcus mutans (gtfB [57] and gtfC [61]) and Streptococcus downei (gtfI [20] and gtfS [30]); a glucan-binding protein from S. mutans (gbp [2]); and toxins A (17) and B (3) from Clostridium difficile. Comparisons of the PspA repeats with those from several of these other proteins are shown in Fig. 5. Outside of the repeat regions, there was no homology between PspA and these other proteins.

The C-terminal repeats of the S. mutans, S. downei, C. difficile, and other S. pneumoniae proteins range from 17 to 65 amino acids in length and occur up to 38 times within the respective proteins. In our comparison of each of these repeats with those of PspA, we used the repeats of PspA and the residues most conserved among all the proteins, i.e., positions 1, 2, 9, 12, 13, and 19 in Fig. 5, to determine our alignment. Thus, all of the repeats were reduced to units of approximately 20 amino acids each, i.e., to one PspA repeat unit. Repeat units of more than 20 amino acids in several of the other proteins appear to be the result of multiple duplications, with intervening mutation events, of a single PspA-like repeat unit. Even within PspA, alternating repeat units were observed, suggesting that the original repeat unit may have undergone a duplication event, followed by mutation and then subsequent duplications of the two-unit repeat.

The extreme sequence conservation of the PspA repeats may indicate that they were more recently duplicated than the repeats of the other proteins. Alternatively, the conservation of the same sequence in the different PspA repeats may indicate that the function of these repeats requires a sequence that exhibits little variation from the observed sequence. Examination of the DNA sequence in the PspA repeat region provides some evidence for the latter possibility. Of 201 codons, 36 differ by a single base from the consensus sequence. Twenty-two of the 36 changes (61%) occurred in the third base of the codon, and all of these changes resulted in silent mutations. In the absence of selective pressure against most alternative amino acids at these positions in the repeats, we would have expected less than one-third of the 36 changes to result in silent mutations. The observed result differed from random by P < 0.025 by the Cochran corrected chi-square analysis (70).

The homologies observed between the repeats of PspA and the other proteins imply a direction of evolution, i.e., PspA ← the lysins ← ToxA, Gbp, and GtfC. The repeats of PspA are the only ones to have recurring hydrophobic regions. As mentioned above, these regions may interact directly with the cell membrane and, in so doing, aid in anchoring PspA to the cell. In contrast, the lysins probably bind to the cell via interactions with membrane-associated components (7). Finally, the S. mutans, S. downei, and C. difficile proteins are all secreted from these bacteria. Thus, one of the functions of the repeats that may have been lost or gained through evolution is a mechanism for attachment to the bacterial cell. The presence of the highly conserved amino acids in all of the repeats suggests their involvement in some important common structural function.

As has been suggested for the pneumococcal lysins, all of these distinct proteins may be the result of modular evolution in which the N- and C-terminal domains represent two distinct functional molecules that have been fused (26, 28, 54). The C-terminal region of each of these proteins is thought to have a specific binding activity (17, 20, 26, 66).

Although the precise function of PspA is not yet known, it appears to have no relationship to the S. mutans, S. downei, and C. difficile proteins or to its closest homolog, the pneumococcal autolysin. PspA has no apparent lytic activity (68), PspA mutants remain autolysin positive and autolysin mutants remain PspA+ (68), and the effects on virulence of mutations in the two genes are clearly distinguishable (4, 45).
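The codon comparison reported above for the repeat region (22 silent third-base changes among 36 single-base differences, against an expectation of less than one-third) can be checked with a short calculation. The sketch below is illustrative only and is not the authors' computation: it assumes a null proportion of exactly one-third silent changes (the upper bound quoted in the text), treats the "Cochran corrected" analysis as a chi-square goodness-of-fit test with a continuity correction of 0.5, and uses hypothetical function and variable names. Under those assumptions the statistic exceeds the 1-degree-of-freedom critical value for P = 0.025, which is consistent with the reported significance.

```python
import math


def corrected_chi_square(observed, expected):
    """Chi-square goodness-of-fit statistic with a 0.5 continuity correction.

    Intended for two categories (1 degree of freedom).
    """
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


def chi_square_upper_tail_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Counts taken from the text.
total_changes = 36
silent_observed = 22

# Assumption: exactly one-third of changes are silent under the null.
# The text states only that "less than one-third" would be expected.
null_silent_fraction = 1.0 / 3.0

observed = [silent_observed, total_changes - silent_observed]
expected = [
    total_changes * null_silent_fraction,
    total_changes * (1.0 - null_silent_fraction),
]

statistic = corrected_chi_square(observed, expected)
p_value = chi_square_upper_tail_1df(statistic)

print(f"silent fraction observed: {silent_observed / total_changes:.0%}")
print(f"corrected chi-square (1 df): {statistic:.3f}")
print(f"upper-tail P: {p_value:.5f}")
```

Because the true null proportion is stated to be below one-third, using one-third here makes the test conservative: a smaller expected silent fraction would give a larger statistic.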

Conclusions. The results presented here indicate that PspA is a complex molecule with unique structural features and an ancestral link to a number of otherwise distinct bacterial and phage proteins. Several features of PspA, including its variability and potential coiled-coil structure, are reminiscent of the M proteins of the group A streptococcus. Earlier studies demonstrated the presence of a type-specific M-like protein (or proteins) on the surface of pneumococci (1). As with the PspA serotype, the pneumococcal M-like serotype varied independently of capsular type. In the absence of either isolated M protein or antibody to M protein from the earlier studies, we cannot rule out the possibility that PspA is the pneumococcal M protein of Austrian and MacLeod. However, experiments described previously (43) suggest that they are not the same molecule and that PspA differs from the streptococcal M protein. The studies presented here point to additional distinctions between PspA and the streptococcal M protein. In contrast to the M proteins, the α-helical coiled-coil region of PspA does not contain tandem repeats. Further, it is clear that the domains of PspA are distinctly different from those identified in M protein and in other surface proteins of gram-positive bacteria. PspA is the first surface protein of a gram-positive bacterium shown to lack the common C-terminal anchor domain found in all other such proteins. The presence of the repeat domain in the C terminus of PspA suggests that it is part of a different family of proteins, one in which unrelated N termini have been fused to functionally distinct yet obviously related C-terminal repeat regions. In S. pneumoniae, sequences similar to those of the repeats of PspA and the lysins may prove to be important in mediating attachment of many proteins to the pneumococcal cell, as such sequences appear to occur often in the S. pneumoniae chromosome (55).

The structure predicted by the deduced PspA sequence has revealed three, and possibly four, distinct structural domains that undoubtedly confer different functional properties on the protein. In the accompanying paper (69), we describe the generation and analysis of partial PspA products and their use in the localization of specific properties to these distinct domains.

ACKNOWLEDGMENTS

This work was supported by Public Health Service grants AI21548, AI28457, and HD17812 from the National Institutes of Health. Genetics Computer Group sequence analysis was made available through the computer resources of the University of Alabama at Birmingham Center for AIDS Research (P30 AI27767).

We thank Susan Hollingshead, William Benjamin, Jr., Joseph Dillard, and Larry McDaniel for their advice and comments on numerous aspects of these studies and Geraline Handsome for expert technical assistance in the subcloning of pspA fragments.

REFERENCES

1. Austrian, R., and C. M. MacLeod. 1949. A type-specific protein from pneumococcus. J. Exp. Med. 89:439-450.

2. Banas, J. A., R. R. B. Russell, and J. J. Ferretti. 1990. Sequence analysis of the gene for the glucan-binding protein of Streptococcus mutans Ingbritt. Infect. Immun. 58:667-673.

3. Barroso, L. A., S.-Z. Wang, C. J. Phelps, J. L. Johnson, and T. D. Wilkins. 1990. Nucleotide sequence of Clostridium difficile toxin B gene. Nucleic Acids Res. 18:4004.

VOL. 174, 1992

4. Berry, A. M., R. A. Lock, D. Hansman, and J. C. Paton. 1989. Contribution of autolysin to virulence of Streptococcus pneumoniae. Infect. Immun. 57:2324-2330.

5. Birnboim, H. C., and J. Doly. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7:1513-1523.

6. Brendel, V., and E. N. Trifonov. 1984. A computer algorithm for testing potential prokaryotic terminators. Nucleic Acids Res. 12:4411-4426.

7. Briese, T., and R. Hakenbeck. 1985. Interaction of the pneumococcal amidase with lipoteichoic acid and choline. Eur. J. Biochem. 146:417-427.

8. Briles, D. E., C. Forman, J. C. Horowitz, J. E. Volanakis, W. H. Benjamin, Jr., L. S. McDaniel, J. Eldridge, and J. Brooks. 1989. Antipneumococcal effects of C-reactive protein and monoclonal antibodies to pneumococcal cell wall and capsular antigens. Infect. Immun. 57:1457-1464.

9. Briles, E. I. B. 1974. Ph.D. thesis. The Rockefeller University, New York, N.Y.

10. Chou, P. Y., and G. D. Fasman. 1978. Prediction of secondary structure of proteins from their amino acid sequence. Adv. Enzymol. Relat. Areas Mol. Biol. 47:45-147.

11. Cohen, C., and D. A. D. Parry. 1990. α-Helical coiled-coils and bundles: how to design an α-helical protein. Proteins Struct. Funct. Genet. 7:1-15.

12. Crain, M. J., and D. E. Briles. Unpublished data.

13. Crain, M. J., W. D. Waltman II, J. S. Turner, J. Yother, D. K. Talkington, L. S. McDaniel, B. M. Gray, and D. E. Briles. 1990. Pneumococcal surface protein A (PspA) is serologically variable and is expressed by all clinically important capsular serotypes of Streptococcus pneumoniae. Infect. Immun. 58:3293-3299.

14. Crick, F. H. C. 1953. The packing of alpha-helices: simple coiled-coils. Acta Crystallogr. 6:689-697.

15. Deibel, R. H., and H. W. Seeley, Jr. 1974. Family II. Streptococcaceae fam. nov. 7. Streptococcus pneumoniae, p. 499-500. In R. E. Buchanan and N. E. Gibbons (ed.), Bergey's manual of determinative bacteriology. The Williams & Wilkins Co., Baltimore.

16. Devereux, J., P. Haeberli, and O. Smithies. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12:387-395.

17. Dove, C. H., S.-Z. Wang, S. B. Price, C. J. Phelps, D. M. Lyerly, T. D. Wilkins, and J. L. Johnson. 1990. Molecular characterization of the Clostridium difficile toxin A gene. Infect. Immun. 58:480-488.

18. Eisenberg, D. 1984. Three-dimensional structure of membrane and surface proteins. Annu. Rev. Biochem. 53:595-623.

19. Fahnestock, S. R., P. Alexander, J. Nagle, and D. Filpula. 1986. Gene for an immunoglobulin-binding protein from a group G streptococcus. J. Bacteriol. 167:870-880.

20. Ferretti, J. J., M. L. Gilpin, and R. R. B. Russell. 1987. Nucleotide sequence of a glucosyltransferase gene from Streptococcus sobrinus MFe28. J. Bacteriol. 169:4271-4278.

21. Ferretti, J. J., R. R. B. Russell, and M. L. Dao. 1989. Sequence analysis of the wall-associated protein precursor of Streptococcus mutans antigen A. Mol. Microbiol. 3:469-478.

22. Fischetti, V. A. 1989. Streptococcal M protein: molecular design and biological behavior. Clin. Microbiol. Rev. 2:285-314.

23. Fischetti, V. A., V. Pancholi, and O. Schneewind. 1990. Conservation of a hexapeptide sequence in the anchor region of surface proteins from gram-positive cocci. Mol. Microbiol. 4:1603-1605.

24. Fischetti, V. A., D. A. D. Parry, B. L. Trus, S. K. Hollingshead, J. R. Scott, and B. N. Manjula. 1988. Conformational characteristics of the complete sequence of group A streptococcal M6 protein. Proteins 3:60-69.

25. Gaillard, J.-L., P. Berche, C. Frehel, E. Gouin, and P. Cossart. 1991. Entry of L. monocytogenes into cells is mediated by internalin, a repeat protein reminiscent of surface antigens from gram-positive cocci. Cell 65:1127-1141.

26. Garcia, E., J. L. Garcia, P. Garcia, A. Arraras, J. M. Sanchez-Puelles, and R. Lopez. 1988. Molecular evolution of lytic enzymes of Streptococcus pneumoniae and its bacteriophages. Proc. Natl. Acad. Sci. USA 85:914-918.

27. Garcia, P., J. L. Garcia, E. Garcia, and R. Lopez. 1986. Nucleotide sequence and expression of the pneumococcal autolysin gene from its own promoter in Escherichia coli. Gene 43:265-272.

28. Garcia, P., J. L. Garcia, J. M. Sanchez-Puelles, and R. Lopez. 1990. Modular organization of the lytic enzymes of Streptococcus pneumoniae and its bacteriophages. Gene 86:81-88.

29. Garnier, J., D. J. Osguthorpe, and B. Robson. 1978. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.

30. Gilmore, K. S., R. R. B. Russell, and J. J. Ferretti. 1990. Analysis of the Streptococcus downei gtfS gene, which specifies a glucosyltransferase that synthesizes soluble glucans. Infect. Immun. 58:2452-2458.

31. Goldschmidt, R. M., M. Thoren-Gordon, and R. Curtiss III. 1990. Regions of the Streptococcus sobrinus spaA gene encoding major determinants of antigen I. J. Bacteriol. 172:3988-4001.

32. Guss, B., M. Uhlen, B. Nilsson, M. Lindberg, J. Sjoquist, and J. Sjodahl. 1984. Region X, the cell-wall-attachment part of staphylococcal protein A. Eur. J. Biochem. 138:413-420. (Author's correction, 143:685.)

33. Haanes, E. J., and P. P. Cleary. 1989. Identification of a divergent M protein gene and an M protein-related gene family in Streptococcus pyogenes serotype 49. J. Bacteriol. 171:6397-6408.

34. Hawley, D. K., and W. R. McClure. 1983. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 11:2237-2255.

35. Hodges, R. S., P. D. Semcuk, A. K. Taneja, C. M. Kay, and J. M. R. Parker. 1988. Protein design using model synthetic peptides. Pept. Res. 1:19-30.

36. Hodges, R. S., J. Sodek, L. B. Smillie, and L. Jurasek. 1972. Tropomyosin: amino acid sequence and coiled-coil structure. Cold Spring Harbor Symp. Quant. Biol. 37:299-310.

37. Hollingshead, S. K., V. A. Fischetti, and J. R. Scott. 1986. Complete nucleotide sequence of type 6 M protein of the group A streptococcus. J. Biol. Chem. 261:1677-1686.

38. Jameson, B. A., and H. Wolf. 1988. The antigenic index: a novel algorithm for predicting antigenic determinants. Comput. Appl. Biosci. 4:181-186.

39. Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.

40. Lupas, A., M. V. Dyke, and J. Stock. 1991. Predicting coiled-coils from protein sequences. Science 252:1162-1164.

41. Marck, C. 1988. 'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers. Nucleic Acids Res. 16:1829-1836.

42. McDaniel, L. S., G. Scott, J. F. Kearney, and D. E. Briles. 1984. Monoclonal antibodies against protease sensitive pneumococcal antigens can protect mice from fatal infection with Streptococcus pneumoniae. J. Exp. Med. 160:386-397.

43. McDaniel, L. S., G. Scott, K. Widenhofer, J. Carroll, and D. E. Briles. 1986. Analysis of a surface protein of Streptococcus pneumoniae recognized by protective monoclonal antibodies. Microb. Pathog. 1:519-531.

44. McDaniel, L. S., J. S. Sheffield, P. Delucchi, and D. E. Briles. 1991. PspA, a surface protein of Streptococcus pneumoniae, is capable of eliciting protection against pneumococci of more than one capsular type. Infect. Immun. 59:222-228.

45. McDaniel, L. S., J. Yother, M. Vijayakumar, L. McGarry, W. R. Guild, and D. E. Briles. 1987. Use of insertional inactivation to facilitate studies of biological properties of pneumococcal surface protein A (PspA). J. Exp. Med. 165:381-394.

46. McLachlan, A. D., M. Stewart, and L. B. Smillie. 1975. Sequence repeats in α-tropomyosin. J. Mol. Biol. 98:281-291.

47. Miller, L., L. Gray, E. Beachey, and M. Kehoe. 1988. Antigenic variation among group A streptococcal M proteins, nucleotide sequence of the serotype 5 M protein gene and its relationship with genes encoding types 6 and 24 M proteins. J. Biol. Chem. 263:5668-5673.

J. BACTERIOL.

48. Mouw, A. R., E. H. Beachey, and V. Burdett. 1988. Molecular evolution of streptococcal M protein: cloning and nucleotide sequence of the type 24 M protein gene and relation to other genes of Streptococcus pyogenes. J. Bacteriol. 170:676-684.

49. Pancholi, V., and V. A. Fischetti. 1988. Isolation and characterization of the cell-associated region of group A streptococcal M6 protein. J. Bacteriol. 170:2618-2624.

50. Perlman, D., and H. O. Halvorson. 1983. A putative signal peptidase recognition site and sequence in eukaryotic and prokaryotic signal peptides. J. Mol. Biol. 167:391-409.

51. Phillips, G. N., Jr., P. F. Flicker, C. Cohen, B. N. Manjula, and V. A. Fischetti. 1981. Streptococcal M protein: α-helical coiled-coil structure and arrangement on the cell surface. Proc. Natl. Acad. Sci. USA 78:4689-4693.

52. Radloff, R., W. Bauer, and J. Vinograd. 1967. A dye-buoyant-density method for the detection and isolation of closed circular duplex DNA: the closed circular DNA in HeLa cells. Proc. Natl. Acad. Sci. USA 57:1514-1520.

53. Robbins, J. C., J. G. Spanier, S. J. Jones, W. J. Simpson, and P. P. Cleary. 1987. Streptococcus pyogenes type 12 M protein gene regulation by upstream sequences. J. Bacteriol. 169:5633-5640.

54. Romero, A., R. Lopez, and P. Garcia. 1990. Sequence of the Streptococcus pneumoniae bacteriophage HB-3 amidase reveals high homology with the major host autolysin. J. Bacteriol. 172:5064-5070.

55. Sheffield, J. S., J. Yother, L. S. McDaniel, and D. E. Briles. Unpublished data.

56. Shine, J., and L. Dalgarno. 1974. The 3'-terminal sequence of Escherichia coli 16S rRNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA 71:1342-1346.

57. Shiroza, T., S. Ueda, and H. K. Kuramitsu. 1987. Sequence analysis of the gtfB gene from Streptococcus mutans. J. Bacteriol. 169:4263-4270.

58. Sodek, J., R. S. Hodges, B. Smillie, and L. Jurasek. 1972. Amino-acid sequence of rabbit skeletal tropomyosin and its coiled-coil structure. Proc. Natl. Acad. Sci. USA 69:3800-3804.

59. Talkington, D. F., D. L. Crimmins, D. C. Voellinger, J. Yother, and D. E. Briles. 1991. A 43-kilodalton pneumococcal surface protein, PspA: isolation, protective abilities, and structural analysis of the amino-terminal sequence. Infect. Immun. 59:1285-1289.

60. Tomasz, A. 1981. Surface components of Streptococcus pneumoniae. Rev. Infect. Dis. 3:190-211.

61. Ueda, S., T. Shiroza, and H. K. Kuramitsu. 1988. Sequence analysis of the gtfC gene from Streptococcus mutans GS-5. Gene 69:101-109.

62. Uhlen, M., G. Guss, B. Nilsson, S. Gatenbeck, L. Philipson, and M. Lindberg. 1984. Complete sequence of the staphylococcal gene encoding protein A. J. Biol. Chem. 259:1695-1702. (Author's correction, 259:13628.)

63. von Heijne, G. 1983. Patterns of amino acids near signal-sequence cleavage sites. Eur. J. Biochem. 133:17-21.

64. Vos, P., G. Simons, R. J. Siezen, and W. M. deVos. 1989. Primary structure and organization of the gene for a prokaryotic, cell envelope-located serine protease. J. Biol. Chem. 264:13579-13585.

65. Waltman, W. D., II, L. S. McDaniel, and D. E. Briles. 1990. Variation in the molecular weight of PspA (pneumococcal surface protein A) among Streptococcus pneumoniae. Microb. Pathog. 8:61-69.

66. Wren, B. W. 1991. A family of clostridial and streptococcal ligand-binding proteins with conserved C-terminal repeat sequences. Mol. Microbiol. 5:797-803.

67. Yanisch-Perron, C., J. Vieira, and J. Messing. 1985. Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33:103-119.

68. Yother, J. Unpublished data.

69. Yother, J., G. L. Handsome, and D. E. Briles. 1992. Truncated forms of PspA that are secreted from Streptococcus pneumoniae and their use in functional studies and cloning of the pspA gene. J. Bacteriol. 174:610-618.

70. Zar, J. H. 1984. Biostatistical analysis. Prentice-Hall, Inc., Englewood Cliffs, N.J.
