Evolution of Aminobenzoate Synthases: Nucleotide Sequences of Salmonella typhimurium and Klebsiella...

18
Evolution of Aminobenzoate Synthases: Nucleotide Sequences of Salmonella typhimurium and Klebsiella aerogenes pabB 1 Paul Goncharoff 2 and Brian P. Nichols Laboratory for Cell, Molecular, and Developmental Biology, Department of Biological Sciences, University of Illinois at Chicago p_Aminobenzoate synthase (PS) and anthranilate synthase (AS) are structurally related enzymes that catalyze similar reactions and produce similar products, pam- and o&o-aminobenzoate (anthranilate) . Each enzyme is composed of two non- identical subunits: a glutamine amidotransferase subunit (Co11 ) and a subunit that produces an aminobenzoate product ( CoI) . Nucleotide sequence comparisons of the Escherichia coli genes encoding each of the subunits suggest a common evo- lutionary origin for both subunits of the enzyme complexes. We report here the nucleotide sequences of the pabB genes that encode Salmonella typhimurium and Klebsiella aerogenes PS CoI. Comparative sequence information suggests that pabB is encoded as the first gene in a multicistronic transcript. Comparison of deduced amino acid sequences of PS Co1 genes indicates that the majority of sequence identity occurs in the C-terminal two-thirds of the proteins. Similarly, identities in an alignment of eight PS and AS Co1 sequences are confined to the C-terminal segments of the proteins. Secondary-structure predictions for the nine sequences suggest considerable similarity in the folding of the C-terminal portions of the aminobenzoate synthases. Introduction The study of p-aminobenzoate synthase (PS) and anthranilate synthase (AS) presents a model system for the study of genes and proteins that are evolutionarily related but have diverged to carry out different enzymatic functions. Both PS and AS consist of two nonidentical subunits, component I (CoI) and component II (CoII). In each synthase complex, Co11 provides a glutamine amidotransferase activity whose function is to transfer the -NH2 group from glutamine to CoI, which then uses the -NH2 to form an aromatic aminobenzoate product. In the absence of CoII, Co1 can use free NH3 to form the aromatic product (Ito et al. 1968; S. Doktor and B. P. Nichols, unpublished results). PS and AS share at least a portion of the reaction mechanism from substrate to product (Teng et al. 1985). The structural relationships between the PS and AS glutamine amidotransferase (CoII) subunits have been recently documented (Kaplan and Nichols 1983; Kaplan et al. 1984, 1985). The Co11 subunits perform identical roles in the aminobenzoate 1. Key words: p-aminohenzoate synthase, anthranilate synthase, duplication. 2. Current address: Department of Microbiology, College of Physicians and Surgeons of Columbia University, 701 West 168th Street, New York, New York 10032. Address for correspondence and reprints: Dr. Brian P. Nichols, Laboratory for Cell, Molecular, and Developmental Biology, Department of Biological Sciences, University of Illinois at Chicago, P.O. Box 4348, Chicago, Illinois 60680; telephone (3 12) 996-5064. Mol. Biol. Evol. 5(5):531-548. 1988. 0 1988 by The University of Chicago. All rights reserved. 0737-4038/88/MOS-0006.$02.00 531 by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from by guest on October 9, 2013 http://mbe.oxfordjournals.org/ Downloaded from

Transcript of Evolution of Aminobenzoate Synthases: Nucleotide Sequences of Salmonella typhimurium and Klebsiella...

Evolution of Aminobenzoate Synthases: Nucleotide Sequences of Salmonella typhimurium and Klebsiella aerogenes pabB 1

Paul Goncharoff 2 and Brian P. Nichols Laboratory for Cell, Molecular, and Developmental Biology, Department of Biological Sciences, University of Illinois at Chicago

p_Aminobenzoate synthase (PS) and anthranilate synthase (AS) are structurally related enzymes that catalyze similar reactions and produce similar products, pam- and o&o-aminobenzoate (anthranilate) . Each enzyme is composed of two non- identical subunits: a glutamine amidotransferase subunit (Co11 ) and a subunit that produces an aminobenzoate product ( CoI) . Nucleotide sequence comparisons of the Escherichia coli genes encoding each of the subunits suggest a common evo- lutionary origin for both subunits of the enzyme complexes. We report here the nucleotide sequences of the pabB genes that encode Salmonella typhimurium and Klebsiella aerogenes PS CoI. Comparative sequence information suggests that pabB is encoded as the first gene in a multicistronic transcript. Comparison of deduced amino acid sequences of PS Co1 genes indicates that the majority of sequence identity occurs in the C-terminal two-thirds of the proteins. Similarly, identities in an alignment of eight PS and AS Co1 sequences are confined to the C-terminal segments of the proteins. Secondary-structure predictions for the nine sequences suggest considerable similarity in the folding of the C-terminal portions of the aminobenzoate synthases.

Introduction

The study of p-aminobenzoate synthase (PS) and anthranilate synthase (AS) presents a model system for the study of genes and proteins that are evolutionarily related but have diverged to carry out different enzymatic functions. Both PS and AS consist of two nonidentical subunits, component I (CoI) and component II (CoII). In each synthase complex, Co11 provides a glutamine amidotransferase activity whose function is to transfer the -NH2 group from glutamine to CoI, which then uses the -NH2 to form an aromatic aminobenzoate product. In the absence of CoII, Co1 can use free NH3 to form the aromatic product (Ito et al. 1968; S. Doktor and B. P. Nichols, unpublished results). PS and AS share at least a portion of the reaction mechanism from substrate to product (Teng et al. 1985).

The structural relationships between the PS and AS glutamine amidotransferase (CoII) subunits have been recently documented (Kaplan and Nichols 1983; Kaplan et al. 1984, 1985). The Co11 subunits perform identical roles in the aminobenzoate

1. Key words: p-aminohenzoate synthase, anthranilate synthase, duplication. 2. Current address: Department of Microbiology, College of Physicians and Surgeons of Columbia

University, 701 West 168th Street, New York, New York 10032.

Address for correspondence and reprints: Dr. Brian P. Nichols, Laboratory for Cell, Molecular, and Developmental Biology, Department of Biological Sciences, University of Illinois at Chicago, P.O. Box 4348, Chicago, Illinois 60680; telephone (3 12) 996-5064.

Mol. Biol. Evol. 5(5):531-548. 1988. 0 1988 by The University of Chicago. All rights reserved. 0737-4038/88/MOS-0006.$02.00

531

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

by guest on October 9, 2013

http://mbe.oxfordjournals.org/

Dow

nloaded from

532 Goncharoff and Nichols

synthase complexes, and specificity appears to be provided solely through subunit associations. The relationship between PS and AS is further underscored by the ob- servation that some bacteria (Acinetobacter calcoaceticus [ Sawula and Crawford 19731, Pseudomonas acidovorans [ Buvinger et al. 198 11, and Bacillus subtilis [Kane 19771) express a single amphibolic Co11 subunit which functions in both p-aminobenzoate (PABA) and anthranilate synthesis. The similarity in the nucleotide and predicted amino acid sequences among Co11 genes from diverse sources reflects the identical reactions they perform and suggests that they are paralogous (Kaplan and Nichols 1983; Kaplan et al. 1984, 1985).

The structural relationship between PS and AS Co1 was first inferred from the ability of antibodies raised against AS to cross-react with fractionated extracts con- taining PS (Reiners et al. 1978 ) . More recently, the nucleotide and amino acid sequence relationships between the Escherichia coli PS and AS Co1 genes have been clarified (Goncharoff and Nichols 1984). The Co1 proteins are less similar than their Co11 counterparts (26% identity compared with 44% identity, respectively) and, rather than showing similarity throughout the sequences, are identical primarily in the C-terminal two-thirds of the proteins. The sequence similarity most likely results from portions of the sequence that are responsible for catalyzing similar portions of the reaction sequence from substrate to product. As with CoII, the sequence similarities suggest paralogy .

To further our understanding of the relationship between PS and AS, we have determined the nucleotide sequences of the pabB genes for PS from Salmonella ty- phimurium and Klebsiella aerogenes. Comparisons of the flanking sequences of these two enteric pabB genes with the E. coli pabB flanking sequences (Goncharoff and Nichols 1984) suggest that pabB is transcribed and translated as the first gene in a polycistronic mRNA and that another gene is transcribed from a promoter oriented oppositely from the pabB promoter.

Comparison of the deduced amino acid sequences of the pabB genes shows that PS Co1 sequences are more divergent than many other orthologous proteins from these enteric bacteria. Alignment of eight amino acid sequences derived from PS and AS Co1 gene sequences reveals several regions of identity clustered in the C-terminal portions of the sequences. Secondary-structure predictions on each of the sequences also suggest similarities in polypeptide folding. The data are consistent with the hy- pothesis that PS and AS are ( 1) paralogous and ( 2) have diverged to play different roles in intermediary metabolism.

Material and Methods Media

Complete media consisting of 1% tryptone, 0.5% yeast extract, and 0.5% NaCl along with a minimal salt medium (Vogel and Bonner 1956) were used routinely for both liquid broth cultures and solid media. Supplements added to media were at the following concentrations: amino acids, 20 pg/ml; PABA and thiamine, 1 pg/ml; 5- bromo-4-chloro-3-indolyl-/.3-D-galactopyranoside ( X-gal), 40 pg / ml; isopropyl-p-r>- thiogalactoside (IPTG), 0.4 mM; glucose, 0.4%; ampicillin, 50 pg/ml.

Bacterial Strains, Phages, and Plasmids

The Escherichia coli strains used in the present study were E. coli AB3303 (pabB.3) (Huang and Pittard 1967 ) and JM 103 (Messing et al. 198 1). M 13 phages employed in sequence determination were Ml 3mp8, M 13mp9 (Messing and Vieira 1982),

Enteric pabB Sequences 533

M13mp18, and M13mp19 (Yanisch-Perron et al. 1985). Ml3mp8/MZuI (Stanley 1983) was obtained from J. Stanley. A h library containing SaZmoneZZa typhimurium genomic DNA was obtained from Robert Lawther. A plasmid library containing Kleb- siella aerogenes genomic DNA and pBR322 vector was obtained from Ronald Law.

Enzymes and Biochemicals

Restriction endonucleases were purchased from New England Biolabs or Boeh- ringer-Mannheim or were prepared in our laboratory by published procedures. Esch- erichia coli DNA polymerase I (Klenow fragment), S 1 nuclease, calf intestinal phos- phatase, and terminal transferase were purchased from Boehringer-Mannheim. T4 DNA polymerase was purchased from International Biotechnologies, Inc. T4 poly- nucleotide kinase was a gift from A. Berk. [y-32P]ATP (>2,000 mCi/p.mol) and [ c+~*P]~CTP (800 mCi/pmol) were purchased from Amersham.

DNA Manipulations

Restriction digests, ligations, and gel electrophoresis experiments were carried out according to the method of Maniatis et al. (1982). Plasmid DNA was routinely transformed into E. coli cells according to a method described by Mandel and Higa ( 1970). Small-scale plasmid DNA preparations were produced according to the Bim- boim and Doly ( 1979) method. When large-scale ( 1-mg) quantities of plasmid were needed, the method of choice was that of Guerry et al. ( 1973). Fragment elution from acrylamide gels was performed according to the crush-and-soak method of Maxam and Gilbert ( 1980 ) .

DNA Sequence Analysis

DNA sequence determination was performed according to the chain termination method of Sanger et al. (1977). Isolation of specific single-stranded templates was performed according to a method described by Messing et al. ( 198 1). Plasmids pMS 1 (fig. 1) and pMB465 (fig. 2) served as the source of DNA fragments that were cloned into various Ml 3 phage templates. All of the nucleotide sequence presented in the present work was determined on both strands of the DNA with all restriction sites overlapped. In generating several clones, the single-strand deletion method described by Dale et al. ( 1985 ) was employed.

Computer Analysis

DNA and amino acid sequences were analyzed using computer programs pur- chased from IBI and Hitachi. Secondary-structure predictions were done according to the method of Gamier et al. ( 1978 ) by using software obtained from Intelligenetics.

Results Nucleotide Sequence Determination

Salmonella typhimurium pabB was isolated from a pool of hybrid hgt7 phage containing EcoRI fragments of S. typhimurium genomic DNA. A phage containing pabB was identified by its ability to transduce Escherichia coli AB3303 (pabB3) to PABA independence. The DNA from this phage (hpabB) was isolated and digested with EcoRI. A 7.6-kb EcoRI fragment subcloned into pBR322 (pFE 1) conferred PABA independence on E. coli AB3303. An EcoRI-Sal1 subclone in pBR322 (pMS1) con- tained 4.8 kb of S. typhimurium DNA that complemented the pabB mutation. The location of S. typhimurium pabB was determined by DNA sequence analysis. A 3.2-

534 Goncharoff and Nichols

FIG. 1 .-DNA sequencing strategy for Salmonella typhimurium pabB. The upper portion of the figure illustrates the restriction maps of subclones containing pabB (open arrows). Below are shown ( 1) the fragments that were subcloned into M 13 vectors and (2) the direction and extent of sequence analysis. The dashed arrow in one TaqI fragment indicates a sequence determined using a 126-bp 7’aqI restriction fragment (immediately to the right of the arrow) as a sequencing primer. Heavy dotted lines represent DNA deleted from mPG103 (a phage containing the M/u1 fragment).

kb iWu1 fragment containing the 5’ portion of pabB at one of its ends was isolated and ligated to MZuI-digested M 13mp8 /iWuI (Stanley 1983). Recombinant phages containing the 3.2-kb MZuI fragment present in both orientations were isolated (mPG 103 and mPG 104 ) . Nucleotide sequences obtained from mPG 104 were iden- tified as pabB by comparison with the E. coli pabB sequence. The sequencing strategy of the entire S. typhimurium pabB gene is shown in figure 1. Most of the DNA sequence was obtained from HinPI, HpaII, and TaqI subclones derived from the 3.2-kb AccI fragment. Additional sequence data were determined from several deletions of mPG 103 constructed according to the rapid single-strand deletion method described by Dale et al. (1985) or by using the 126-bp TaqI fragment as a primer for template containing the 902-bp HpaII fragment (fig. 1, dashed line).

Klebsiella aerogenes pabB was isolated from a pBR322 library containing K. aerogenes chromosomal DNA. BamHI digests of K. aerogenes genomic DNA and plasmid pBR322 were ligated and transformed into E. coli AB3303. One plasmid conferring PABA independence and ampicillin resistance, pRLpabB, contained a 12- kb BamHI fragment in the BamHI site of pBR322 (fig. 2). The location of the pabB gene was deduced from subclones pPG 10 and pMB465. Subclone pPG 10 carried an 8-kb EcoRI- BamHI fragment, while pMB465 contained an overlapping 7-kb BamHI- Sal1 fragment. All three plasmids-pRLpabB, pPGl0, and pMB465-contained in common a 2.5-kb EcoRI-Sal1 fragment. The location ofpabB within the EcoRI-Sal1 fragment was confirmed by Southern hybridization experiments using E. coli pabB

Enteric pabB Sequences 535

FIG. 2.--DNA sequencing strategy for Klebsiella aerogenes pabB. The upper portion of the figure illustrates the restriction maps of subclones containing pabB (open arrows). Below are shown ( 1) the fragments that were subcloned into M 13 vectors and (2) the direction and extent of sequence analysis.

as a probe (data not shown). The nucleotide sequence of K. aerogenes pabB was determined from M 13mp8 or M 13mp 18 ( Yanisch-Perron et al. 1985 ) subclones con- taining fragments derived from the 2.5-kb EcoRI-Sal1 fragment. The nucleotide se- quence of a 66-bp region on one of the strands was determined by using the 193-bp HpaII fragment to prime synthesis from a template containing the 507-bp Sau3A fragment.

Comparison of pabB Sequences

The nucleotide sequence and deduced amino acid sequence alignment of E. coli, S. typhimurium, and K. aerogenes pabB are presented in figure 3. The pabB sequences are essentially colinear, differing only at the extreme 5 ‘-terminus, at the position of the presumed initiating codons. The pairwise comparisons between sequences of E. coli, S. typhimurium, and K. aerogenes pabB are consistent with 5s RNA sequence comparisons that indicate that E. coli, S. typhimurium, and K. aerogenes speciated at approximately the same time (Hori and Osawa 1979).

Table 1 summarizes the nucleotide and amino acid sequence differences among the three pabB genes. As might be expected, most of the nucleotide substitutions (57%-60%) among the three pairwise sequence comparisons occur in the third position of codons and are randomly distributed throughout the sequence. However, approx- imately two-thirds of the base substitutions in the first and second codon positions (as well as amino acid replacements) are found in the first 40% (to residue 177 ) of

30 60 90 Kc --- ATG AAG ACG TTA TCT CCC GCT GTG ATT ACT TTA CTC TGG CGT CAG GAC CCC GCT GAA TTT TAT TTC TCC CCC TTA AGC CAC CTG CCC St ATG A T C C C C CAC TGG C G A TT C Ka ___ ___ ___ T G C AA TCCG C CC T AC T AC CC TTC A C

120 150 100 TGG GCG ATG CTT TTA CAC TCC GGC TAT GCC GAT CAT CCG TAT AGC CCC TTT CAT ATT GTG GTC CCC GAG CCC ATT TGC ACT TTA ACC ACT

GCG T A TG T AT C A T C T TGCAC ACG G GCG T T T A CC A T C CAT C CCC CC G C G GTG

210 240 270 TTC GGT AAA GAA ACC GTT GTT ACT GAA AGC GAA AAA CCC ACA ACG ACC ACT CAT GAC CCG CTA CAG GTG CTC CAG CAG GTG CTG GAT CCC CC CO C G ACG A T ACT CCC CCC CC ACT C GTC T CTC C T T C TTG ACT CAA G GCG CA C CT ACC C A C C GG CC GT T CT CC CC T TAC C GG T CA CC A

300 330 360 GCA GAC ATT CGC CCA ACG CAT AAC GAA CAT TTG CCA TTT CAG GGC GGC GCA CTG GGG TTG TTT GGC TAC CAT CTG CCC CCC CGT TTT GAG CTG CCT T AT TCA C G CCT C A C G TC T T TA G G C A TGT A CC AG G CA C C T CCG C C G TC G AC C T T T A

390 420 450 TCA CTG CCA GAA ATT CCC GAA CAA GAT ATC GTT CTG CCC OAT ATG GCA GTG GGT ATC TAC CAT TGG GCG CTC ATT GTC GAC CAC CAG CGT ATT T C T C C CC GC C TA C CAT CCT C G T A AAA CAT G CG CGG C C CCC AA G A T T C TA C T T C

480 510 540 CAT ACA GTT TCT TTG CTG ACT CAT AAT CAT GTC AAT CCC CGT CGG CCC TOG CTG GAA AGC CAG CAA TTC TCG CCC CAG GAA GAT TTC ACG

GGTG G CC AA CT CC CA G C A TAT CC C ACC CC GCG C A C GC ACG CCC G CC GAG A G ATC CT CC CCC C G G C T G GCA ACT GCG C T GTC GCT C ACC T C

570 600 630 CTC ACT TCC GAC TGG CAA TCC AAT ATG ACC CGC GAG CAG TAC GGC GAA AAA TTT CGC CAG GTA CAG GAA TAT CTG CAC AGC GGT GAT TGC

T C AC G T TGC G T G G T G CC GG G C G C A CC CC G T G G T G G C AT CT c* c

660 690 720 TAT CAG GTG AAT CTC CCC CAA CGT TTT CAT GCG ACC TAT TCT GGC CAT GAA TGG CAG GCA TTC CTT CAG CTT AAT CAG CCC AAC CGC GCG

C TT G G G C GAG T T GAA CC C CC T C C C G G G G ACG G CCC G G C G C CC T

750 780 810 CCA TTT AGC GCT TTT TTA CGT CTT GAA CAG GGT GCA ATT TTA AGC CTT TCG CCA GAG CGG TTT ATT CTT TGT GAT AAT ACT GAA ATC CAG

G C C CT TACTGC C C A G T C T C AA CTG G G CT G C C AT C G TGA G CGA G C T AG CTC CGG C A G C C T

840 870 900 - ACC CCC CCC ATT AAA GGC ACG CTA CCA CCC CTG CCC GAT CCT CAG GAA GAT AGC AAA CAA GCA GTA AAA CTG GCG AAC TCA GCG AAA CAT

G C T T G T AA G CC CTCG G G CAG T T AT G C C G G AT CAG G C G CG C GCG CTG G CAG CA G GCA C

930 960 990 CGT CCC GAA AAT CTG ATG ATT GTC CAT TTA ATG CGT AAT CAT ATC GGT CGT GTT GCC GTA GCA GGT TCG GTA AAA GTA CCA GAG CTG TTC

C T T G C T C G C C G G G A C C C T C G G A CC AGC T CGC T T T

1020 1050 1080 GTG GTG GAA CCC TTC CCT GCC GTG CAT CAT CTG GTC AGC ACC ATA ACG'GCG CAA CTA CCA GAA CAG TTA CAC CCC AGC CAT CTG CTG CGC

c c A T T C T T C C GTT G CTCCT T TC C G C G CC C C AGG G C CT TCG CC C

1110 1140 1170 GCA GCT TTT CCT GGT GGC TCA ATA ACC GGG GCT CCC AAA GTA CGG GCT ATG GAA ATT ATC GAC GAA CTG GAA CCC CAG CGA CGC AAT CCC

G c c c C T C G T G A G C G C c c C C G G CAA G G C T G C G

1200 1230 1260 TGG TGC CCC AGC ATT GGC TAT TTG AGC TTT TGC GGC AAC ATG GAT ACC AGT ATT ACT ATC CGC ACG CTG ACT GCC ATT AAC CGA CAA ATT

T C T C T C G CCC G A CGCG C cc C cc C G C T G TGG C G ccc

1290 __13Jo . 1350 TTC TGC TCT GCG GGC GGT CGA ATT GTC GCC CAT AGC CAG GAA GAA GCG GAA TAT CAG GAA ACT TTT GiiT AAA G?T AAT CCT ATC CTG AAG AT A C C T C G AC cc A c c C G GA GC CT

CAA CTG GAG AAG TAA C C

GC G

FIG. 3.-The nucleotide sequences of Escherichia coli, Salmonella typhimurium, and Klebsiella aerogenes pabB. The E. coli sequence (Goncharoff and Nichols 1984) is shown in its entirety, while only differences from that sequence are shown for S. typhimurium and K. aerogenes. Numbers begin at the putative initiation codon of the S. typhimurium sequence. The dot at nucleotide 1328 represents an axis of dyad symmetry that extends through the overlined nucleotides. Hyphens indicate gaps introduced to align sequences. Corresponding amino acid sequences are given in fig. 5. Ec = E. coli; St = S. typhimurium; Ka = K. aerogenes.

538 Goncharoff and Nichols

Table 1 Nucleotide and Amino Acid Differences among pabB Genes

EC-St Ec-Ka St-Ka

Nucleotide differences: First codon position . . . . . Second codon position . . . Third codon position . . . .

Total . . . . . . . . . . . . . . Total amino acid differences .

83 (22) 92 (24) 97 (25) 65 (17) 81 (21) 69 (18)

225 (60) 222 (58) 219 (57) 373 (23) 385 (29) 385 (29) 106 (23) 117 (26) 112 (25)

NOTE.-Numbers in parentheses indicate percent of total differences. EC = Escherichiu coli; St = Salmonella typhimurium; Ka = Klebsiella aerogenes.

the sequences, indicating a higher degree of divergence and, presumably, relaxed func- tional constraints in the early portion of the sequence.

One striking region of exact nucleotide sequence identity among all three se- quences is found at nucleotide positions 1305- 1347. The lack of substitutions within a 43-bp sequence does not appear to result from codon usage constraints as dictated by the amino acid sequence; however, the region does contain a sequence that might contribute to the formation of a stable secondary structure at either the mRNA or DNA level ( overlined in fig. 3 ) . As discussed below, comparison of the 3’-flanking regions has revealed the presence of an open reading frame. It is conceivable that the conserved sequence or structure within the coding region ofpabB affects the expression of sequences downstream of pabB, although at present no experimental evidence is available to support this hypothesis.

The pattern of codon usage among the three enteric sequences is similar to and consistent with that of genes that are not expressed at high levels (Grantham et al. 198 1). Most of the codons that are seldom used in E. coli genes are also rare among the pabB sequences. These include codons AGA/G, CGA (Arg), ATA (Ile), and CTA (Leu). The infrequent use of these codons is consistent with the low levels of corresponding cognate tRNA molecules in E. co/i (Ikemura 198 1).

Amino Acid Sequence Divergence among PS Components I

The amount of amino acid sequence divergence in pairwise comparisons of the enterobacterial PS Co1 sequences ranges from 23% to 26% (table 1). The total amount of dissimilarity among the PS Co1 sequences is high compared with that of other orthologous proteins from these organisms. For example, alignments of amino acid sequences encoded by E. coli and S. typhimurium up, metJ, trpB, trpG, ompA, araC, and trpE genes display OS%-12.5% divergence (see Cossart et al. 1986 and references within). Comparisons of amino acid sequences encoded by E. coli, 5’. typhimurium, and K. aerogenes trpA and pabA indicate 12.7%-16% and 15%-20% divergence, re- spectively (Nichols et al. 198 1 a; Kaplan et al. 1985). These data suggest that PS Co1 is able to tolerate a greater number of amino acid replacements than do several other proteins. As mentioned above, nearly two-thirds of the amino acid replacements occur in the N-terminal 40% of the sequence, and most of these are clustered into discrete regions. Analysis of the amino acid replacements within these regions does not reveal any bias for conservative or nonconservative replacements with respect to the four functional amino acid classes (i.e. nonpolar, uncharged polar, acidic, and basic). Pre-

540 Goncharoff and Nichols

sumably the N-terminal region of the polypeptide function as is the C-terminal segment.

is not as critically important for PS

Comparison of pabB 5 ‘-flanking Regions

An open reading frame 73 bp upstream and oriented in the opposite direction from E. coli pabB is conserved in all three organisms. Presumably the 73-bp intercis- tronic region shown in figure 4a contains two divergently oriented promoter sequences. S 1 nuclease mapping of E. coli RNA has identified the initiating point ofpabB mRNA near position -48. A sequence with similarity to the E. coli consensus promoter se- quence (Hawley and McClure 1983) lies immediately upstream. The relative positions of these sequences and the 5 ‘-terminus of an mRNA suggests that transcription ini- tiation occurs at position -48. The strong conservation of the -35 and - 10 regions in S. typhimurium and K. aerogenes suggests that these sequences serve as pabB pro- moters as well. It is interesting that the -35 region of these promoters lies within the upstream coding region.

Sl nuclease mapping has identified the 5’-termini of two oppositely oriented transcripts. The two transcripts initiate at positions -5 1 and +2 1. Each of these signals is preceded by sequences similar to the - 10 and -35 regions of the E. coli consensus promoter sequence (fig. 4a).

Comparison of the pabB 3’4lanking Regions

A comparison of pabB 3’4anking regions (fig. 4b) reveals another conserved open reading frame. The initiation codon of the downstream reading frame is GTG in E. coli and S. typhimurium, while the more common ATG codon is present in K. aerogenes. An intercistronic spacing of 3 bp is conserved between the two genes in all three organisms. Available sequence data indicate that the reading frame extends for at least 600 bp beyond the sequence shown in figure 4b. Transcriptional fusions between sequences downstream of pabB and ZacZ have been constructed (data not shown) and indicate that the region is transcribed into mRNA. The origin of the transcriptional activity may result either from a polycistronic message containing both pabB and downstream sequences or from transcription initiation with pabB. Taken together, the presence of transcription, the conservation of the coding region, and the presence of a Shine-Dalgarno sequence ( 5’-GGAG-3’) 9 bp upstream of the coding region strongly suggest that a protein product is encoded by sequences immediately down- stream of pabB.

PS and AS Co1 Sequence Comparisons

The amino acid sequence comparisons described above have been extended to include all known PS Co1 and AS Co1 sequences. These sequences include (1) the three enteric PS Co1 sequences resulting from this work and (2) AS Co1 from two enteric bacteria, two Gram-positive bacteria, and yeast. The alignment in figure 5 has been constructed manually and is based in part on alignments reported elsewhere (Goncharoff and Nichols 1984). Additional gaps were required to align AS Co1 se- quences. In all, 15 gaps have been inserted into the sequences.

Forty-six conserved amino acid residues are present among all eight Co1 sequences (fig. 5, boxed regions). The occurrence of identities among all of these sequences provides further support for the common ancestry ofpaminobenzoate and anthranilate synthases. A similar although more pronounced pattern of amino acid conservation is apparent in comparisons of E. coli PS Co1 and AS Co1 (Goncharoff and Nichols

E. s. X. 6. S. B. B. S.

20 40 coli pabB MKTLSPAVI typhimorium pabB MHKTLSPTVI aerogenes pabB MLSPAMI coli trpE MQTQKPTLELLTCEGAYRDNPTALFHQLCGDRPAT-LLLESADIDSKDDL typhiouriurtrp~MQTeKPTLELLTCDAAYRENPTALFHQVCGDRPAT-LLLESADIDSKDDL subtilis trpE MNFQSNISAFLEDS ferment8 turn trpE MSTNPHVFSLDVRYHEDASALFAHLGGTTADDAALLESADITTKNGI cerevisiae trp2 MTASIKIQPDIDSL- DQLQQQNDDSSINM

60 80 100 TLLWRQDAAEFYFSRLSHLPWAMLLHSGYADHPYSRFDIVVAEPICTLTTFGKETVVSES TLPWRPDAAEHYFAPVNHLPWAMLLHSGDAIHPYNRFDILVADPVTTLTTRAQETTVCTA SLPWRPDAAEYYFSPLSSQPWAMLLHSGFAEHAHNRFDIIVAQPRATLVTHGQLTTLREG KSLLLVDSALRITALGDTVTIQALSGNGESLLPLLDNALPAGVESEQSPNCRVLRFPPVS KSLLLVDSALRFTALGDTVTSQALSDNGASLLPLLDTALPAGVRNDVLPAGRVLRFPPVS LSHHTPIPIVETFTVDTLTPIQMIEKLDREITYLLESKDDTSTWSRTSFIGLNPFLTIKE SSLAVLKSSVRITCTGNTVVTQPLTDSGRAVVARLTQQLGQYNTAENTFS------FP~S YPVYAYLPSLDLTPHVAYLKLAQLNNPDRKESFLLESAKTNNELDRYSFIGISPRKTIKT

120 140 20 EKRTTTTDDPLQVLQQVLDRAD---------- IRPTHNEDLPFQGGALGLFGYDtGRRFB RTTTVTLDDPLHVLQTQLEALP-------- --FHPQPDPDLPFQGGALGLFGYDLGRRFE ETVSTSAADPLTLVHQQLAHCN---------- LQPQPHPHLPFLGGALGLFGYDLGRRFE PLLDEDARLCSLSVFDAFRLLQ--------- -NLLNVPKEEREAMFFSGLFSYDLVAGFE PLLDEDARLCSLSVFDAFRLLQ--------

II 1 i

--GVVNIPTQEREAMFFGGLFAYDLVAGFE EQGRFSAADQDSKSLYTGDELKEVLNWMNTTYKIKTPELGIPFVGGAVGYLSYDMIPLIR DAVDERERLTAPSTIEVLRKLQ---------FESGYSDASLPL---LMGGFAFDFLETFE CPTEGIETDPLEILEKEMSTFK-------- --VAENVPGLPKLSGGAIGYISYDCVRYFE

L_.l 180 200 220

SLPEI -AEQDIVLPDMAVGIYDWALIVDHQRHTVSLLSHNDV--------------NARR ILPDT -AARDIALPDMAIGLYDWALIVDHQKQVVSLISYHDA--------------DARY HLPAR -ADADIELPDMAVGIYDWALIVDHQRREVSLFSYDDP--------------QARL DLPQLSAENNC --PDFCFYLAETLMVIDHQKKSTRIQASLFAPNEEEKQRLTARLNELR- ALPHHEAGNNC --PDYCFYLAGTLMVIDHRKKSTRIQASLFTASDREKQRLNARLAYLS- PSVPSHTKETDM- EKCMLFVCRTLIAYDHETKNVHFIQYARLTGEETKNEKMDVFHQNH- TLPAV -EESVNTYPDYQFVLAEIVLDINHQDQTAKLTGVSNAPGELEAELNKLSLLIDAA PKTRRPLKDVLRLPEAYLMLCDTIIAFDNVFQRFQIIHNINTNETSLEEGYQAAAQII-T

-ar-L~-l

240 260 280 AWLESQQFSPQED-------------- FTLTSDWQSNMTREQYCEKFRQVQEYLHSGDCY RWLTSQRAPTRTP-------------- FRLTSAWQSNMTRCEYGEKFRQVQAWLHSGDCY AWLEAQTAPVAAT-------------- FTLTSAWRANMSREEYGEKFRQIQAYLHSGDCY QQLTEAAPPLPVV-------------- SVPHMRCBCNQSDEEFGGVVRLLQKAIRAGEIF QQLTQPAPPLPVT-------------- PVPDMRCECNQSDDAFGAVVRQLQKAIRAGEIF LELQNLIEKMMDQKNIKELFLSADSYKTPSFETVSSNYEKSAFMADVEKIKSYIKAGDIF LPATEHAYGTTPH-------------- DGDTLRVVADIPDAQFRTQINELKENIYNGDIY DIVSKLDRRFLANTIPEQPPIKPNQLLNRMWARKVTKITSPTLKKHIKKGDIIQGVPSQR

-a-

300 QVNLAQRFHA QVNLSQRFQA QVNLAQRFTA QVVPSRRFSL QVVPSRRFSL QCVLSQKFEV QVVPARTFTA VARPSRYILS

b- 360

-L 1

420 PEL PEL

_ 320 340 TYSGDEWQAFLQLNQAN-RAPFSAFLR-LEQG---AILSLSPERFILCDNS SYEGDEWQAFBRLNRANRAPFSAFLR-LHDG---AILSLSPERFIQLENG

II II II II TYRGDEWQAFRQLNRANRAPFSAFIR-LDEG---AVLSLSPERFIQLRQG

CT CT CT CT CT CT CT OT

--

PELFVVEPFPA ADLTKVDRYSY ADLTKVDRYSC PEFTKIVSFSH ADLLQVDRYSR DKLLTIQKFSH

d----I 4

DPQE DPQA DPEQ DRDL DRDL DKAB NDBL ATEE

YMFFMQ-DNDF- --TLFGASPESSLKY IYMFFMQ-DNDF- --TLFGASPESSLKY YMYYMK-LLDR- --EIVGSSPERLIHV

Ll YMFYIRGLNEGRSYELFGASPESNLKF YLFYIDCLDF----QIIGASPE-LLCK

-a--I l- ,J

-400

a

AWCCSIGYLSF AWCGSIGYLSF AWCGSIGYLSF SYGGAVGYFTA SYGGAVGYFTA

I ,

TYGGCIAYXGF SYGGAVGYLRG VYAGAVGHWSY

DAT DAA QDG TAA SDS

L-b

1a------1 l-*-L I

500

a- -b- 1

540 580 c-

-a

520 QIFCSAGGGIVADSQEEAEY QLYCSAGGGIVADSNEEAEY HLYCSAGGGIVADSEEAAEY IATVQAGAGVVLDSVPQSEA IATVRAOAGIVLDSVPQSEA VASIQAGAGIVADSVPEAEY VAAVQAGAGVVRDSNPQSEA ILTCKLAVVLFTIQLSTMNM

lb -

IFIVFE

Enteric pabB Sequences 543

1984). Most (43 of 46) of the amino acid identities occur in the C-terminal halves of PS and AS CoI. The absolute conservation of these residues within longer regions of less stringent conservation suggests that they are critical for enzyme structure and/or function.

Secondary-structure predictions (Garnier et al. 1978) were made for each of the eight amino acid sequences illustrated in figure 5. When the secondary-structure pre- dictions were aligned as in figure 5, several regions of consistent structure prediction were evident in many of the sequences. The consensus predictions for several regions are illustrated below the sequence alignments in figure 5. Structures are indicated only if six or more of the sequences predicted the same structure over four or more con- secutive residues. (This criterion also assures that at least one sequence of each group is represented in the consensus.) The exceptions to this guideline are the two p-turn segments at residues 338-340 and 459-46 1, where turns are predicted for three con- secutive residues. As with the sequence identities, most of the consistency in secondary- structure prediction lies in the C-terminal two-thirds of the sequence alignment. Ten of the identical residues lie within predicted u-helical structures, six are in regions predicted to be P-sheet structures, and two are in segments predicted to form P-turns. Ten additional conserved residues lie immediately adjacent to or one residue removed from segments predicted to form regular structure. Some of these are likely to be active-site residues that lie between regions of helices or sheets.

Discussion

The PS Co1 nucleotide and amino acid sequences reported here, along with the nucleotide and amino acid sequences of Escherichia coli, Salmonella typhimurium, and Klebsiella aerogenes PS Co11 reported elsewhere (Kaplan and Nichols 1983; Kaplan et al. 1985 ) , both establish the complete primary structure of PS from these organisms and elucidate the chromosomal organization of the two genes that encode the non- identical subunits of the PS enzyme complex. Comparison of PS and AS Co1 sequences illustrates the homologous and paralogous relationships among this family of ami- nobenzoate synthases.

If just the PS Co1 sequences are compared, it can be seen (table 1) that the amount of amino acid divergence in pairwise comparisons (23%-26%) is slightly higher than divergence of PS Co11 ( 15%-20% ) . Most amino acid replacements among the PS Co1 sequences occur within the N-terminal segment of the polypeptide, suggesting that, if the N-terminal portion of PS Co1 is essential to structure or function, then that function is not largely dependent on specific amino acid sequences. In contrast, amino acid replacements among the PS Co11 sequences occur in six regions that are essentially evenly distributed throughout the sequence alignment.

FIG. 5.-Alignment of PS Co1 (top three) and AS Co1 (bottom five) amino acid sequences. Hyphens indicate gaps introduced to increase the amount of identity between sequences. Numbering is by position in the alignment rather than in reference to any particular sequence. Boxes are drawn around residues identical in all eight sequences. Consensus secondary structures (predicted for six or more of the eight sequences) are shown below the sequence alignment where a indicates alpha helix, B indicates sheet, and T indicates beta turn. Decision constants of the Gamier prediction were as follows: u-helical, DC = -75 cnat; p-conformation, DC = -88 cnat; turn, DC = 0 cnat. Published sequences are from the following sources: E. coli PS CoI, Kaplan and Nichols (1983); E. coli AS CoI, Nichols et al. (198 1); S. typhimurium AS CoI, Yanofsky and vancleemput (1982); Bacillus subtilis AS CoI, Henner et al. (1984); Brevibacterium lacto- fermenturn AS CoI, Matsui et al. (1986); Saccharomyces cerevisiae AS CoI, Zalkin et al. (1984).

--- __ __ _ -- - --__

544 Goncharoff and Nichols

The genes encoding PS Co1 and Co11 (pabB and pabA, respectively), are unlinked in E. coli. pabB, at 40 min on the chromosome, appears to be the first gene encoded on a polycistronic transcript that contains at least two genes. pabA, located at 74 min on the chromosome, appears to be encoded as the last gene of a transcript containing other genes of unknown function (Kaplan et al. 1985). At present it is not known whether the expression of pabB and pabA is coordinated to produce similar levels of polypeptides for the PS complex.

In cross-species comparisons, it is evident that Co1 and Co11 arose from separate ancestral sequences. The amount of similarity between E. coli PS Co1 and AS Co1 (26% at the amino acid level) is less than the amino acid similarity present between PS Co11 and AS Co11 (44%). This is likely due to the fact that PS and AS Co11 are small proteins that have identical roles of transferring the amido group of glutamine to the Co1 subunit of each respective enzyme complex. The lower amount of similarity present between PS Co1 and AS Co1 may be due to differences in catalytic and regulatory functions.

We have elsewhere proposed that the C-terminal conservation of AS Co1 and PS Co1 sequences reflect common elements in the conversion of chorismate to aminoben- zoates and that the N-terminal divergence reflects differences in feedback inhibition or subunit interactions. Results from chemical modification experiments of Serratia marcescens AS Co1 have implicated a single histidine, one arginine, and one cysteine residue as essential for enzymatic activity ( Tso and Zalkin 198 1). If AS Co1 sequences alone are considered, no conserved histidine, arginine, or cysteine residues are found prior to position 365. A good candidate for an essential histidine residue is His-43 1, since it is the only histidine residue conserved among all the AS Co1 sequences and is found also in the PS Co1 sequences. Similarly, the presence of six absolutely conserved arginine residues at positions 365, 403, 408, 466, 479, and 503 suggests that one of these residues may correspond to the essential arginine identified by chemical modi- fication. The location of the proposed essential cysteine residue of S. marcescens AS Co1 had been determined elsewhere by sequencing a tryptic peptide containing this residue (Tso and Zalkin 198 1). Comparison of the S. marcescens peptide sequence to E. coli AS Co1 identifies the “essential” cysteine residue as that at position 4 10. However, the cysteine residue at 4 10 is not conserved among all the AS Co1 sequences; cysteine is replaced with alanine and serine in the Bacillus subtilis and Brevibacterium Zactofermentum sequences, respectively. The data suggest that a cysteine residue at this position is not an obligate requirement for the function of AS CoI. It is interesting that this cysteine residue in AS Co1 occurs near the highly conserved sequence- VDLARND-at positions 399-405. It is possible that the reactivity of the S. marcescens AS Co1 cysteine residue is a result of its proximity to an active site. Chemical modi- fication at this site might result in interference with the active site and concomitant loss of catalytic activity.

The N-terminus of B. Zactofermentum AS Co1 has been shown to contain se- quences essential for feedback inhibition by tryptophan (Matsui, et al. 1987). Matsui et al. (1987) have suggested that the sequence -LLES- (positions 38-41) is essential to the inhibition mechanism. Experiments indicating tryptophan feedback inhibition of all of the AS Co1 proteins (E. coli [ Ito et al. 19681, S. typhimurium [ Zalkin and Ring 19681, B. subtilis [Nester and Jensen 19661, B. Zactofermentum [ Matsui et al. 19871, and S. cerevisiae [ Miozzari et al. 1978 ] ) presented in figure 5 have been reported, and the -LLES- sequence is found in each (in the E. coli and S. typhimurium sequences at positions 38-41; in the B. subtilis and S. cervisiae sequence at positions 84-87).

Enteric pabB Sequences 545

While the identification of this short sequence suggests a role for N-terminal sequences in feedback inhibition, it does not explain the general lack of similarity among the N- terminal of AS Co1 from a variety of organisms. Thus, the low amount of similarity among the N-termini of AS Co1 suggests that, even though feedback inhibition by tryptophan has been conserved among these proteins, an extensive sequence require- ment for this function has not been retained.

It has been proposed that the development of new enzyme functions is most easily achieved by recruiting proteins that already exist and catalyze similar reactions (Jensen 1976). Indeed, laboratory experiments have been reported in which such recruitment had occurred under artificial selection (Clarke 1974; Lin et al. 1976). In this regard, it has been found that glutamine amidotransferases with different metabolic roles share similarities with PS Co11 and AS Co11 (Piette et al. 1984; Kaplan et al. 1985; Zalkin et al. 1985). These proteins constitute a family of enzymes that utilize the amide group of glutamine in the biosynthesis of various compounds. From the available data, it is not apparent that PS Co1 and AS Co1 belong to a family of related proteins. If such a family exists, likely candidates might include other chorismate- utilizing enzymes that may share catalytic properties with AS and PS. Candidates include enzymes such as the chorismate isomerase activities encoded by pheA and tyrA, isochorismate synthase encoded by entC, or chorismate lyase encoded by ubiC. The deduced amino acid sequences encoded by pheA and tyrA have been reported elsewhere (Hudson and Davidson 1984)) but no similarities have been observed in sequence alignments of these two proteins with PS Co1 and AS Co1 (data not shown). Nucleotide and amino acid sequence data are not yet available for entC or ubiC.

The similarity of reaction mechanism, nucleotide and amino acid sequences, and predicted polypeptide structures strongly suggests that PS and AS reflect a natural system of acquisitive evolution that has occurred following the duplication of genes encoding an ancestral aminobenzoate synthase complex. Following an initial gene duplication event, divergence of the Co1 and Co11 genes could have proceeded by modifications resulting in alteration of the reaction chemistry and subunit contact sites. Since PS and AS Co11 are members of a larger paralogous family, the duplication scheme is likely to have occurred more than once in the evolution of E. coli. It will be interesting to see whether the same is true for the gene encoding CoI.

Acknowledgments

We would like to thank Robert Lawther and Ronald Law for gifts of Salmonella typhimurium and KZebsieZZa aerogenes libraries, respectively. We thank Malcolm Winkler for assistance with the Intelligenetics sequence analysis software. We also thank Fernando Esclopis and Michael Skalon for technical assistance. This work was supported by Public Health Service grant AI1 8639 from the National Institutes of Health.

LITERATURE CITED

BIRNBOIM, H. C., and J. DOLY. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7: 15 13- 1523.

BUVINGER, W. E., L. C. STONE, and H. E. HEATH. 198 1. Biochemical genetics of tryptophan synthesis in Pseudomonas acidovorans. J. Bacterial. 147:62-68.

CLARKE, P. H. 1974. Evolution of enzymes for the utilization of novel substrates. Pp. 183-2 17 in M. J. CARLIDE and J. J. SKEHEL, eds. Evolution in the microbial world. Symposia of the Society for General Microbiology. Vol. 24. Cambridge University Press, London.

546 Goncharoff and Nichols

COSSART, P., E. A. GROISMAN, M. SERRE, M. J. CASADABAN, and B. GICQUEL-SANZEY. 1986. crp Genes of Shigella jlexneri, Salmonella typhimurium, and Escherichia coli. J. Bacterial. 167:639-646.

DALE, R. M. K., B. A. MCCLURE, and J. P. HOUCHINS. 1985. A rapid single stranded cloning strategy for producing a sequential series of overlapping clones for use in DNA sequencing: application to sequencing the corn mitochondrial 18s rDNA. Plasmid 13:3 l-40.

GARNIER, J., D. J. OSGUTHORPE, and B. ROBSON. 1978. Analysis of the accuracy and implications of predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.

GONCHAROFF, P., and B. P. NICHOLS. 1984. Nucleotide sequence of Escherichia coli pabB indicates a common evolutionary origin of p-aminobenzoate synthetase and anthranilate synthetase. J. Bacterial. 159:57-62.

GRANTHAM, R., C. GAUTIER, M. Gow, M. JACOBZONE, and R. MERCIER. 198 1. Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:r43-r74.

GUERRY, P., D. J. LEBLANC, and S. FALKOW. 1973. General method for the isolation of plasmid deoxyribonucleic acid. J. Bacterial. 116: 1064- 1066.

HAWLEY, D. K., and W. R. MCCLURE. 1983. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 11:2237-2255.

HENNER, D. J., L. BAND, and H. SHIMOTSU. 1984. Nucleotide sequence of the Bacillus subtilis tryptophan operon. Gene 34: 169- 177.

HORI, H., and S. OSAWA. 1979. Evolutionary change in 5s RNA secondary structure and a phylogenetic tree of 54 5s RNA species. Proc. Natl. Acad. Sci. USA 76:381-385.

HUANG, M., and J. PITTARD. 1967. Genetic analysis of mutant strains of Escherichia coli requiring p-aminobenzoic acid for growth. J. Bacterial. 93: 1938- 1942.

HUDSON, G. S., and B. E. DAVIDSON. 1984. Nucleotide sequence and transcription of the phenylalanine and tyrosine operons of Escherichia coli K12. J. Mol. Biol. 180: 1023-105 1.

IKEMURA, T. 198 1. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409.

ITO, J., E. C. Cox, and C. YANOFSKY. 1968. Anthranilate synthetase, an enzyme specified by the tryptophan operon of Escherichia coli: purification and characterization of component I. J. Bacterial. 97:725-733.

JENSEN, R. A. 1976. Enzyme recruitment in evolution of new function. Annu. Rev. Microbial. 30:409-425.

KANE, J. F. 1977. Regulation of a common amidotransferase subunit. J. Bacterial. 132:419- 425.

KAPLAN, J. B., P. GONCHAROFF, A. M. SEIBOLD, and B. P. NICHOLS. 1984. Nucleotide sequence of the Acinetobacter calcoaceticus trpGDC gene cluster. Mol. Biol. Evol. 1:456-472.

KAPLAN, J. B., W. K. MERKEL, and B. P. NICHOLS. 1985. Evolution of glutamine amidotrans- ferase genes: nucleotide sequences of the pabA genes from Salmonella typhimurium, Klebsiella aerogenes and Serratia marcescens. J. Mol. Biol. 183:327-340.

KAPLAN, J. B., and B. P. NICHOLS. 1983. Nucleotide sequence of Escherichia coli pabA and its evolutionary relationship to trp(G)D. J. Mol. Biol. 168:45 l-468.

LIN, E. C. C., A. J. HACKING, and J. AGUILAR. 1976. Experimental models of acquisitive evolution. Bioscience 26:548-555.

MANDEL, M., and A. HIGA. 1970. Calcium-dependent bacteriophage DNA infection. J. Mol. Biol. 53: 159- 162.

MANIATIS, T., E. F. FRITCH, and J. SAMBROOK. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

MATSUI, K., K. MIWA, and K. SANO. 1987. Two single-base-pair substitutions causing desen- sitization to tryptophan feedback inhibition of anthranilate synthase and enhanced expression of tryptophan genes of Brevibacterium lactofermentum. J. Bacterial. 169:5330-5332.

MATSUI, K., K. SANO, and E. OHTSUBO. 1986. Complete nucleotide and deduced amino acid

Enteric pabB Sequences 547

sequences of the Brevibacterium lactofermentum tryptophan operon. Nucleic Acids Res. 14: 10113-10114.

MAXAM, A. M., and W. GILBERT. 1980. Sequencing end-labeled DNA with base-specific chem- ical cleavages. Methods Enzymol. 65:499-560.

MESSING, J., R. CREA, and P. H. SEEBURG. 198 1. A system for shotgun DNA sequencing. Nucleic Acids Res. 9:309-32 1.

MESSING, J., and J. VIEIRA. 1982. A new pair of Ml3 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19:269-276.

MIOZZARI, G. F., P. NIEDERBERGER, and R. HUTTER. 1978. Tryptophan biosynthesis in Sac- charomyces cerevisiae: control of the flux through the pathway. J. Bacterial. 134:48-59.

NESTER, E. W., and R. A. JENSEN. 1966. Control of aromatic acid biosynthesis in Bacillus subtilis: sequential feedback inhibition. J. Bacterial. 91: 1594- 1598.

NICHOLS, B. P., M. BLUMENBERG, and C. YANOFSKY. 198 1. Comparison of the nucleotide sequence of trpA sequences immediately beyond the trp operon of KZebsieZla aerogenes, Salmonella typhimurium and Escherichia coli. Nucleic Acids Res. 9: 1743- 1755.

NICHOLS, B. P., M. VANCLEEMPUT, and C. YANOFSKY. 198 1. Nucleotide sequence of Escherichiu coli trpE: anthranilate synthetase Component I contains no tryptophan residues. J. Mol. Biol. 146:45-54.

PIETTE, J., H. NYUNOYA, C. J. LUSTY, R. CUNIN, G. WEYENS, M. CRABEEL, D. CHARLIER, N. GLANSDORFF, and A. PIERARD. 1984. DNA sequence of the curA gene and the control region of carAB: tandem promoters, respectively controlled by arginine and the pyrimidines, regulate the synthesis of carbamoyl-phosphate synthetase in Escherichiu coZi K-12. Proc. Natl. Acad. Sci. USA 81:4 134-4 138.

REINERS, J. J., L. J. MESSENGER, and H. ZALKIN. 1978. Immunological cross-reactivity of Escherichiu coli anthranilate synthetase, glutamate synthase and other proteins. J. Biol. Chem. 253:1226-1233.

SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

SAWULA, R. V., and I. P. CRAWFORD. 1973. Anthranilate synthetase of Acinetobacter calcouc- eticus. J. Biol. Chem. 248:3573-3581.

STANLEY, J. 1983. Infectivity of the cloned geminivirus genome requires sequences from both DNAs. Nature 305:643-645.

TENG, C. P., B. GANEM, S. Z. DOKTOR, B. P. NICHOLS, R. K. BHATNAGAR, and L. C. VINING. 1985. Total synthesis of +/- 4-amino-4-deoxychorismic acid: a key intermediate in the biosynthesis of para-aminobenzoic acid and L-paru-aminophenylalanine. J. Am. Chem. Sot. 107:5008-5009.

Tso, J. Y., and H. ZALKIN. 198 1. Chemical modifications of Serrutia marcescens anthranilate synthase component I. J. Biol. Chem. 256:9901-9908.

VOGEL, H. J., and D. M. BONNER. 1956. Acetylornithinase of Escherichia coli: partial purification and some properties. J. Biol. Chem. 218:97-106.

YANISCH-PERRON, C., J. VIEIRA, and J. MESSING. 1985. Improved Ml 3 phage cloning vectors and host strains: nucleotide sequences of the M 13mp 18 and pUC 19 vectors. Gene 33: 103- 119.

YANOF~KY, C., and M. VANCLEEMPUT. 1982. Nucleotide sequence of trpE of S’almonelZu ty- phimurium and its homology with the corresponding sequence of Escherichia coli. J. Mol. Biol. 155:235-246.

ZALKIN, H., P. ARGOS, S. NARAYANA, A. A. TIEDEMAN, and J. N. SMITH. 1985. Identification of a trpG-related glutamine amide transfer domain in Escherichia coli GMP synthetase. J. Biol. Chem. 260:3350-3354.

ZALKIN, H., and D. KING. 1968. Anthranilate synthetase: purification and properties of com- ponent I from Salmonella typhimurium. Biochemistry 7:3566-3573.

Goncharoff and Nichols

ZALKIN, H., J. L. PALUH, M. VANCLEEMPUT, W. S. MOYE, and C. YANOFSKY. 1984. Nucleotide sequence of Saccharomyces cerevisiae genes TRP2 and T’P3 encoding bifimctional an- thranilate synthase:indole-3-glycerol phosphate synthase. J. Biol. Chem. 259:3985-3992.

WALTER M. FITCH, reviewing editor

Received February 26, 1988; revision received April 22, 1988