
Biochemical and Biophysical Research Communications 376 (2008) 647–652


Rare codon priority and its position specificity at the 5′ of the gene modulates heterologous protein expression in Escherichia coli

Chinnathambi Thangadurai, Pichaimuthu Suthakaran, Pankaj Barfal, Balaiah Anandaraj, Satya Narayan Pradhan, Harith Kamil Boneya, Subramanian Ramalingam, Vadivel Murugan *

Genetic Engineering Unit, Centre for Biotechnology, Sadar Patel Road, Anna University, Guindy, Chennai-600 025, India

Article info

Article history:
Received 24 August 2008
Available online 16 September 2008

Keywords:
Rare codon
Streptokinase
AGG codon
E. coli
Over-expression

0006-291X/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.bbrc.2008.09.024

* Corresponding author. Fax: +91 44 22350229. E-mail address: [email protected] (V. Murugan).

Abstract

Rare codons and their effects on heterologous protein expression in Escherichia coli have been addressed by many investigators. Here, we propose that not all rare codons of a foreign gene have a negative effect; rather, a selective codon among them, and its specific position downstream of the start codon, modulates expression. In our study, streptokinase (47 kDa), encoded by the skc gene of Streptococcus equisimilis, was expressed in E. coli. Analysis of the relative codon frequency of the skc gene in E. coli reveals that 30% of its codons are rare. Nevertheless, E. coli over-expressed this target protein. To explore codon bias in expression, we introduced the selective AGG codon at different positions of the skc gene: +2, +3, +5, +8, +9 and +11. At the +2 position, ‘AGG’ aided over-expression, whereas at the +3 and +5 positions it gave nil expression. In contrast, when the AGG codon was shifted to later positions such as +9 and +11, the inhibitory effect was reversed and over-expression resulted. The effect of the rare ‘AGG’ codon was further studied in GFP expression. In conclusion, besides the choice of rare codons, their precise positions in the foreign gene dictate the level of protein expression.

© 2008 Elsevier Inc. All rights reserved.

Heterologous protein expression has brought radical changes to recombinant protein production. The large-scale production of therapeutic, diagnostic and industrially important proteins or enzymes depends greatly on protein expression strategies. The prime objective of heterologous protein expression is to obtain a higher yield of target protein [1]. But not all proteins are successfully expressed in Escherichia coli. Many factors, such as vector and host, promoter strength, inducer concentration and media composition, are known to influence protein expression [2]. Despite this, it is difficult to draw a general conclusion that applies universally to the expression of any protein. The expression ability of each gene in a heterologous host is unique. Even when all the above variables are held constant, in some cases the target protein is not expressed. In those cases, the nucleotide sequence or codon content of the target gene was identified as playing a role in deciding the expression status [3–5]. The codons that lie immediately downstream of the initiation codon play a decisive role in target protein expression [6–11]. The presence of an ‘AAA’ codon at the +2 position of the gene was reported to maximize expression [12], whereas an ‘NGG’ codon at the +2 position was found to have a negative effect [13–15].

In addition to the preference for certain codons at early positions of the gene, the codon usage of the gene relative to that of the heterologous host was found to influence the expression level predominantly [16]. In this context, the codon composition of the heterologous gene may be optimized for the expression host. For instance, codon optimization of an archaeal gene was found to increase the protein yield in E. coli [6]. It was also noticed that repeated CAT codons downstream of the NGG codon synergistically enhanced protein expression in E. coli [17].

In our study, we over-expressed streptokinase (SK) in E. coli. It is a 47 kDa protein encoded by the skc gene of Streptococcus equisimilis, a Gram-positive bacterium. Though the skc gene is known to contain many rare codons, over-expression was achieved rather than a negative effect. This suggests that although some (rare) codons are used less frequently in E. coli, not all of them are detrimental to target protein expression. To understand this, we introduced the known inhibitory ‘AGG’ codon into early positions of the skc gene, where it does not otherwise occur. The effect of substituting the ‘AGG’ codon at early positions of streptokinase (SK) and Green Fluorescent Protein (GFP) was analyzed experimentally. Our study shows that the priority of the AGG codon among the rare codons, as well as its position close to the initiation codon, plays a significant role in target protein expression.


Materials and methods

Plasmid, genes and reagents. The expression vector pRSETB and BL21 (DE3) were acquired from Invitrogen technologies (Cat. No. V351-20). The restriction enzymes NdeI (Cat. No. #R0111L) and EcoRI (Cat. No. #R0101L) were acquired from New England Biolabs. PCR was performed using a proofreading polymerase (Vent DNA Polymerase, Cat. No. #M0254S) obtained from New England Biolabs. The streptokinase gene (skc) [GenBank Accession No. K02986] was obtained as a gift from Dr. K.J. Mukherjee, Centre for Biotechnology, Jawaharlal Nehru University. The GFP gene (Cat. No. 107387) was procured from Bangalore Genie, Bangalore, India. The GenBank accession number of GFP is EU048697. LB medium used for culturing E. coli was obtained from Hi Media Laboratories, Mumbai, India. Primer synthesis and DNA sequencing were done at Microsynth Corporation, Switzerland.

Cloning of SK and GFP genes and site-directed mutagenesis. The vector used for this study is pRSETB, a T7-based expression vector. The NdeI and EcoRI restriction sites of this vector were exploited throughout the cloning work of this study. All the constructs were made between these two sites.

PCR-based site-directed mutagenesis was carried out to make the codon substitutions. For this, primers were designed and synthesized with mismatched nucleotides at the necessary positions of the genes (Tables 1 and 2).
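To make the idea of a mismatch primer concrete, the following minimal sketch compares the wild-type forward primer SK F1 with the mutagenic primer SKM F1 (both sequences taken from Table 1) and reports where they differ. The function name and the way the start codon is located (the ATG inside the NdeI site CATATG) are illustrative assumptions, not part of the authors' protocol.

```python
def mismatch_positions(template: str, primer: str) -> list[int]:
    """Return the 0-based indices at which two equal-length sequences differ."""
    if len(template) != len(primer):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(template, primer)) if a != b]


# Forward primers from Table 1: wild-type (SK F1) and AGG at +2 (SKM F1).
sk_f1 = "GGGATTCCATATGATTGCTGGACCTGAG"
skm_f1 = "GGGATTCCATATGAGGGCTGGACCTGAG"

positions = mismatch_positions(sk_f1, skm_f1)

# Assumption: the start codon is the ATG embedded in the NdeI site (CATATG).
start = sk_f1.index("ATG", sk_f1.index("CATATG"))

# Convert nucleotide indices to 1-based codon positions (ATG is +1).
codons = sorted({(p - start) // 3 + 1 for p in positions})

print(positions)  # nucleotide indices of the mismatches
print(codons)     # codon positions affected by the mismatches
```

For this primer pair the two mismatched nucleotides both fall in the +2 codon (ATT replaced by AGG), which is the substitution that defines SKM1.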

The skc gene encoding streptokinase was cloned in the pRSETB vector between the NdeI and EcoRI restriction sites. The recombinant construct is named pRSET-SK. The SKM1, SKM2 and SKM3 constructs were generated by substituting the AGG codon at the +2, +3 and +5 positions of the SK gene, respectively. The AGG codon was then shifted from these positions to +8, +9 and +11 by inserting a common sequence, (CAT)6, immediately after the initiation codon. The resulting constructs were named SKM1-His, SKM2-His and SKM3-His, respectively. Similarly, SKM1-Tyr and SKM3-Tyr were constructed by inserting a (TAC)6 sequence immediately after the initiation codon of the SKM1 and SKM3 mutants (Fig. 1). The GFP M2, M3 and M4 constructs were generated by substituting the AGG codon at the +2, +3 and +5 positions of the GFP gene sequence (Fig. 1). The mutant constructs were confirmed by DNA sequencing at Microsynth Corporation, Switzerland.
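The position arithmetic behind these constructs can be sketched in a few lines. The example below starts from the first six codons of skc as they appear in the SK F1 primer (Table 1), substitutes AGG at a chosen position, inserts six repeats after the start codon, and reports where AGG ends up. All function names are hypothetical helpers introduced for illustration only.

```python
def codons(seq: str) -> list[str]:
    """Split an in-frame coding sequence into codons (trailing bases dropped)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def substitute(cds: str, position: int, codon: str = "AGG") -> str:
    """Replace the codon at 1-based `position` (ATG is +1) with `codon`."""
    parts = codons(cds)
    parts[position - 1] = codon
    return "".join(parts)


def insert_after_start(cds: str, repeat: str = "CAT", n: int = 6) -> str:
    """Insert `n` copies of `repeat` immediately after the initiation codon."""
    return cds[:3] + repeat * n + cds[3:]


def codon_positions(cds: str, codon: str = "AGG") -> list[int]:
    """Return the 1-based in-frame positions at which `codon` occurs."""
    return [i + 1 for i, c in enumerate(codons(cds)) if c == codon]


# First six codons of skc, read from the SK F1 primer: ATG ATT GCT GGA CCT GAG
sk_start = "ATGATTGCTGGACCTGAG"

for name, pos in [("SKM1", 2), ("SKM2", 3), ("SKM3", 5)]:
    mutant = substitute(sk_start, pos)
    his = insert_after_start(mutant, "CAT", 6)
    print(name, codon_positions(mutant), codon_positions(his))
```

Because six codons are inserted between +1 and +2, every downstream codon moves by six positions, so +2, +3 and +5 become +8, +9 and +11. Passing "TAC" instead of "CAT" gives the same shift for the Tyr constructs.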

Relative codon frequency analysis. The relative codon frequency analysis for the SK gene was done using the graphical codon usage analyzer [18] with the program's option ‘each triplet position vs. usage table’. This tool is a component of the ExPASy analysis tools and is accessible at http://www.gcua.schoedl.de/.
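The text does not spell out how the tool scores each codon, so the sketch below rests on a stated assumption: a codon's relative frequency is its usage divided by the usage of the most frequent synonymous codon for the same amino acid, expressed as a percentage. The usage numbers are hypothetical placeholders chosen for easy arithmetic, not real E. coli values, and the function names are illustrative.

```python
def relative_adaptiveness(usage: dict[str, dict[str, float]]) -> dict[str, float]:
    """Score each codon as 100 * usage / highest usage among its synonyms."""
    scores = {}
    for codon_freqs in usage.values():
        best = max(codon_freqs.values())
        for codon, freq in codon_freqs.items():
            scores[codon] = 100.0 * freq / best
    return scores


def rare_codon_report(cds: str, usage: dict[str, dict[str, float]],
                      threshold: float = 20.0):
    """List (position, codon, score) for codons scoring below `threshold`.

    Codons missing from the usage table are skipped rather than scored.
    Also returns the percentage of codons in `cds` flagged as rare.
    """
    scores = relative_adaptiveness(usage)
    triplets = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    rare = [(i + 1, c, scores[c]) for i, c in enumerate(triplets)
            if c in scores and scores[c] < threshold]
    return rare, 100.0 * len(rare) / len(triplets)


# HYPOTHETICAL usage values (per thousand), for illustration only.
toy_usage = {
    "Arg": {"CGT": 40.0, "CGC": 36.0, "AGA": 6.0, "AGG": 4.0},
    "Gly": {"GGT": 30.0, "GGC": 30.0, "GGA": 12.0},
}
toy_cds = "CGTAGGGGTGGA"  # CGT AGG GGT GGA

rare, percent = rare_codon_report(toy_cds, toy_usage)
print(rare, percent)
```

With a real host codon usage table in place of `toy_usage`, the same report could be run with thresholds of 20 and 10 to mirror the two cut-offs used in Fig. 2.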

Table 1
List of primers

| Sl. No. | Name of primer | Primer sequence |
|---|---|---|
| 1 | SK F1 | 5′ GGGATTCCATATGATTGCTGGACCTGAG 3′ |
| 2 | SKM F1 | 5′ GGGATTCCATATGAGGGCTGGACCTGAG 3′ |
| 3 | SKM F2 | 5′ GGGATTCCATATGATTAGGGGACCTGAG 3′ |
| 4 | SKM F3 | 5′ GGGATTCCATATGATTGCTGGAAGGGAGTGGCTGCTA 3′ |
| 5 | SKM1 His-F | 5′ GGGATTCCATATGCATCATCATCATCATCATAGGGCTGGACCTGAG 3′ |
| 6 | SKM1 Tyr-F | 5′ GGGATTCCATATGTACTACTACTACTACTACAGGGCTGGACCTGAG 3′ |
| 7 | SKM2 His-F | 5′ GGGATTCCATATGCATCATCATCATCATCATATTAGGGGACCTGAG 3′ |
| 8 | SKM3 His-F | 5′ GGGATTCCATATGCATCATCATCATCATCATATTGCTGGAAGGGAGTGGCTGCTA 3′ |
| 9 | SKM3 Tyr-F | 5′ GGGATTCCATATGTACTACTACTACTACTACATTGCTGGAAGGGAGTGGCTGCTA 3′ |
| 10 | GFP F1 | 5′ GGGATTCCATATGGTGAGCAAGGGCGAG 3′ |
| 11 | GFPM F2 | 5′ GGGATTCCATATGAGGAGCAAGGGCGAG 3′ |
| 12 | GFPM F3 | 5′ GGGATTCCATATGGTGAGGAAGGGCGAGGAG 3′ |
| 13 | GFPM F4 | 5′ GGGATTCCATATGGTGAGCAAGAGGGAGGAGCTGTTC 3′ |
| 14 | SK R1* | 5′ CCGGAATTCTTATTTGTCTTTAGG 3′ |
| 15 | GFP R1** | 5′ CCGGAATTCTCACTCGTCCATGCCGAG 3′ |

The table lists the primers that were used for the sub-cloning and site-directed mutagenesis of SK and GFP in the pRSETB vector. The restriction sites (NdeI, CATATG, in the forward primers; EcoRI, GAATTC, in the reverse primers) are italicized in the original table. Only one reverse primer, indicated by *, was used for SK, in combination with all the forward primers. Similarly, only one reverse primer, indicated by **, was used for GFP in this study. F represents forward primer and R represents reverse primer.

Expression analysis of the recombinant constructs. The constructs were transformed into BL21 (DE3) as per the Mandel and Higa protocol [19], and the transformants were grown in LB medium to an OD of 0.6 at A600. The cultures were induced with 1 mM IPTG at 37 °C for 3 h. The expression profiles of the SK and GFP samples were analyzed by resolving them on 10% SDS–PAGE as per the Laemmli protocol [20]. The gels were then stained with Coomassie Brilliant Blue G-250.

Results

Analysis of relative codon frequency and SK expression

The relative codon frequency analysis revealed that 30% of the codons in the streptokinase gene (skc) of S. equisimilis are rare for the E. coli host (Fig. 2). Among them, some codons are used at less than 20% and less than 10% by the host translational machinery; these are shown as blue and red bars, respectively, in Fig. 2. Surprisingly, the gene has consecutive rare codons even at early positions such as +3, +4 and +5 (GCT, GGA, CCT) and at several positions in the later part of the gene. Despite the presence of 30% rare codons, SK was over-expressed in E. coli BL21 (DE3), as shown by SDS–PAGE (Fig. 3A) and further confirmed by Western blotting (Fig. 3B).

Mutation of skc gene by ‘AGG’ codon substitution

The original codons of the skc gene at the +2, +3 and +5 positions were replaced with the ‘AGG’ codon by site-directed mutagenesis. These constructs were designated SKM1, SKM2 and SKM3, respectively. The effect of the ‘AGG’ codon on target protein expression is shown in Fig. 4A (lanes 3, 6 and 8). The expression analysis revealed that the ‘AGG’ codon at the +2 position aided over-expression, whereas at the +3 and +5 positions it had a negative effect on expression (nil expression).

Cloning and expression of SKM1-His, SKM2-His, SKM3-His

To understand the impact of the early occurrence of the ‘AGG’ codon on target protein expression, we shifted the ‘AGG’ codon to later positions in the gene. To achieve this, we inserted (CAT)6 between the initiation codon and the +2 codon, which increased the physical distance of the ‘AGG’ codon from the initiation site. The insertion of (CAT)6 in all three constructs, SKM1, SKM2 and SKM3, shifted the ‘AGG’ codon to the +8, +9 and +11 positions (SKM1-His, SKM2-His and SKM3-His), respectively. The expression analysis of these constructs in BL21 (DE3) is shown in Fig. 4A (lanes 4, 7 and 9). It revealed that moving the ‘AGG’ codon to +9 and +11 (SKM2-His and SKM3-His) restored over-expression. In SKM1-His, however, the insertion did not alter the result obtained earlier with SKM1.

Table 2
Nucleotide sequence of the constructs

The nucleotide sequences from the +1 to +11 codons of all the SK and GFP constructs. The boxes colored yellow denote the rare codons originally present in the skc gene. The boxes colored blue denote the substitution of the ‘AGG’ codon by site-directed mutagenesis.

[Fig. 1 flowchart. Streptokinase: pRSETB SK; ‘AGG’ substitution at early positions by site-directed mutagenesis gives SKM1 (AGG at +2), SKM2 (AGG at +3) and SKM3 (AGG at +5); shifting of the ‘AGG’ codon to later positions by insertion of (CAT)6 and (TAC)6 gives SKM1-His and SKM1-Tyr (AGG at +8), SKM2-His (AGG at +9), and SKM3-His and SKM3-Tyr (AGG at +11). GFP: pRSETB GFP; GFP M2 (AGG at +2), GFP M3 (AGG at +3) and GFP M4 (AGG at +5).]

Fig. 1. Cloning strategy and mutation studies to demonstrate the influence of rare codons on target protein expression. The flowchart shows the progress of this study. Initially, the skc gene was cloned in the pRSETB vector and the construct was designated pRSETB-SK. To understand the effect of the rare codon AGG, it was introduced at the +2, +3 and +5 positions of the over-expressing construct pRSETB-SK, and the clones were designated SKM1, SKM2 and SKM3, respectively. SKM1 exhibited over-expression, whereas there was no expression in SKM2 and SKM3. To increase the physical distance of the AGG codon from the start codon, (CAT)6 and (TAC)6 sequences were inserted in SKM1, SKM2 and SKM3, and the recombinant constructs were designated SKM1-His, SKM1-Tyr, SKM2-His, SKM3-His and SKM3-Tyr. Shifting the ‘AGG’ codon to later positions (+9 and +11) by insertion of the (CAT)6 and (TAC)6 sequences restored the expression of SK. Similar experiments were conducted with GFP (GFP M2, GFP M3 and GFP M4) to demonstrate the inhibitory effect of the ‘AGG’ codon.

Fig. 2. Graph showing the relative codon frequency (RCF) of streptokinase for E. coli (y-axis: frequency, %). The RCF index of streptokinase was analyzed by comparing its nucleotide sequence with the codon table of E. coli. The graph shows the percentage usage by E. coli of every codon of SK. The blue and red bars indicate codons that are used at less than 20% and less than 10%, respectively. The curly bracket marks runs of more than two consecutive rare codons, and the star marks more than one rare codon occurring consecutively.

Fig. 3. Over-expression of streptokinase in BL21 (DE3). (A) The cell lysate samples were analyzed on 10% SDS–PAGE and stained with Coomassie Brilliant Blue G-250. Lane 1, protein marker; lane 2, vector control; lane 3, uninduced pRSETB-SK; lane 4, induced pRSETB-SK. The expression of the 47 kDa protein (SK) is indicated by an arrow. (B) Western blot analysis of SK expression from the pRSETB-SK construct. Lane 1, protein marker; lane 2, vector control; lane 3, induced pRSETB-SK. The arrow indicates the SK (47 kDa) expression that was confirmed with anti-SK antibody.

Cloning and expression of SKM1-Tyr and SKM3-Tyr

The (TAC)6 sequence was inserted between the initiation codon and the +2 codon of the SK mutant constructs SKM1 and SKM3, which shifted the ‘AGG’ codon to the +8 and +11 positions (SKM1-Tyr and SKM3-Tyr), respectively. The expression analysis of the constructs (Fig. 4A, lanes 5 and 10) reveals that moving the ‘AGG’ codon to +11 (SKM3-Tyr) restored expression. However, there was no significant change in the expression of SKM1-Tyr, which was similar to that of SKM1.

Generation of GFP mutants and their expression profile

The importance of ‘AGG’ at early positions of the target gene was further analyzed through mutation studies of Green Fluorescent Protein (GFP). The original codons at the +2, +3 and +5 positions of the gene were replaced with the ‘AGG’ codon, and the mutants were designated GFP M2, GFP M3 and GFP M4, respectively. GFP with the original codons served as the control. The mutant and original genes were cloned into the pRSETB vector and expressed in BL21 (DE3) (Fig. 4B). The expression analysis revealed a profile similar to that seen for SK expression. The mutation at the +2 position did not affect the original expression, whereas mutation at the subsequent positions (+3 and +5) minimized protein expression.

Fig. 4. Expression analysis of SK and GFP mutants. (A) The cell lysate samples were analyzed on 10% SDS–PAGE and stained with Coomassie Brilliant Blue G-250. Lane 1, protein marker; lane 2, vector control; lane 3, SKM1; lane 4, SKM1-His; lane 5, SKM1-Tyr; lane 6, SKM2; lane 7, SKM2-His; lane 8, SKM3; lane 9, SKM3-His; lane 10, SKM3-Tyr. The arrow indicates the protein band. (B) The cell lysate samples were analyzed on 10% SDS–PAGE and stained with Coomassie Brilliant Blue G-250. Lane 1, protein marker; lane 2, vector control; lane 3, uninduced GFP; lane 4, induced GFP; lane 5, uninduced GFP M2; lane 6, induced GFP M2; lane 7, uninduced GFP M3; lane 8, induced GFP M3; lane 9, uninduced GFP M4; lane 10, induced GFP M4. The expression of GFP (26 kDa) is indicated by an arrow.

Discussion

Expression of a gene in a heterologous host is often reported to be impaired by differences in codon usage between organisms [16]. Every organism differs in the frequency with which it uses codons. As noticed in the heterologous expression of parasitic proteins in E. coli [21], the presence of a rare codon makes the ribosome stall at that position, causing premature chain termination. The occurrence of several consecutive rare codons may affect the synthesis of full-length protein in E. coli [15]. The previous literature reports that in E. coli rare codons determine the rate of protein synthesis and potentially affect protein production [16].

In contrast, our results showed an interesting fact. In an attempt to express the streptokinase gene (skc) of S. equisimilis in a heterologous host, E. coli, over-expression was achieved (Fig. 3). Strikingly, the SK gene (1.3 kb) was found to harbor 123/414 codons that are supposed to be rarely used by E. coli. Nevertheless, over-expression of SK was obtained. Thus codon bias was not the bottleneck in over-expression of streptokinase; hence, at this juncture, it is difficult to conclude that codon bias is really a threat to heterologous protein expression.
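As a quick consistency check, the codon counts quoted in this paragraph can be converted to the percentage reported in the Results. The arithmetic below uses only the two numbers given in the text.

```latex
\[
\frac{123}{414} \approx 0.297 \approx 30\%
\]
```

This agrees with the 30% of rare codons reported from the relative codon frequency analysis.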

Though rare codons such as GCT, GGA and CCT are present at the early positions +3, +4 and +5 of the skc gene (Fig. 2), the expression level of SK was not affected (Fig. 3). This implies that not all types of rare codons are detrimental to target protein expression, even when they are present at early positions of the gene. But we could not completely rule out a negative effect of some rare codons, such as NGG. AGG codons were reported to be used very rarely for arginine in E. coli [14]. In our study, three constructs, SKM1, SKM2 and SKM3, were prepared in which the ‘AGG’ codon was substituted at the +2, +3 and +5 positions of the skc gene, respectively. The mutation studies gave an interesting result: the ‘AGG’ codon at +2 favors over-expression, while the same codon at the +3 and +5 positions resulted in nil expression. Rosenberg et al. [22] reported that insertion of five consecutive AGG codons at the 13th position of the lacZ gene from the initiation codon severely inhibited synthesis of the test protein. In another case [15], the insertion of five AGG codons in tandem after the 10th codon was also reported to affect protein synthesis. However, in their study the inhibitory effect of the ‘AGG’ codon was profound only upon the insertion of more than three consecutive AGG codons. Similarly, the insertion of a cluster of six AGG codons was reported to hinder protein expression, but the inhibitory effect was reduced when the codons were separated from each other by GCG codons [11]. In our study, by contrast, the insertion of even one ‘AGG’ codon at either the +3 or +5 position severely hampered SK expression. This suggests that the ‘AGG’ codon in close proximity to the initiation codon has a pronounced effect on expression.

To further understand the ‘proximity effect’ of the rare codon on protein synthesis, we increased the distance of the ‘AGG’ codon from the start codon. The insertion of six ‘CAT’ codons in constructs SKM2 and SKM3 moved the AGG codon from the +3 and +5 positions to the +9 and +11 positions in the SKM2-His and SKM3-His clones, respectively. Over-expression was restored in the SKM2-His and SKM3-His clones. This raises the argument that over-expression might be due to the effect of the CAT codons and not to the spatial separation of the AGG codon, since six histidine residues inserted at the 5′ end of the gene have been reported to favor over-expression of heterologous proteins in E. coli [23,24]. To rule this out, the reverse of (CAT)6, i.e. (TAC)6, was inserted immediately after the initiation codon. The restoration of expression in SKM3-Tyr shows that the suppressive effect of the AGG codon can be relieved, giving over-expression, by increasing the space between the initiation codon and the AGG codon (+11).

Similar experiments were conducted with another model protein, GFP. As is evident from Fig. 4B, minimal expression was observed in the GFP M3 and GFP M4 mutants compared with the parental GFP and the GFP M2 mutant. This shows that the presence of the ‘AGG’ codon at the +3 or +5 position of GFP plays an inhibitory role in protein synthesis. This result accords with the trend observed in SK expression.

Various investigators have inserted the ‘AGG’ codon at positions such as the 5th, 10th, 13th and 203rd [11,15,22], but they reported that the insertion of more than three consecutive ‘AGG’ codons was inhibitory for test protein expression. Our study, however, reveals that the insertion of even one ‘AGG’ codon at a position close to ‘ATG’, namely +3 or +5, was detrimental to target protein expression (SK and GFP). The AGG codon at the +2 position, in contrast, did not show a negative effect in either protein.

The AGG codon and its differential influence on heterologous protein expression were noticed when it was placed at adjacent positions (+2, +3 and +5) of the target genes. Guarneros (2003) [25] stated in an earlier report that peptidyl-tRNA drop-off at the AGG/AGA codon caused the inhibition of protein synthesis. But his later investigations [26] report that the peptidyl-tRNA drop-off was not due to the AGG/AGA codon, since drop-off is noticed even with other arginine codons following the AGA codon. In addition, the author reported that the inhibition caused by the AGA codon at the +2 position was overcome by substituting favorable codons such as AAA/CUA at the +3 position. Our previous publication [17] also strongly supports the view that the synergistic effect of CAT codons overcame the inhibitory effect of the NGG codon. Although considerable investigation has been done to find the rationale behind AGG/AGA codon inhibition, those studies could not find a satisfactory correlation between the presence of an AGG/AGA codon at an early position of the gene and its effect on protein expression. Therefore, peptidyl-tRNA drop-off may not be the determining factor that inhibits target protein synthesis in our case.

Further, Guarneros (2007) [26] reports that the common arginine codons (CGC/CGU) gave low expression of β-Gal compared with the AGG and AGA codons at early positions. This suggests that codons have an indirect impact on protein expression, because the codons make up the nucleotide sequence of the gene. The nucleotide sequence in turn influences target protein expression by forming mRNA secondary structure. Thus a change of codons alters the nucleotide sequence, which changes the mRNA secondary structure and hence modulates target protein expression. We could therefore speculate that the over-expression seen with the AGG codon at the +2 position is reversed at the +3 and +5 positions because of a change in mRNA secondary structure.

By virtue of its position, AGG creates a Shine-Dalgarno-like sequence when it is at the +3 and +5 positions, whereas AGG at +2 does not. The de novo generation of an SD-like sequence downstream of the start codon might be responsible for the negative effect on expression. In accordance with our result, previous investigations have also shown that the presence of an SD-like sequence in the downstream box lowers expression [27,28]. Such sequences are known to increase the binding of 30S ribosomes to themselves rather than to the real SD, stopping translation [28]. Therefore, in our case the de novo generation of a Shine-Dalgarno-like sequence, together with the resulting secondary structure of the mRNA, might be responsible for the down-regulation of protein synthesis.

In conclusion, though E. coli treats many codons as rare, not all rare codons significantly affect heterologous protein expression; instead, the priority among those codons (such as AGG), as well as the position at which they are present in a target gene, was found to affect recombinant protein expression.

Acknowledgments

We thank Dr. K.J. Mukherjee, Centre for Biotechnology, Jawaharlal Nehru University, for generously gifting the streptokinase gene. C. Thangadurai is a recipient of a Council of Scientific and Industrial Research (CSIR) Fellowship, India.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.bbrc.2008.09.024.

References

[1] G. Georgiou, Expression of proteins in bacteria, in: J.L. Cleland, C.S. Craik (Eds.),Protein Engineering: Principles and Practice, Wiley Liss, New York, 1996, pp.101–127.

[2] S.C. Makrides, Strategies for achieving high-level expression of genes in Escherichia coli, Microbiol. Rev. 60 (1996) 512–538.

[3] H. Chen, Z. Xu, P. Cen, High-level expression of human beta-defensin-2 gene with rare codons in E. coli cell-free system, Protein Pept. Lett. 13 (2006) 155–162.

[4] L. Peng, Z. Xu, X. Fang, F. Wang, S. Yang, P. Cen, Preferential codons enhancing the expression level of human β-defensin-2 in recombinant Escherichia coli, Protein Pept. Lett. 11 (2004) 339–344.

[5] X. Wu, U. Oppermann, High-level expression and rapid purification of rare-codon genes from hyperthermophilic archaea by the GST fusion system, J. Chromatogr. B. 786 (2002) 177–185.

[6] S. Kim, S.B. Lee, Rare codon cluster at 5′ end influence heterologous expression of archaeal gene in Escherichia coli, Protein Expr. Purif. 50 (2006) 49–57.

[7] H. Ohno, H. Sakai, T. Washio, M. Tomita, Preferential usage of some minor codons in bacteria, Gene 276 (2001) 107–115.

[8] N. Puri, K.B.C. Appa Rao, S. Menon, A.K. Panda, G. Tiwari, L.C. Garg, S.M. Totey, Effect of the codon following the ATG start site on the expression of ovine growth hormone in Escherichia coli, Protein Expr. Purif. 17 (1999) 215–223.

[9] C.M. Stenstrom, L.A. Isaksson, Influences on translation initiation and early elongation by the messenger RNA region flanking the initiation codon at the 3′ side, Gene 288 (2002) 1–8.

[10] C.M. Stenstrom, E. Holmgren, L.A. Isaksson, Cooperative effects by the initiation codon and its flanking regions on translation initiation, Gene 273 (2001) 259–265.

[11] S. Varenne, D. Baty, H. Verhelji, D. Shire, C. Lazdunski, The maximum rate of gene expression is dependent on the downstream context of unfavorable codons, Biochemistry 71 (1989) 1221–1229.

[12] T. Sato, M. Terabe, H. Watanabe, T. Gojobori, C. Hori-Takemoto, K. Miura, Codon and base biases after initiation codon of the open reading frames in the Escherichia coli genome and their influence on translation efficiency, J. Biochem. 129 (2001) 851–860.

[13] E.I. Gonzalez de Valdivia, L.A. Isaksson, A codon window in mRNA downstream of the initiation codon where NGG codons give strongly reduced gene expression in Escherichia coli, Nucleic Acids Res. 32 (2004) 5198–5205.

[14] S.-I. Aota, T. Gojobori, F. Ishibashi, T. Maruyama, T. Ikemura, Codon usage tabulated from GenBank sequence data, Nucleic Acids Res. 16 (Suppl.) (1988) r315–r402.

[15] G.-F.T. Chen, M. Inouye, Suppression of the negative effect of minor arginine codons on gene expression; preferential usage of minor codons within the first 25 codons of the Escherichia coli genes, Nucleic Acids Res. 18 (1990) 1465–1473.

[16] M. Robinson, R. Lilley, S. Little, J.S. Emtage, G. Yarranton, P. Stephens, A. Millican, M. Eaton, G. Humphreys, Codon usage can affect efficiency of translation of genes in Escherichia coli, Nucleic Acids Res. 12 (1984) 6663–6671.

[17] V.R. Pai, V. Murugan, A synergistic effect of suppressive CGG codon in +2 position and downstream CAT repeats for efficient heterologous protein expression in Escherichia coli, Biochem. Biophys. Res. Commun. 371/3 (2008) 380–384.

[18] M. Fuhrmann, A. Hausherr, L. Ferbitz, T. Schodl, M. Heitzer, P. Hegemann, Monitoring dynamic expression of nuclear genes in Chlamydomonas reinhardtii by using a synthetic luciferase reporter gene, Plant Mol. Biol. 55 (2004) 869–881.

[19] M. Mandel, A. Higa, Calcium dependent bacteriophage DNA infection, J. Mol. Biol. 53 (1970) 159.

[20] U.K. Laemmli, Cleavage of the structural proteins during the assembly of the head of bacteriophage T4, Nature 227 (1970) 680–685.

[21] A.M. Baca, W.G.J. Hol, Overcoming codon bias: a method for high-level overexpression of Plasmodium and other AT-rich parasite genes in Escherichia coli, Int. J. Parasitol. 30 (2000) 113–118.

[22] A.H. Rosenberg, E. Goldman, J.J. Dunn, F.W. Studier, G. Zubay, Effects of consecutive AGG codons on translation in Escherichia coli, demonstrated with a versatile codon test system, J. Bacteriol. 175 (1993) 716–722.

[23] J. Martin-Farmer, G.R. Janssen, A downstream CA repeat sequence increases translation from leadered and unleadered mRNA in Escherichia coli, Mol. Microbiol. 31 (1999) 1025–1038.

[24] A.K. Mohanty, M.C. Wiener, Membrane protein expression and production: effects of polyhistidine tag length and position, Protein Expr. Purif. 33 (2004) 311–325.

[25] J.J. Olivares-Trejo, G. Bueno-Martinez, G. Guarneros, J. Hernandez-Sanchez, The pair of arginine codons AGA AGG close to the initiation codon of the lambda int gene inhibits cell growth and protein synthesis by accumulating peptidyl-tRNAArg4, Mol. Microbiol. 49 (2003) 1043–1049.

[26] E. Zamora-Roma, L.R. Cruz-Vera, S. Vivanco-Dominguez, M.A. Magos-Castro, G. Guarneros, Efficient expression of gene variants that harbour AGA codons next to the initiation codon, Nucleic Acids Res. 35 (2007) 5966–5974.

[27] C.A. Valentene, D.M.F. Prazers, J.M.S. Cabral, G.A. Monteiro, Translation features of human α2b interferon production in Escherichia coli, Appl. Environ. Microbiol. 70 (2004) 5033–5036.

[28] M.V. Mawn, M.J. Fournier, D.A. Tirrell, T.L. Mason, Depletion of free 30S ribosomal subunits in Escherichia coli by expression of RNA containing Shine-Dalgarno-like sequences, J. Bacteriol. 184 (2002) 494–502.