Intron bypass: a rapid procedure for eliminating introns from cloned genomic DNA and its application...

Volume 12 Number 14 1984 Nucleic Acids Research

Intron bypass: a rapid procedure for eliminating introns from cloned genomic DNA and its application to a cellulase gene

Laura F. Steel, Thomas E. Ward and Allan Jacobson

Department of Molecular Genetics and Microbiology, University of Massachusetts Medical School, Worcester, MA 01605, USA

Received 17 April 1984; Revised and Accepted 20 June 1984

ABSTRACT

We have devised a DNA cloning procedure in which the introns present in a genomic DNA fragment can be eliminated easily and rapidly. The technique combines the methods of cDNA and genomic cloning in a way which assures full-length representation of the intron-free transcript. Moreover, plasmids made by this technique can be designed to contain flanking untranscribed regions which may play a role in the regulation of expression. One strand of a linearized plasmid containing the 3'-end of a gene is used to prime cDNA synthesis from an annealed mRNA template. A second plasmid containing the 5'-end of the gene is linearized, denatured, and annealed to the extended 3'-end molecules and the resulting circular, partial duplexes are used to transform bacterial cells. Two different recombinant plasmids which contain DNA encoding the cellulase, exocellobiohydrolase I, from Trichoderma reesei have been constructed using this method. They both contain the entire translated region of the gene uninterrupted by introns. One plasmid contains additional DNA at the 5'-end, including approximately 150 bp 5' to the start of transcription. The inserts of both plasmids can be excised in one piece.

INTRODUCTION

The cellulases are a family of enzymes which act together to digest cellulose to its component oligo- and mono-saccharides (1). These enzymes have potential value in a variety of applications and there has been considerable interest in the isolation and characterization of the genes which encode them (2-6). The cloning and sequencing of a gene for the most abundant cellulase of the fungus Trichoderma reesei, exocellobiohydrolase I (CBHI), has been reported recently (7). The gene encodes a messenger RNA of approximately 1800 bases and contains two introns of 69 and 63 base pairs. These introns must be removed before the gene can be expressed from recombinant plasmids in bacterial or yeast hosts.
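As an illustration of what intron removal means at the sequence level, the following minimal sketch deletes annotated introns from a genomic sequence and checks the length bookkeeping implied by the figures quoted above. The toy sequence, the coordinates, and the function name are hypothetical and chosen only for illustration; only the intron lengths (69 and 63 bp) and the approximate mRNA length (1800 bases) come from the text.

```python
def splice(genomic, introns):
    """Return the sequence left after removing each (start, end) intron.

    Coordinates are 0-based, end-exclusive, and must not overlap.
    """
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(genomic[pos:start])  # keep the exon before this intron
        pos = end                         # skip the intron itself
    exons.append(genomic[pos:])           # keep the final exon
    return "".join(exons)


# Toy gene (hypothetical): exons in upper case, introns in lower case.
toy_gene = "ATGGCC" + "gtaagt" + "TTCGGA" + "gtatag" + "TGCTAA"
toy_introns = [(6, 12), (18, 24)]
spliced = splice(toy_gene, toy_introns)

# Length bookkeeping with the values quoted in the text.
mrna_len = 1800                # approximate mRNA length, bases
intron_lens = [69, 63]         # the two CBHI introns, bp
genomic_transcribed_len = mrna_len + sum(intron_lens)
```

Under these assumptions the transcribed genomic region is roughly 132 bp longer than the mature message; the estimate is only as precise as the approximate mRNA length.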

The problem of intron removal is common to many efforts to express eucaryotic genes in heterologous systems where correct processing of the transcript is not possible. One solution to the problem is to construct cDNA clones, but commonly used procedures often lead to loss of sequence at either the 3'- or 5'-end (or both) due to incomplete first or second strand synthesis and subsequent treatment with S1 nuclease (8). Protocols have been described which circumvent these problems to some extent, primarily by eliminating the need for S1 nuclease digestion of the ds-cDNA insert at any stage of the procedure (9-12). While they remain the methods of choice where especially long cDNA inserts are required for the preparation of a library from a mixed population of messenger RNAs, they require many steps and still may lead to slightly truncated cDNAs. An alternative solution is to "cut and paste", using cloned cDNA fragments to provide sequence spanning the intron junctions and cloned genomic DNA fragments to provide the remaining sequence. Depending on the length of the transcript, the number of introns, and the nature of the cDNA clones available, this approach can be laborious and time-consuming.

© IRL Press Limited, Oxford, England.

Here we report a procedure by which a cloned segment of genomic DNA is rendered intron-free easily and rapidly. In contrast to conventional cDNA cloning methods, there is no loss of sequence due to incomplete copying. In addition, flanking untranscribed genomic sequence which may play a regulatory role in the expression of the gene can be retained in association with the transcribed region.

We have used this method to construct two different plasmids which encode T. reesei CBHI. The T. reesei DNA insert in one plasmid begins approximately 150 base pairs 5' to the presumed start of CBHI transcription. In the other plasmid the insert begins at a point 17 base pairs upstream from the start of CBHI translation. Both inserts have the same 3' terminus at the end of the transcribed region.

MATERIALS AND METHODS

Materials

Trichoderma reesei, strain L27, and the CBHI genomic subclones pCBH-157 and pCBH-164 were supplied by S. Shoemaker (Cetus Corp.). Restriction enzymes were obtained from Boehringer Mannheim or New England Biolabs and used according to the manufacturers' specifications. Reverse transcriptase was from Life Sciences, Inc. DNA polymerase I (endonuclease-free) and nuclease S1 were purchased from Boehringer Mannheim. Terminal transferase was obtained from Ratliff Biochemicals, Inc. Oligo(dT)-cellulose was from Collaborative Research. Low melting temperature agarose was from Bio-Rad. α-32P-dCTP was purchased from Amersham Corp. and unlabeled nucleotide triphosphates were from P-L Biochemicals. Avicel (Type PH 105) was the gift of FMC Corp.

RNA Isolation

T. reesei, strain L27, was grown to mid-log phase in broth containing either glycerol or Avicel as a carbon source. CBHI synthesis is induced by growth in Avicel, but is undetectable during growth in glycerol (13). Cells were harvested after 25 hours of growth and frozen in liquid nitrogen. RNA was isolated by a modification of the method of Chirgwin et al. (14). Frozen cells were ground in a mortar and pestle and homogenized at room temperature in a solution containing 5.0 M guanidine thiocyanate, 0.05 M Tris-HCl, pH 7.5, 0.01 M EDTA, 0.5% sarcosyl, 2.0% β-mercaptoethanol, and 0.3% antifoam A. The homogenate was centrifuged to remove cellular debris and 0.75 volume of 95% ethanol was added to the supernatant. After incubation for at least 1 hour at -20°C, the precipitate was collected by centrifugation, rinsed with 95% ethanol, dried under vacuum, and resuspended in sterile H2O containing 10 mM vanadyl-adenosine complex (15). The solution was extracted 3-4 times with phenol:chloroform:isoamyl alcohol (25:24:1) and again ethanol precipitated and resuspended in sterile H2O. At each resuspension step in the procedure, there was a considerable amount of insoluble material which was removed when necessary by chilling the suspension on ice and centrifuging it for 15 minutes at 27,000g. The messenger RNA fraction was purified by chromatography on oligo(dT)-cellulose (16). Poly(A)+ RNA eluted from the column was concentrated by ethanol precipitation, resuspended in sterile H2O, and stored frozen at -80°C.

Construction of the 3'-end cDNA clone, pcCBH-16F

Poly(A)+ RNA was isolated from cells induced for CBHI production and used in the construction of a cDNA library. cDNA was prepared by reverse transcription of oligo(dT)-primed RNA and made double-stranded using DNA polymerase I (17). After treatment with S1 nuclease, the ds-cDNA was tailed with (dC) residues and annealed with PstI cut and (dG)-tailed pUC8 DNA (18). The annealed DNA was used to transform (19) E. coli strain JM83 (18) and the transformants harboring insert-containing plasmids were screened for the presence of CBHI DNA by the filter hybridization technique of Grunstein and Hogness (20). Nick-translated (21) insert from the CBHI genomic subclone pCBH-164 (7) was used as a probe. Several clones were isolated and characterized by comparison of their restriction maps to the known CBHI genomic DNA sequence (7). pcCBH-16F contains an insert of approximately 950 bp and extends almost completely to the 3'-end of the transcribed region of the gene.

Construction of the 5'-end subclones, p157-HR and p157-BR

Two 5'-end subclones of the CBHI gene were made from the clone pCBH-157 which contains a T. reesei genomic HindIII fragment inserted in the HindIII site of pBR322 (7). It encodes the 5' half of CBHI, including the first intron, as well as approximately 150 bp 5' to the start of transcription. pCBH-157 was digested with HincII and EcoRI and the resulting 420 bp fragment was isolated by electrophoresis on a 1% low melting temperature agarose gel followed by CETAB/butanol extraction (22). Similarly, a 957 bp fragment was isolated from pCBH-157 digested with BamHI and EcoRI. BamHI cuts within pBR322 and the fragment contains 346 bp of pBR322 DNA in addition to 611 bp of the CBHI gene. The isolated fragments were ligated separately with HincII and EcoRI digested pUC9 or BamHI and EcoRI digested pUC9 (18) and the ligation products were used to transform E. coli strain JM83. Plasmid DNA was prepared from overnight cultures of several individual transformants (24) and checked by restriction analysis for the appropriate size and orientation of the cloned fragments in pUC9.

Elongation of the 3'-end clone

The 3'-end cDNA clone, pcCBH-16F, was digested with EcoRI. The 3500 bp fragment was gel purified and 0.2 ug was combined separately with 10 ug of poly(A)+ RNA isolated from cells either induced or not induced for CBHI production. We estimate that there is a 2-fold excess of CBHI mRNA to its complementary sequence in the plasmid in the reaction with induced RNA. The combined RNA and DNA were ethanol precipitated and resuspended at a DNA concentration of 10 ug/ml in 80% formamide, 0.4 M NaCl, 10 mM Pipes, pH 7.0. The mixture was heated to 80°C for 5 minutes to dissociate the DNA and then allowed to anneal at 57°C, 54°C, 52°C, and 49°C, for 45 minutes at each temperature. Calculations based on the known G+C content of the DNA (7) and the annealing reaction conditions indicated that the optimal temperature for RNA/DNA hybrid formation should be 53°C (25).
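The text does not say how the 2-fold excess was estimated. The sketch below simply works backwards from the stated quantities to show what that excess would imply about the abundance of CBHI mRNA in the induced poly(A)+ RNA. The average residue masses (about 660 g/mol per base pair of duplex DNA and about 340 g/mol per RNA nucleotide) are textbook approximations assumed here, not values from the paper, and the function names are hypothetical.

```python
# Average molar masses per residue (textbook approximations, assumed here).
DS_DNA_G_PER_MOL_PER_BP = 660.0
RNA_G_PER_MOL_PER_NT = 340.0


def pmol_dsdna(mass_ug, length_bp):
    """Picomoles of duplex DNA molecules in mass_ug micrograms."""
    return mass_ug * 1e-6 / (length_bp * DS_DNA_G_PER_MOL_PER_BP) * 1e12


def rna_mass_ug(pmol, length_nt):
    """Micrograms of RNA corresponding to pmol picomoles of molecules."""
    return pmol * 1e-12 * length_nt * RNA_G_PER_MOL_PER_NT * 1e6


# 0.2 ug of the 3500 bp fragment; each duplex carries one priming strand.
plasmid_pmol = pmol_dsdna(0.2, 3500)
# The 2-fold molar excess of CBHI mRNA stated in the text.
cbhi_mrna_pmol = 2 * plasmid_pmol
# Mass of that much 1800-base mRNA, and its share of the 10 ug of poly(A)+ RNA.
cbhi_mrna_ug = rna_mass_ug(cbhi_mrna_pmol, 1800)
implied_fraction = cbhi_mrna_ug / 10.0
```

With these assumed masses, the reaction contains a little under 0.1 pmol of priming strand, and the stated excess corresponds to CBHI mRNA making up on the order of 1% of the induced poly(A)+ RNA by mass.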

After hybridization, the reaction was chilled on ice, diluted with cold 0.3 M sodium acetate, and ethanol precipitated twice to ensure removal of the formamide. Following the second ethanol precipitation, the RNA/DNA hybrids were resuspended for reverse transcription in a total volume of 100 ul of 50 mM Tris-HCl, pH 8.3, 10 mM MgCl2, 70 mM KCl, 25 mM dithiothreitol, 500 uM dATP, 500 uM dGTP, 500 uM dTTP, 500 uM dCTP, 240 units reverse transcriptase, and 50 uCi α-32P-dCTP (800 Ci/mmole). The reaction was incubated for 90 minutes at 42°C and then stopped by the addition of EDTA to a concentration of 20 mM.
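The nucleotide amounts in this reaction follow directly from the stated concentrations. The short sketch below works them out, assuming (this is not stated explicitly) that the labeled dCTP is added on top of the 500 uM unlabeled dCTP. The helper names are hypothetical.

```python
def nmol_in(volume_ul, conc_uM):
    """Nanomoles of solute in volume_ul microliters at conc_uM micromolar."""
    return conc_uM * volume_ul * 1e-3   # uM x ul gives pmol; /1000 gives nmol


def labeled_nmol(activity_uCi, specific_activity_Ci_per_mmol):
    """Nanomoles of labeled nucleotide for a given amount of radioactivity."""
    mmol = activity_uCi * 1e-6 / specific_activity_Ci_per_mmol
    return mmol * 1e6                   # mmol to nmol


cold_dctp = nmol_in(100, 500)           # unlabeled dCTP in the 100 ul reaction
hot_dctp = labeled_nmol(50, 800)        # labeled dCTP from 50 uCi at 800 Ci/mmole
hot_fraction = hot_dctp / (cold_dctp + hot_dctp)
```

Under that assumption each unlabeled nucleotide is present at 50 nmol, while the label contributes only about 0.06 nmol of dCTP, so roughly one dCTP molecule in 800 is radioactive.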

RNA was hydrolyzed and double-stranded structures dissociated by the addition of NaOH to 50 mM and incubation for 1 hour at 65°C. The remaining DNA, including the elongated, single-stranded 3'-end clone, was separated from the reverse transcription reaction components and the hydrolyzed RNA by centrifuging the reaction through a Sephadex-G50 column made in a 1 ml syringe and equilibrated with 10 mM Tris-HCl, pH 8.0, 0.1 mM EDTA. The extent of elongation was evaluated by electrophoresis of the reaction products in a 1% agarose gel made in 30 mM NaOH, 2 mM EDTA (26).

Annealing of the elongated 3'-end clone with the 5'-end subclones

Both of the 5'-end subclones were linearized by digestion with EcoRI and gel purified as described above. The products of both elongation reactions were divided in half and each half was combined with an equimolar amount of one of the linearized 5'-end subclones. The combined DNAs were ethanol precipitated and resuspended for annealing in 20 ul of 4X SSC, resulting in a final DNA concentration of 10 ug/ml. (1X SSC is 0.15 M NaCl, 0.015 M sodium citrate.) The DNA was heated to 100°C for 2 minutes and then allowed to anneal for 2-3 hours at 65°C.

Transformation

An aliquot of the annealing reaction (5-10 ul) was diluted with 75 ul of 10 mM Pipes, pH 7.0, 10 mM CaCl2, 10 mM MgCl2 and added to 100 ul of transformation competent JM83 cells. Cells were treated for transformation (19) and plated for growth on LB agar (27) containing 150 ug/ml of ampicillin.

RESULTS

This construction utilizes cloned DNA fragments to provide the 5'- and 3'-ends of a gene, and the mature mRNA to provide intron-free sequence information bridging the two ends. The resulting DNA molecule contains full-length and uninterrupted coding sequence, plus as much or as little flanking genomic sequence as desired on either end.

The generalized procedure is outlined in Figure 1. Several variations of this general scheme are possible and will be discussed later. To start, a DNA fragment which carries sequence complementary to the 3'-end of the mRNA, but contains no intron-derived sequence, is subcloned in an appropriate vector. (Throughout this paper the designations 3' and 5' will refer to the orientation of the mRNA.) This subclone is linearized by cutting so that the 3'-end of the insert remains attached to one end of the vector, and all insert sequence is removed from the other end of the vector. The cut plasmid is dissociated, hybridized with mRNA, and used to prime reverse transcription of the mRNA which has hybridized to the 3'-end insert. The RNA is then hydrolyzed and annealed structures are dissociated by treatment with base.
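The priming step described above can be pictured with a toy model. In the sketch below a short DNA strand complementary to the 3' region of an mRNA is extended by copying the template 5' of the annealing site, which is what reverse transcription does to the insert strand of the linearized 3'-end subclone. The sequences and function names are hypothetical; vector sequence, which would lie 5' of the primer on the same strand, is omitted for brevity.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement_dna(seq):
    """DNA reverse complement (5'->3') of an RNA or DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_primer(mrna, primer):
    """Extend a DNA primer annealed to an mRNA by reverse transcription.

    The primer must be exactly complementary to a stretch of the mRNA.
    Synthesis adds to the primer's 3' end, copying the template that lies
    5' of the annealing site. Returns the extended strand (5'->3'), or
    None if the primer does not anneal.
    """
    target = reverse_complement_dna(primer).replace("T", "U")  # site on mRNA
    site = mrna.rfind(target)
    if site < 0:
        return None
    upstream = mrna[:site]                 # template 5' of the primer site
    return primer + reverse_complement_dna(upstream)


toy_mrna = "AUGGCCUUCGGAUGCUAA"            # hypothetical spliced message
toy_primer = "TTAGCATC"                    # complementary to its 3' region
extended = extend_primer(toy_mrna, toy_primer)
```

A fully extended strand is simply the complement of the whole message, so it carries no intron sequence regardless of what the genomic copy contains.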

A subclone containing sequence from the 5'-end of the gene is prepared in a way analogous to the 3'-end subclone. It may contain any amount of sequence 5' to the start of transcription, but cannot extend into the coding region beyond the start of the first intron. The insert must have the same orientation in the vector as the insert in the 3'-end subclone. The 5'-end subclone is linearized so that the 5'-end of the insert remains attached to one end of the vector and all insert sequence is removed from the other end.

Figure 1. A general scheme for the intron bypass procedure. See text for details.

This linearized subclone is then dissociated and allowed to reanneal with the elongated, dissociated 3'-end subclone. In cases where reverse transcription has extended the 3'-end subclone far enough, there will be a region of complementary overlap with the 5'-end insert, and circular partial duplexes can form. These circular molecules are used to transform bacterial cells, and repair of the single-stranded gaps is allowed to occur in vivo.
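The condition for circle formation can be stated as a simple interval check. The sketch below is a deliberately simplified model: positions are counted in bases from the 5' end of the mRNA, the coordinates in the example are hypothetical, and the minimum overlap parameter is an assumption, since the text gives no threshold for stable annealing.

```python
def overlap_with_5prime_insert(extended_to, insert_3prime_end):
    """Bases of overlap between the elongated strand and the 5'-end insert.

    Both arguments are positions on the mRNA, counted from its 5' end.
    Reverse transcription proceeds toward position 0, so extended_to is
    the most 5' position reached; the 5'-end insert covers the mRNA from
    position 0 up to insert_3prime_end.
    """
    return max(0, insert_3prime_end - extended_to)


def can_circularize(extended_to, insert_3prime_end, min_overlap=1):
    """True if the strands share at least min_overlap complementary bases."""
    overlap = overlap_with_5prime_insert(extended_to, insert_3prime_end)
    return overlap >= min_overlap


# Hypothetical example: the 5'-end insert stops 400 bases into the message.
full_length = can_circularize(extended_to=0, insert_3prime_end=400)
stalled = can_circularize(extended_to=900, insert_3prime_end=400)
```

In this picture a strand whose synthesis stops short of the 5'-end insert cannot close a circle, which is why incompletely copied molecules are not expected to give rise to transformants carrying truncated inserts.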

A more detailed description of this procedure as it was applied to the construction of two different CBHI clones is presented below. The isolation and sequencing of a genomic DNA fragment encoding CBHI has been described (7). Features of the CBHI map relevant to the constructions described here are shown in Figure 2.

Figure 2. Restriction map of the T. reesei CBHI gene and the locations of regions included in the subclones. The heavy arrow indicates the direction and approximate extent of transcription of CBHI mRNA. The translational start and stop codons and the positions of the two introns are also shown (from ref. 7). The genomic regions included in the two 5'-end subclones, p157-HR and p157-BR, and in the cDNA clone, pcCBH-16F, are indicated below the map.

Preparation of the 3'-end subclone

The 3'-end subclone should contain as much transcribed sequence as possible, but cannot include any intron-derived sequence. It was convenient to use a cDNA clone rather than a genomic subclone for the CBHI plasmid constructions. The clone pcCBH-16F contains an insert of approximately 950 bp originating from the extreme 3'-end of the transcribed region of the gene. The sequence included in pcCBH-16F is indicated in Figure 2 and its orientation with respect to the lac promoter and polylinker sites in the vector pUC8 is shown in Figure 3a.

Since the 3'-end clone was to be used as a primer in extending sequence to the 5'-end of the mRNA, it had to be linearized in a way that eliminated the (dG):(dC) homopolymer tail at the 5' side of the insert. By digesting pcCBH-16F with EcoRI, it was possible to eliminate the 5' homopolymer tail while retaining approximately 800 bp of CBHI sequence attached at its 3' end to the vector. It is also important that EcoRI digestion leaves no CBHI sequence attached to the other end of the vector which might otherwise hybridize to the mRNA and interfere with reverse transcription.

Preparation of the 5'-end subclones

In order to have the option of studying the expression of CBHI either from its own promoter or from heterologous promoters, we elected to construct two different 5'-end subclones. The subclone p157-HR contains a 420 bp HincII-EcoRI fragment inserted into HincII and EcoRI digested pUC9. The fragment contains sequence beginning 17 bp upstream from the start of translation of CBHI and extending into the coding region to a site 110 bp 5' to the start of the first intron (see Figure 2). The second 5'-end subclone, p157-BR, contains a 957 bp insert in BamHI and EcoRI digested pUC9 and includes all of the 5' genomic sequence up to the HindIII site in the original genomic clone. By using a BamHI-EcoRI fragment (which includes some pBR322 sequence), rather than a HindIII-EcoRI fragment (containing only T. reesei DNA), we were able to retain a PstI site in the 5' polylinker (see Figure 7a). p157-BR terminates at the same EcoRI site 110 bp 5' to the start of the first intron as does p157-HR. The orientation of the p157-HR and p157-BR inserts in pUC9 is shown in Figures 3b and 3c. Both 5'-end subclones were linearized at their unique EcoRI site at the 3' junction of CBHI and pUC9 sequence.

Figure 3. Restriction maps of pcCBH-16F, p157-HR, and p157-BR. a) pcCBH-16F. (dC)-tailed ds-cDNA was inserted into PstI digested and (dG)-tailed pUC8. b) p157-HR. A 420 bp HincII-EcoRI fragment from the genomic CBHI clone, pCBH-157 (7), was inserted into HincII and EcoRI digested pUC9. c) p157-BR. A 957 bp BamHI-EcoRI fragment from the genomic CBHI clone, pCBH-157, was inserted into BamHI and EcoRI digested pUC9. In all cases, inserted DNA is drawn as an open bar and pUC vector DNA as a solid line. pBR322 sequence present in p157-BR is represented by stippling. The location and the direction of transcription from the pUC lac promoter are indicated. Only those polylinker sites relevant to the constructions illustrated here are indicated, although others are present surrounding the inserts. The large arrow indicates the 5' to 3' orientation of transcription of CBHI mRNA.

Figure 4. Size of DNA produced in the elongation reaction. The material present in each elongation reaction was treated with base and partially purified by Sephadex G-50 chromatography. An aliquot of each reaction was electrophoresed in a 1% agarose gel made in 30 mM NaOH and 2 mM EDTA. The gel was dried and autoradiographed by exposure to Kodak X-Omat AR film at -80°C for 7 hours with a DuPont Cronex intensifying screen. The size markers are 3'-end labeled, EcoRI digested pBR322, pcCBH-16F, and pUC8. a) pBR322; b) pcCBH-16F; c) pUC8; d) reaction with poly(A)+ RNA from cells induced for CBHI synthesis; e) reaction with poly(A)+ RNA from uninduced cells; f) reaction with no added RNA.

Elongation of the 3'-end clone

RNA was isolated from cells either induced or not induced for CBHI synthesis and purified by one passage over oligo(dT)-cellulose. The linearized 3'-end clone was dissociated and then annealed with total mRNA under conditions which favored the formation of RNA/DNA hybrids. The DNA was elongated by reverse transcription, utilizing the annealed mRNA as a template. Radiolabeled precursors were used in the reaction so the extent of elongation could be assayed by electrophoresis in a denaturing gel, and an autoradiogram of such a gel is shown in Figure 4. The fully elongated, single-stranded pcCBH-16F fragment should have a length of approximately 4500 bases (2700 bases of pUC8 and 1800 bases complementary to CBHI mRNA). A prominent band of DNA of that length was present in the elongation reaction products when the linearized plasmid was hybridized with RNA isolated from cells induced for CBHI (Figure 4, lane d). However, this band was absent when RNA from uninduced cells was used (Figure 4, lane e). A considerable amount of low molecular weight DNA was produced in both reactions, presumably primed from secondary structure present in the RNA population. Nothing was synthesized when RNA was omitted from the reaction (Figure 4, lane f).

Annealing of the elongated 3'-end clone to the 5'-end subclones

The products of the elongation reactions were treated with 50 mM NaOH to hydrolyze the RNA and dissociate any DNA/DNA duplexes which may have formed. The elongated single strands were combined in annealing buffer with an equimolar amount of one of the two linearized 5'-end subclones. The mixtures were heated to dissociate all duplex DNAs, and then allowed to reanneal at low DNA concentration to encourage circle formation when possible. The strands could reanneal in several different combinations, but only when a sufficiently elongated 3'-end subclone strand annealed to a strand of a 5'-end subclone could a circular structure be formed. Since linear molecules transform poorly (28), there is a strong selection for full-length inserts in the plasmids of resulting transformants.

Transformation and identification of colonies

The circular products of the annealing reaction are double stranded throughout the pUC vector sequence, but contain single-stranded regions within the CBHI coding sequence. We have found no advantage to filling in these single-stranded regions with PolI in vitro, and therefore the products of the annealing reaction were used for transformation without further treatment.

Plasmid DNA was prepared from the colonies which grew up after transformation with the annealed products of the reaction series initiated with RNA from induced cells. As judged by supercoil size, these colonies contained either one of the starting subclones (3'- or 5'-end) or a plasmid of the predicted size for the full-length construct. These plasmids have been named pcCBH-HR (made with p157-HR) and pcCBH-BR (made with p157-BR). Since CBHI mRNA was undetectable in uninduced cells, both by translation and by Northern blot analysis (not shown), and no elongated strand was observed when RNA from uninduced cells was annealed for reverse transcription (see Figure 4), transformants arising from uninduced RNA control reactions were not analyzed further.

Figure 5. Mismatch at the 5' and 3' insert-vector junctions. The mismatch which results from annealing a strand of the 5'-end subclone (in pUC9) with a strand of the elongated 3'-end cDNA clone (in pUC8) is drawn in detail (pUC vector sequence is from ref. 18). The dashes indicate continuing T. reesei insert sequence.

In experiments where a large number of transformants were to be analyzed, the colonies were screened by colony filter hybridization prior to the preparation of plasmid DNA. A central fragment of the gene, not included in either the 3'- or 5'-end starting subclones, was used as a probe. Clones identified in this way all contained plasmids with a supercoil size which indicated that they contained full-length inserts. In various experiments using the intron bypass procedure, we have obtained from 1 to 350 pcCBH transformants per ug of starting poly(A)+ RNA. These represented from 1% to 72% of the total transformants.

Restriction analysis of the plasmids

Plasmids from several individual transformants were analyzed by restriction digestion in order to assess the fidelity of the in vitro reverse transcription reaction as well as the in vivo filling-in reaction.

The 3'- and 5'-end subclones were constructed in the vectors pUC8 and pUC9, respectively. This leads to a small amount of mismatch at the ends of the strands when the single-stranded subclone fragments are annealed with each other, as shown in Figure 5. It was of particular interest to know if this mismatch was resolved after transformation in a way that preserves the restriction sites in these areas. The results of digestion of ten different isolates of pcCBH-BR are shown in Figures 6a and 6b. Comparison with the predicted map (Figure 7a) shows that all sites for PstI, HindIII, and BamHI are intact. Similarly, predicted sites for PstI and HindIII have been retained in the pcCBH-HR plasmids (Figures 6c and 7b).

In the construction of both series of plasmids, DNA synthesis is initiated from an EcoRI site, both in the original elongation reaction with reverse transcriptase and in the filling-in of one of the single-stranded regions after transformation. The presence of these EcoRI sites was also analyzed and the results are shown in Figures 6b and 6d. In 9 out of 10 of the pcCBH-BR plasmids and all 10 of the pcCBH-HR plasmids, these sites are present as predicted. In one pcCBH-BR isolate, however, one of the EcoRI sites has been lost. Analysis of an additional 8 pcCBH-BR isolates indicates that this is a rare mistake, as all 8 retained both EcoRI sites (not shown).

Figure 6. Restriction analysis of ten individual isolates of pcCBH-BR and pcCBH-HR. Plasmid DNA was prepared from cells by a rapid mini-prep procedure. DNA was digested with restriction enzymes and electrophoresed in 1% agarose gels made in 10 mM sodium acetate, 1 mM EDTA, 40 mM Tris-HCl, pH 7.8. The gels were stained with ethidium bromide. HindIII digested λ DNA (BRL, Inc.) was used as size markers. a) pcCBH-BR isolates digested with PstI (left) and HindIII (right); b) pcCBH-BR isolates digested with EcoRI (left) and BamHI (right); c) pcCBH-HR isolates digested with PstI (left) and HindIII (right); d) pcCBH-HR isolates digested with EcoRI.

In order to check for small deletions and rearrangements on a finer scale, the plasmids were digested with both HincII and SstI and the resulting fragments were analyzed by electrophoresis in polyacrylamide gels. The results shown in Figure 8 indicate that, at this order of resolution, no such deletions or rearrangements can be detected.

Figure 7. Restriction maps of the full-length, intron-free plasmids, pcCBH-BR and pcCBH-HR. a) pcCBH-BR. b) pcCBH-HR. Vector and insert DNA are indicated as in Figure 3. Regions of the inserted DNA denoted by slashes are those present in either the 3'- or 5'-end subclones. All remaining polylinker sites surrounding the inserts are indicated. Fragment sizes expected from digestion with the indicated restriction enzymes are shown below the map and are derived from the CBHI gene sequence (7).
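The comparison of observed digests with a predicted map can be expressed as a small calculation. The sketch below computes the fragment sizes expected from complete digestion of a circular plasmid and compares them with observed sizes. The plasmid length and site positions used in the example are hypothetical and are not taken from Figure 7; the function names and the size tolerance are likewise assumptions made for illustration.

```python
def circular_fragments(plasmid_len, cut_sites):
    """Fragment lengths from complete digestion of a circular plasmid.

    cut_sites are positions (0 <= pos < plasmid_len) on an arbitrary
    coordinate system. With no sites the circle stays uncut and an empty
    list is returned; one site gives a single full-length linear molecule.
    """
    sites = sorted(set(cut_sites))
    if not sites:
        return []
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    fragments.append(plasmid_len - sites[-1] + sites[0])  # wrap-around piece
    return sorted(fragments, reverse=True)


def sites_intact(observed, expected, tolerance=0):
    """Compare observed and expected fragment sizes within a bp tolerance."""
    if len(observed) != len(expected):
        return False
    pairs = zip(sorted(observed), sorted(expected))
    return all(abs(o - e) <= tolerance for o, e in pairs)


# Hypothetical 5000 bp plasmid with two sites for one enzyme.
expected = circular_fragments(5000, [500, 2000])
# Loss of one of the two sites would leave a single linear molecule.
one_site_lost = circular_fragments(5000, [500])
```

In this model the loss of one of two sites shows up as a single full-length band in place of two fragments, which is the kind of difference the EcoRI digests described above are designed to reveal.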

We conclude from these data that the procedure leads to accurate cloning of sequence copied from the mRNA bridging the subcloned ends, apparently without deletions or rearrangements. As in other cDNA cloning procedures (29), errors may occur at a low frequency at some stage in the process.

DISCUSSIONMany potentially useful genes are isolated as genomic DNA fragments and

contain introns which must be removed before the encoded protein can beexpressed from recombinant plasmids in bacteria or yeast. We have developed a

DNA cloning procedure which utilizes the sequence infonnatlon in messenger RNAto bypass the introns found in genomic DNA and is an alternative to thedifficult task of making full-length, intron-free plasmids by conventional

5891

I

Nucleic Acids Research

a) b)

1857-1060 -

929-

383--

121

Figure 8. Double digestion of DNA from ten individual isolates of pcCBH-BRand pcCBH-HR. The same plasmid DNA preparations as used in Figure 6 weredigested with HincII and SstI and electrophoresed in 5X polyacrylamide gelsmade in 90 mM Tris, 90 mM boric acid, 2 mM EDTA. The gels were stained withethidium bromide. Size markers are BstNI digested pBR322. a) pcCBH-BRisolates digested with HincII and SstI; b) pcCBH-HR isolates digested withHincII and SstI.

techniques. The gene region cloned by this procedure is not limited to that found in the mature transcript, and flanking regulatory or processing sites can be included in the final plasmid. The procedure is rapid and was designed to entail a minimum amount of technical manipulation. It requires no purification of individual mRNAs or intermediate structures since products other than those drawn in Figure 1, which are formed at each step, are either eliminated or are inactive in succeeding reactions.

In using this procedure, considerable attention must be given to the construction of the starting 3'- and 5'-end subclones. They will determine the amount of flanking gene sequence, if any, included in the final plasmid as well as the arrangement of restriction sites surrounding the insert. With the CBHI clones, our goal was to generate plasmids with inserts which contained the entire protein coding region and were intron-free, but which could be moved to other vectors, eventually to maximize and control expression. For the pcCBH-BR clones, the 5'-end subclone was made from a BamHI-EcoRI fragment of the original genomic clone and included some pBR322 sequence. By choosing this fragment rather than the HindIII-EcoRI CBHI gene fragment, we were able to retain the PstI site in the 5' polylinker. There are no internal PstI


sites in the CBHI gene and the inserts in both the pcCBH-BR and pcCBH-HR plasmids are excisable in one piece with PstI. By constructing two different 5'-end subclones for CBHI, we were able to obtain clones which either retained the presumed CBHI promoter region or which began just 5' to the start of CBHI translation. Therefore, we expect that it will be possible to express CBHI either from its own promoter or from any other promoter of choice.
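Whether an insert can be excised in one piece with a given enzyme depends on the absence of internal recognition sites, which is straightforward to check once the sequence is known. The sketch below is illustrative only: the function names are hypothetical and the toy sequence is invented, not CBHI sequence. The recognition sequences listed are the standard ones for these enzymes; because all four are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
RECOGNITION_SITES = {
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def site_positions(sequence, site):
    """Return the 0-based start of every occurrence of site in sequence."""
    seq = sequence.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by 1 to allow overlaps
    return positions


def excisable_in_one_piece(insert_seq, enzyme):
    """True if the insert has no internal site for the flanking enzyme."""
    return not site_positions(insert_seq, RECOGNITION_SITES[enzyme])


# Invented toy insert: contains HindIII and EcoRI sites, but no PstI site.
toy_insert = "ATGGCTAGCAAGCTTGGTACCGAATTCTAA"

pst_ok = excisable_in_one_piece(toy_insert, "PstI")    # True
eco_ok = excisable_in_one_piece(toy_insert, "EcoRI")   # False
```

In practice the same scan would be run over the full insert sequence for every enzyme with sites in the flanking polylinkers, to identify those suitable for moving the insert to another vector.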

The starting subclones should also be designed to maximize the extent of sequence available for hybridization, initially for priming cDNA synthesis and, later in the procedure, for forming circles. By using a cDNA clone for the 3'-end starting clone, as we did for the CBHI construction, it may be possible to take advantage of a long region of complementarity for hybridizing to mRNA and priming the elongation reaction. If a genomic 3' region is used, it must be 3' to the last intron, and the region of complementarity might be considerably reduced. However, the use of cDNA clones does introduce the possibility of loss of some sequence at the extreme 3'-end of the transcribed region in the final plasmid, as noted earlier for conventional cDNA cloning procedures. The extent of overlap between the elongated strand from the 3'-end and the 5'-end insert strand (for circle formation prior to transformation) will depend on the length of the DNA and the restriction sites present in the 5'-end subclone.

Because of the particular order and location of restriction sites in the CBHI gene, we found it helpful to utilize the oppositely oriented polylinkers of pUC8 and pUC9 to make the 3'- and 5'-end subclones, respectively. The slight mismatch generated at the ends of the strands when these two vectors were annealed was repaired in vivo after transformation. In other applications of this procedure, it may be more advantageous to use the same vector for the 3'- and 5'-end subclones. In fact, with a fortuitous arrangement of restriction sites in the original genomic clone, separate subcloning of the two ends may be unnecessary. The sites must only be located so that all inserted genomic DNA sequence 3' of the beginning of the first intron can be removed from the vector by one set of restriction cuts, and all inserted sequence 5' of the end of the last intron can be removed by another set of cuts.

With an abundant RNA such as CBHI mRNA, the intron bypass procedure produced many transformants even though we did not maximize the efficiency of every step. Starting with 3.3 µg of poly(A) RNA, we were able to obtain, in one experiment, over 1600 total transformants, 72% of which contained pcCBH-HR plasmids. For less abundant RNAs it may be necessary to fractionate the mRNA


in some way (e.g. by size selection or hybrid selection) or to purify to some extent the annealed circular molecules in order to increase the efficiency of transformation. There is a considerable amount of material present after the final annealing step which may interfere with transformation. This includes the low molecular weight DNA produced during reverse transcription (see Figure 4) as well as the numerous double stranded products of the annealing reaction which cannot form circles. However, in our application, a large number of transformants was not as desirable as a high percentage of pcCBH-HR or pcCBH-BR transformants, and we were more concerned with reducing the background due to transformation by the starting 3'- or 5'-end subclones. This was accomplished by gel purification of the restriction digested subclones.
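The yield figures quoted above (3.3 µg of poly(A) RNA, over 1600 total transformants, 72 percent carrying pcCBH-HR) can be converted into per-microgram terms with simple arithmetic. The sketch below is only a worked calculation; the function name is hypothetical, and because the text reports "over 1600" transformants, the results should be read as lower bounds.

```python
def transformation_yield(total_transformants, fraction_desired, rna_micrograms):
    """Summarize transformant yield per microgram of input RNA."""
    desired = total_transformants * fraction_desired
    return {
        "desired_transformants": desired,
        "total_per_ug": total_transformants / rna_micrograms,
        "desired_per_ug": desired / rna_micrograms,
    }


# Values reported for the pcCBH-HR experiment (lower bounds).
result = transformation_yield(1600, 0.72, 3.3)
# About 1152 pcCBH-HR transformants in total, i.e. roughly 485 total
# and 349 pcCBH-HR transformants per microgram of poly(A) RNA.
```

For a less abundant mRNA the desired fraction would be expected to fall, which is why enrichment of the mRNA or of the annealed circles is suggested for such cases.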

Although we have used this procedure to bypass introns found in a structural gene, it can be used to join any two cloned DNA segments which can be bridged by a single RNA molecule. For example, it could be applied to the cloning of intergenic regions of RNA viruses.

ACKNOWLEDGEMENTS

This work was supported by a grant from Cetus Corporation. We thank M. Innis, D. Gelfand, and S. Shoemaker for sharing sequence information prior to publication. We thank Susan Longwell for her help with the preparation of the manuscript.

REFERENCES

1. Mandels, M. (1981) Ann. Reports Ferm. Processes 5, 35-78.
2. Whittle, D.J., Kilburn, D.G., Warren, R.A.J., and Miller, R.C., Jr. (1982) Gene 17, 139-145.
3. Cornet, P., Tronik, D., Millet, J., and Aubert, J. (1983) FEMS Lett. 16, 137-141.
4. Cornet, P., Millet, J., Beguin, P., and Aubert, J. (1983) Bio/technology 1, 589-594.
5. Wilson, D.B. and Collmer, A. (1983) Bio/technology 1, 594-601.
6. Teeri, T., Salovuori, I., and Knowles, J. (1983) Bio/technology 1, 696-699.
7. Shoemaker, S., Schweickart, V., Ladner, M., Gelfand, D., Kwok, S., Myambo, K., and Innis, M. (1983) Bio/technology 1, 691-696.
8. Efstratiadis, A. and Villa-Komaroff, L. (1979) in Genetic Engineering, Setlow, J.K. and Hollaender, A., Eds. Vol. 1, p. 15-36, Plenum Press, New York.
9. Rabbitts, T.H. (1976) Nature 260, 221-225.
10. Land, H., Grez, M., Hauser, H., Lindenmaier, W., and Schutz, G. (1981) Nucl. Acids Res. 9, 2251-2266.
11. Okayama, H. and Berg, P. (1982) Molec. and Cell. Biol. 2, 161-170.
12. Heidecker, G. and Messing, J. (1983) Nucl. Acids Res. 11, 4891-4906.
13. Shoemaker, S.P., Raymond, J.C., and Bruner, R. (1981) in Trends in the Biology of Fermentations for Fuels and Chemicals, Hollaender, A. et al., Eds. Plenum Press, New York.
14. Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J., and Rutter, W.J. (1979) Biochemistry 18, 5294-5299.
15. Berger, S.L. and Birkenmeier, C.S. (1979) Biochemistry 18, 5143-5149.
16. Aviv, H. and Leder, P. (1972) Proc. Natl. Acad. Sci. USA 69, 1408-1412.
17. Wickens, M.P., Buell, G.N., and Schimke, R.T. (1978) J. Biol. Chem. 253, 2483-2495.
18. Vieira, J. and Messing, J. (1982) Gene 19, 259-268.
19. Mandel, M. and Higa, A. (1970) J. Molec. Biol. 53, 159-162.
20. Grunstein, M. and Hogness, D. (1975) Proc. Natl. Acad. Sci. USA 72, 3961-3965.
21. Rigby, P.W.J., Dieckmann, M., Rhodes, C., and Berg, P. (1977) J. Molec. Biol. 113, 237-251.
22. Langridge, J., Langridge, P., and Bergquist, P.L. (1980) Anal. Biochem. 103, 264-271.
23. DeVries, F.A.J., Collins, C.J., and Jackson, D.A. (1976) Biochim. Biophys. Acta 435, 213-227.
24. Birnboim, H.C. and Doly, J. (1979) Nucl. Acids Res. 7, 1513-1523.
25. Thomas, M., White, R.L., and Davis, R.W. (1976) Proc. Natl. Acad. Sci. USA 73, 2294-2298.
26. McDonnell, M.W., Simon, M.N., and Studier, F.W. (1977) J. Molec. Biol. 110, 119-146.
27. Miller, J.H. (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, New York.
28. Benzinger, R., Enquist, L.W., and Skalka, A. (1975) J. Virol. 15, 861-871.
29. Lomedico, P., Rosenthal, N., Efstratiadis, A., Gilbert, W., Kolodner, R., and Tizard, R. (1979) Cell 18, 545-558.
