Volume 8 Number 13 1980 Nucleic Acids Research
Sequence rearrangement and duplication of double stranded fibronectin cDNA probably occurring during cDNA synthesis by AMV reverse transcriptase and Escherichia coli DNA polymerase I
John B. Fagan, Ira Pastan and Benoit de Crombrugghe
Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20205, USA
Received 8 April 1980
ABSTRACT
Two cloned cDNAs derived from the mRNA for cell fibronectin have been sequenced, providing evidence that transcription with AMV reverse transcriptase or Escherichia coli DNA polymerase I may not always result in double stranded cDNA that is exactly homologous with its mRNA template. Instead, the sequences of these cloned cDNAs are consistent with the duplication and rearrangement of sequences during synthesis of double stranded cDNA.
INTRODUCTION
We have constructed a set of recombinant plasmids with inserts complementary to the fibronectin mRNA (1). Cell fibronectin plays important roles in cell adhesion and in maintaining normal cell morphology (2-4). Our primary purpose in constructing recombinant plasmids containing fibronectin cDNA sequences was to develop a specific hybridization probe which could be used to quantitate the changes in fibronectin mRNA levels that occur during avian sarcoma virus (ASV) transformation of chick embryo fibroblasts (CEF). This work has been reported elsewhere (1).
In examining the structures of two cloned cDNAs derived from the mRNA for cell fibronectin, we have obtained evidence that transcription catalyzed by AMV reverse transcriptase or E. coli DNA polymerase I may not always result in double stranded cDNA which is exactly homologous with its mRNA template. Instead we have obtained DNA sequence data suggesting that duplication and rearrangement of sequences occurred during the synthesis of double stranded cDNA by AMV reverse transcriptase and E. coli DNA polymerase I.
© IRL Press Limited, 1 Falconberg Court, London W1V 5FG, U.K.
MATERIALS AND METHODS
Plasmid Construction
The construction and identification of recombinant plasmids with inserts complementary to fibronectin mRNA are described elsewhere (1). In short, single stranded cDNA was transcribed from a fibronectin mRNA template, purified from CEF as reported (5). The first strand was synthesized with AMV reverse transcriptase using oligo-(dT)10 as primer. This single stranded cDNA was made double stranded with DNA polymerase I. After digestion with S1 nuclease to create flush ends, synthetic decanucleotide "linkers" carrying the sequence recognized by the restriction endonuclease Hind III (6, 7) were ligated to the double stranded cDNA with T4 DNA ligase. The plasmid pBR322 and the cDNA linker complex were each digested with Hind III to expose complementary single stranded ends. After treatment of the linearized plasmid with bacterial alkaline phosphatase to remove phosphate from the 5' ends and thereby prevent self-ligation, the cDNA linker complex was ligated to the plasmid with T4 DNA ligase. E. coli were transformed with the resultant population of recombinant plasmids, and transformants carrying plasmids with sequences complementary to fibronectin mRNA were identified as described (1).
Preparation of DNA Fragments
The cDNA inserts of the plasmids pFN200 and pFN600 were excised from the plasmids with Hind III and isolated by centrifugation of 120 µg of DNA through 5-20 percent sucrose gradients in 10 mM Tris-Cl pH 7.5, 1 mM Na-EDTA for 14 hours at 26.5 x 10^3 rpm at 4°C in a Beckman SW 27 rotor.
DNA Sequencing
The strategies for sequencing the inserts of pFN200 and pFN600 are presented in Fig. 1. The labeling of fragments with 32P by polynucleotide kinase, the preparation and electrophoretic isolation of fragments labeled at a single end, and the base specific cleavage reactions were according to Maxam and Gilbert (8). Thin sequencing gels (0.4 mm) were prepared and run as described by Sanger and Coulson (9).
RESULTS AND DISCUSSION
When the inserts of two plasmids containing fibronectin cDNAs were characterized by DNA sequence analysis, we found that they contained long inverted repeat sequences. The complete sequence of the insert from pFN200 and the partial sequence of that from pFN600 are presented in Fig. 2. The partial sequence of the insert from pFN600 has been previously reported (1).
[Restriction maps. Panel a (pFN200): R I, Hind III, Hind III, Hae III. Panel b (pFN600): Hinc II, Taq I, Hinf I, Hinf I, Alu I, Hinc II.]
Figure 1
Panel a
Sequencing strategy for the insert from pFN200. The Hind III sites define the ends of the cDNA insert. The Hae III and R I sites are in flanking pBR322 sequences. Two kinase reactions were carried out. In the first, the purified insert was kinased, the strands separated electrophoretically and both strands sequenced. In the second, the whole plasmid was digested with R I, kinased and redigested with Hae III. The labeled R I-Hae III fragment was then isolated and sequenced.
Panel b
Sequencing strategy for the insert from pFN600. Four kinase reactions were carried out. In the first, the insert was cut with Hinf I, kinased and recut with Hinc II. In the second, the initial digestion was with Alu I and the second digestion with Hinc II. In the third, the initial digestion was with Taq I and the second with Hinc II. In the fourth, the uncut insert was kinased at each end and then cut with Taq I. Bars a and b indicate the regions of the insert for which the sequence is presented in Fig. 2.
The sequence organization of these inserts and the relationships between their structures are presented in Fig. 3. The salient features of these structures are: (1) Each insert consists of three domains, a central domain of unique sequence, flanked by two identical sequences that are exact inverted repeats of each other. The sizes of the central and flanking domains are 28 and 80 base pairs, respectively, for pFN200 and ~300 and 161 base pairs, respectively, for pFN600. Thus, each insert is symmetrical and the flanking domains of each strand of these inserts are complementary and could base-pair to form stem and loop structures. (2) The leftward inverted repeat, the central domain, and 6 base pairs of the
a
5'...AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTG
TCAACTGCCCACTAAGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTAGTGGGCAGTTGA
CAAAGAGGAATTTGGTGTAATTATGATCAGTGATTATTTTTATACTGTAAGTGCCAAAGCT...3'
b
5'...AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTGTCAACTGCCCACTA
AGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTGAAGGTCAGGGGAACAAAACTGATAAATAACAGTAGGAGAT
ACTAAATCACAAACTGGTGGGGGATCAGAACGTCGAGGGGGTGGGAGAGAGTTGGAATTGAAAGGAAACCATACTATGCA
*GACTC ... ~140 BASES ... TGATGTTTAAAATATGCACAGTCCTGATTCTTTCTCCATGATCCTGTAGCTTTAGT
     +
ATCTCATACTGTTATTTATCAGTTTTGTTCCCCTGACCTTCAAGTGTTTTTAAATGGGAATAAAATGTATTGAAGACACT
TAGTGGGCAGTTGACAAAGAGGAATTTGGTGTAATTATGATCAGTGATTATTTTTATACTGTAAGTGCCAAAGCT...3'
Figure 2
The sequences of the cDNA inserts from pFN200 (panel a) and pFN600 (panel b). The inserts were excised from the plasmids by digestion with Hind III and purified by sucrose gradient centrifugation, as described (1). Sequence analysis was carried out as described in MATERIALS AND METHODS using the strategies presented in Fig. 1. The arrows above the sequence indicate the limits of the inverted repeats. * indicates the one base in the sequenced portion of the insert of pFN600 whose identity is not clear. + indicates the single base in the rightward inverted repeat which is unambiguously different from the corresponding base in the leftward repeat. Eco R I digestion was carried out in 20 mM Tris Cl pH 8.5, 2 mM MgCl2, allowing cleavage at the sequence AATT.
rightward repeat of pFN200 comprise the first 114 base pairs of the inverted repeat of pFN600. This portion of pFN200 is repeated exactly in pFN600. The remaining 47 base pairs of the inverted repeat of pFN600 are not homologous to the sequence of the pFN200 insert. Thus, the extreme ends of these two inserts are exactly homologous, while the sequence of the central region of the pFN600 insert is unrelated to that of the pFN200 insert. The
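[Diagram. pFN600: inverted repeats of 161 base pairs flanking a central domain of ~300 base pairs; each repeat is divided into segments of 114 and 47 base pairs, the 114 corresponding to 80 + 28 + 6 base pairs of pFN200. pFN200: inverted repeats of 80 base pairs flanking a central domain of 28 base pairs.]
Figure 3
Diagram depicting the relationship between the sequences of the pFN200 and pFN600 inserts. See text for description.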
insert of a third plasmid which is derived from the same region of the
fibronectin mRNA has been characterized by restriction analysis (data not
shown) and does not contain inverted repeats.
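The domain sizes quoted above can be checked directly against the sequences of Fig. 2. The following is a minimal sketch, assuming the sequences as transcribed here (extraction debris removed, and one garbled two-base reading near the right end of the pFN200 insert taken as AA); the helper names `inverted_repeat_arm` and `common_prefix_length` are illustrative and not part of the original work.

```python
# Sequences as read from Fig. 2 (assumption: extraction debris removed and
# one garbled two-base reading in panel a taken as "AA").
PFN200 = (
    "AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTG"
    "TCAACTGCCCACTAAGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTAGTGGGCAGTTGA"
    "CAAAGAGGAATTTGGTGTAATTATGATCAGTGATTATTTTTATACTGTAAGTGCCAAAGCT"
)
# Leftmost 155 bases of the pFN600 insert (Fig. 2, panel b).
PFN600_LEFT = (
    "AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTGTCAACTGCCCACTA"
    "AGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTGAAGGTCAGGGGAACAAAACTGATAAATAACAGTAGGAGAT"
)

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def inverted_repeat_arm(seq):
    """Count how many bases, working inward from both ends, pair as an
    exact inverted repeat (base i complementary to base -1-i)."""
    arm = 0
    for i in range(len(seq) // 2):
        if COMPLEMENT[seq[i]] != seq[-1 - i]:
            break
        arm += 1
    return arm


def common_prefix_length(a, b):
    """Number of leading positions at which two sequences are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


arm = inverted_repeat_arm(PFN200)               # flanking domain
loop = len(PFN200) - 2 * arm                    # central domain
shared = common_prefix_length(PFN200, PFN600_LEFT)
print(arm, loop, shared)
```

With these transcriptions the sketch gives a flanking domain of 80 bases, a central domain of 28 bases and a shared leftward segment of 114 bases (80 + 28 + 6), in agreement with the values stated in the text.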
There are three possible sources for the inverted repeats that we have observed: (1) These structures could be present in the fibronectin mRNA. (2) These structures could have been generated by sequence rearrangements and duplication during cDNA synthesis. (3) These inverted repeats could have been generated during plasmid replication. Although the sequence data do not conclusively rule out any of these possibilities, two lines of reasoning argue that these inverted repeats were most likely generated during cDNA synthesis.
First, one of the three plasmids lacks an inverted repeat structure. Thus, the presence of inverted repeats is neither a general property of the products of this region of the fibronectin mRNA nor a general property of this particular cDNA preparation. Second, if the inverted repeat sequence of pFN200 were actually present in fibronectin mRNA, then it would be expected that the homology between the pFN200 and pFN600 inserts would be oriented in such a way as to be consistent with sequential transcription from the same region of fibronectin mRNA. Instead, sequences homologous to pFN200 are found at the extreme ends of the pFN600 insert and are organized in such a way as to preclude their sequential transcription from the same region of the fibronectin mRNA from which pFN200 was transcribed. Two possibilities for generating these inserts in the absence of sequence
rearrangements and duplications during cDNA synthesis are: (1) The cDNAs for these two plasmids might have been transcribed from different fibronectin mRNA species, one species having an inverted repeat colinear with the pFN200 insert and one species having an inverted repeat colinear with the pFN600 insert. A third mRNA species might be postulated to account for the plasmid whose insert lacks an inverted repeat. (2) A single fibronectin mRNA species might contain four copies of the inverted repeat sequence common to pFN200 and pFN600. These copies would be found twice in two different orientations, once colinear with the sequence of the pFN200 insert and once colinear with that of the pFN600 insert. The alternative to these possibilities is that the inverted repeat structures of the pFN200 and pFN600 inserts were generated by duplication and rearrangement of sequences present only once in the fibronectin mRNA. We prefer this alternative, since precedents exist for such nonsequential transcription (10, 11, 12). This could occur during synthesis of the first strand by AMV reverse transcriptase, during synthesis of the second strand by E. coli DNA polymerase I, or possibly during replication of the plasmid.
One mechanism by which these inserts could be generated during the reverse transcriptase reaction is presented in Fig. 4. During the reaction, a region of fibronectin mRNA exceptionally high in A and U, such as that found in the loop domains of pFN200 and pFN600, would result in unstable base-pairing between the mRNA template and the nascent cDNA molecule. This would allow the nascent chain to dissociate from the mRNA template and fold back on itself to form a loop with a short base-paired stem as shown in Fig. 4c. The short base-paired stem could then be extended by reverse transcriptase to form the longer stem shown in Fig. 4d. The formation of fold-back structures such as these is not uncommon during the reverse transcriptase reaction (13, 14). In fact, the routine procedure for synthesizing double stranded cDNA depends on "self-priming" from loops such as these for the second strand reaction (15). Furthermore, the tendency for reverse transcriptase to "jump" from one template to another is well-documented (11), and is an essential component of the mechanism for proviral DNA synthesis. The likelihood that jumping occurred during cDNA synthesis was increased by the fact that actinomycin D was not present in these reactions. It would be predicted from the model presented in Fig. 4 that formation of the pFN200 insert would depend upon the presence in the fibronectin mRNA of short, complementary sequences flanking the 28 bases which
[Schematic, panels a-e]
a. mRNA: AAGUGU ... ACACUU ... (A)n. Reverse transcriptase. Nascent cDNA: TTCACA ... TGTGAA ... (dT)n
b. mRNA: AAGUGU ... ACACUU ... (A)n. Region high in A+U. Dissociation of nascent cDNA and loop formation
c. Fold-back with short stem ACACTT / TGTGAA ... (dT)n. Reverse transcriptase
d. Extended stem: ACACTT ... / TGTGAA ... (dT)n. Denaturation with NaOH and self priming
e. Template TTCACA ... TGTGAA ... (dT)n. DNA polymerase I. Product: AAGTGT ... ACACTT paired with TTCACA ... TGTGAA ... (dT)n
Figure 4
Model for the generation of inverted repeats during the reverse transcriptase reaction. A detailed description is presented in the text.
correspond to the central loop domain of the pFN200 insert and are AT rich. Such complementary sequences are found in exactly the predicted locations within the insert of pFN600, which we assume to accurately reflect the sequence of the corresponding region of fibronectin mRNA. These complementary sequences are underlined in Fig. 2b. Within the insert of pFN600 the 28 base-pair domain, corresponding to the loop region of the pFN200 insert, is flanked by the sequences 5' AAGTGT 3' and 5' ACACTT 3', which are exactly complementary. We have included these sequences in the model presented in Fig. 4. In the final steps of this model, the stem and loop structure of Fig. 4d would be denatured, as shown in Fig. 4e, to become the template for synthesis of double stranded cDNA having the inverted repeat structures
of pFN200 and pFN600.
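The two predictions of the model, an AT-rich loop and exactly complementary hexanucleotides flanking it, can be illustrated with a short sketch. It assumes the leftmost 155 bases of the pFN600 insert as transcribed from Fig. 2b and the 28-base central domain of pFN200 as the loop; the function names are illustrative only.

```python
# Leftmost 155 bases of the pFN600 insert, as read from Fig. 2b (assumption:
# the transcription here is taken at face value).
PFN600_LEFT = (
    "AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTGTCAACTGCCCACTA"
    "AGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTGAAGGTCAGGGGAACAAAACTGATAAATAACAGTAGGAGAT"
)
# The 28-base central (loop) domain of the pFN200 insert.
LOOP = "CTTCAATACATTTTATTCCCATTTAAAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def at_fraction(seq):
    """Fraction of bases that are A or T."""
    return (seq.count("A") + seq.count("T")) / len(seq)


# Locate the loop within pFN600 and take the six bases on either side.
start = PFN600_LEFT.find(LOOP)
left_flank = PFN600_LEFT[start - 6:start]
right_flank = PFN600_LEFT[start + len(LOOP):start + len(LOOP) + 6]

print(left_flank, right_flank, right_flank == reverse_complement(left_flank))
print(round(at_fraction(LOOP), 3))
```

With these transcriptions the flanks are AAGTGT and ACACTT, each the reverse complement of the other, and 22 of the 28 loop bases are A or T.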
Fig. 5 presents a mechanism by which these inserts could be generated during the DNA polymerase I reaction. During transcription of self-primed single stranded cDNA into double stranded cDNA by DNA polymerase, a region exceptionally high in A and T would result in unstable base-pairing and would increase the probability of slippage, as shown in Fig. 5b. Slippage would be followed by transcription from a new point on the template, as shown in Fig. 5d, resulting in a double stranded cDNA with an inverted repeat structure. As in the reverse transcriptase model presented in Fig. 4, the DNA polymerase model of Fig. 5 predicts the existence and location of the complementary hexanucleotide sequences 5' AAGTGT 3' and 5' ACACTT 3' which we have found to flank that 28 base-pair domain of the pFN600 insert
[Schematic, panels a-e]
First strand synthesis from mRNA template with reverse transcriptase
a. TTCACA ... TGTGAA ... (dT)n. Self-priming
b. ACACTT paired with TGTGAA ... (dT)n. DNA polymerase I
c. ACACTT ... paired with TGTGAA ... (dT)n. Region high in A+T. Slippage and self priming at new site
d. TTCACA ... TGTGAA ... (dT)n. DNA polymerase I
e. Product: AAGTGT ... ACACTT paired with TTCACA ... TGTGAA ... (dT)n
Figure 5
Model for the generation of inverted repeats during the DNA polymerase I reaction. A detailed description is presented in the text.
which corresponds to the central loop domain of the pFN200 insert.
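Both models lead to the same predicted product: a copied arm, the AT-rich loop, and the reverse complement of the arm. The sketch below assumes the pFN600 sequence reflects the mRNA region (as the text assumes) and models only this end product, not strand polarity or the enzymology of either reaction. The function `fold_back_product` and its parameters are hypothetical names introduced for illustration, and the sequences are those transcribed from Fig. 2 (with one garbled two-base reading in pFN200 taken as AA).

```python
# Leftmost 114 bases of the pFN600 insert: arm (80) + loop (28) + hexamer (6).
PFN600_SEGMENT = (
    "AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTGTCAACTGCCCACTA"
    "AGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTT"
)
# Complete pFN200 insert as transcribed from Fig. 2a.
PFN200 = (
    "AGCTTTGGCACTTACAGTATAAAAATAATCACTGATCATAATTACACCAAATTCCTCTTTG"
    "TCAACTGCCCACTAAGTGTCTTCAATACATTTTATTCCCATTTAAAAACACTTAGTGGGCAGTTGA"
    "CAAAGAGGAATTTGGTGTAATTATGATCAGTGATTATTTTTATACTGTAAGTGCCAAAGCT"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fold_back_product(template, arm_len, loop_len):
    """Hypothetical helper: build the stem-and-loop product expected when
    synthesis turns back after the loop and copies the arm a second time.

    The result is arm + loop + reverse complement of arm.
    """
    arm = template[:arm_len]
    loop = template[arm_len:arm_len + loop_len]
    return arm + loop + reverse_complement(arm)


predicted = fold_back_product(PFN600_SEGMENT, arm_len=80, loop_len=28)
print(predicted == PFN200)
# The duplicated arm begins with ACACTT, the hexamer that also follows the
# loop in pFN600; this is why the two inserts agree for 80 + 28 + 6 bases.
print(predicted[108:114], PFN600_SEGMENT[108:114])
```

Under these assumptions the predicted product is identical to the pFN200 insert, which is the relationship the models of Figs. 4 and 5 are meant to explain.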
O'Hare et al. (10) have reported the existence of a similar, although shorter, inverted repeat at the end of a cloned cDNA derived from ovalbumin mRNA. They suggest similar mechanisms for generating inverted repeats of this nature. These authors also point out the similarity of these mechanisms to that proposed for generation of the mini-insertion element IS2-6 (12). It is worthwhile pointing out that on the basis of this mechanism, the inverted repeats of pFN200 and pFN600 could also have been generated during replication of the recombinant plasmids.
Other investigators have observed that the error rate for transcription by reverse transcriptase is relatively high, and have suggested that this may be one reason for the high rate of mutation of avian tumor viruses to defective forms (16). Our results do not provide an accurate measure of transcriptional fidelity for reverse transcriptase. However, the high degree of homology between pFN200 and pFN600 suggests that transcription was relatively accurate for these short sequences. The predominant alteration that we have observed to occur during cDNA synthesis was not inaccurate transcription, but sequence rearrangement and duplication. The resultant cloned cDNA is, therefore, not suitable for use in determining the sequence of the mRNA template from which it was derived. It is, however, quite suitable for use as a hybridization probe for mRNA quantitation. Results of these studies are presented elsewhere (1).
REFERENCES
1. Fagan, J.B., Sobel, M.E., Yamada, K.M., de Crombrugghe, B., and Pastan, I. In preparation.
2. Yamada, K.M., Yamada, S.S., and Pastan, I. (1976) Proc. Natl. Acad. Sci. USA 73, 1217-1221
3. Willingham, M.C., Yamada, K.M., Yamada, S.S., Pouyssegur, J., and Pastan, I. (1977) Cell 10, 375-380
4. Ali, I.U., Mautner, V.M., Lanza, R., and Hynes, R.O. (1977) Cell 11, 115-126
5. Fagan, J.B., Yamada, K.M., de Crombrugghe, B., and Pastan, I. (1979) Nucleic Acids Research 6, 3471-3480
6. Scheller, R.H., Dickerson, R.E., Boyer, H.W., Riggs, A.D., and Itakura, K. (1977) Science 196, 177-180
7. Bahl, C.P., Marian, K.J., Wu, R., Stawinski, J., and Narang, S.A. (1977) Gene 1, 81
8. Maxam, A.M. and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564
9. Sanger, F. and Coulson, A.R. (1977) FEBS Lett 87, 107-110
10. O'Hare, K., Breathnach, R., Benoist, C., and Chambon, P. (1979) Nucleic Acids Research 7, 321-324
11. Gilboa, E., Mitra, S.W., Goff, S., and Baltimore, D. (1979) Cell 18, 93-100
12. Ghosal, D. and Saedler, H. (1978) Nature 275, 611-617
13. Taylor, J.M., Faras, A.J., Varmus, H.E., Levinson, W.E., and Bishop, J.M. (1972) Biochemistry 11, 2343-2351
14. Leis, J.P. and Hurwitz, J. (1972) Proc. Natl. Acad. Sci. USA 69, 2331-2335
15. Efstratiadis, A., Kafatos, F.C., Maxam, A.M., and Maniatis, T. (1976) Cell 7, 279-288
16. Gopinathan, K.P., Weymouth, L.A., Kunkel, T.A., and Loeb, L.A. (1979) Nature 278, 857-859