Secondary Structure of the 5' End of Bacteriophage MS2 RNA. Methoxyamine and Kethoxal Modification



Eur. J. Biochem. 102, 595-604 (1979)

Secondary Structure of the 5’ End of Bacteriophage MS2 RNA Methoxyamine and Kethoxal Modification

Dirk ISERENTANT and Walter FIERS

Laboratory of Molecular Biology, State University of Ghent

(Received July 3, 1979)

To refine the secondary structure model of the 5' end of the bacteriophage MS2 genome, 32P-labeled MS2 RNA was partially digested with T1 RNase or with Cm-RNase and the 5'-end fragment was isolated, renatured and submitted to treatment with methoxyamine or kethoxal. The resulting modified RNA was digested with T1 RNase and the products were separated by minifingerprinting. Methoxyamine-induced modification of exposed cytidines was detected by differential mobility of modified oligonucleotides, while kethoxal-induced alteration of exposed guanosines was monitored by resistance to T1 ribonuclease digestion. The positions of the modified residues are discussed in terms of an improved secondary structure model proposed for the 5' end of the viral RNA. The structure itself is discussed in relation to sequence conservation and biological function.

The secondary structure of RNA has been the subject of intensive study because of its important functional role in diverse biological activities [1-5]. Of all the RNAs, the structure of tRNA has been examined in greatest detail: its small size and known nucleotide sequence permitted chemical modification studies, which gave an idea of its secondary structure in solution [6-8], and by X-ray diffraction analysis of crystals a tertiary structure at 0.25-nm resolution was established [9-11]. The secondary structure of 5-S ribosomal RNA has also been the subject of detailed analysis [12,13]. Longer RNAs, however, such as the genomes of RNA viruses, have remained intractable to crystallization and could not easily be studied by chemical modification due to their complexity.

Chemical modification studies are based on selective reaction of nucleotides not involved in secondary or tertiary structure interactions; e.g. exposed cytidine residues react readily with the reagent methoxyamine [7] and exposed guanosine residues are specifically altered by kethoxal [14]. When methoxyamine-treated RNA is digested with T1 RNase, oligonucleotides containing a modified cytidine will have different mobility during separation. Kethoxal treatment renders the RNA resistant to T1 RNase cleavage at those positions where guanosine is modified; hence, altered guanosines can be identified by the different T1 digestion pattern. There is excellent agreement between the results of X-ray studies on crystallized tRNA and the results of modification studies on tRNA in solution [15], proving that the chemical methods are indeed valuable for structural investigations. The chemical approach is not directly applicable to large phage RNAs, because digestion patterns of long RNAs would be too complex to allow identification of modified residues. However, after partial degradation, fragments can be isolated that have retained secondary structure and these can be analyzed by chemical modification.

This is paper XL in our series Studies on the Bacteriophage MS2. Paper XXXIX is the preceding paper by Min Jou et al.

Abbreviations. Cm-RNase, pancreatic ribonuclease A with an ε-carboxymethyl group on lysine-41; kethoxal, β-ethoxy-α-ketobutyraldehyde; C*, methoxyamine-modified cytidine; G*, kethoxal-modified guanosine.

Enzymes (IUB Recommendations, 1978). Ribonuclease T1 (EC 3.1.27.3); pancreatic ribonuclease A (EC 3.1.27.5).

After the elucidation of the complete nucleotide sequence (3569 nucleotides) of the bacteriophage MS2 RNA genome [16-18], the secondary structure of the molecule could be examined in more detail. Models for the secondary structure were proposed which were based on experimental data (susceptibility to single-strand-specific nucleases; isolation of specific complexes) and partly on calculated estimates of thermodynamic stability. Such a structure model can be tested and refined by chemical modification of 'exposed' residues. This approach is documented in the present communication for the 5' end of MS2 RNA.

MATERIALS AND METHODS

Materials

T1 RNase was purchased from Sankyo Co. (Tokyo, Japan); pancreatic RNase A was from Sigma (St Louis, Mo., U.S.A.); kethoxal was from Serva (Heidelberg, F.R.G.); methoxyamine hydrochloride was from Eastman Organic Chemicals (Rochester, N.Y., U.S.A.) and was purified according to the procedure of Hjeds [19]. Poly(ethyleneimine) plates were obtained from Macherey-Nagel Co. (Düren, F.R.G.).

Preparation and Partial Digestion of Labeled MS2 RNA

32P-labeled MS2 RNA was prepared as described previously [20]. The labeled RNA was partially digested with T1 RNase or with Cm-RNase [23] and the fragments were separated first on a neutral polyacrylamide gel and then by two-dimensional gel electrophoresis [20,22]. The 5'-end fragment was eluted from the gel and precipitated after addition of 20 µg carrier RNA. The RNA was heated for 10 min at 55 °C in 50 µl buffer containing 5 mM MgCl2, 0.1 mM NaCl, 10 mM Tris, pH 7.4, 0.1 mM EDTA and cooled at room temperature to renature any possibly denatured fragments or regions.

Chemical Modification

Modification with methoxyamine was carried out as described by Cashmore [7] except that the reaction mixture was incubated at a lower temperature to prevent breathing. 20 µg RNA was dissolved in 50 µl methoxyamine solution containing 3 M methoxyamine, titrated to pH 5.5 with NaOH, and 10 mM MgCl2. The reaction was carried out for 24 h at 25 °C. The kethoxal modification reaction was based on the procedure described by Litt [14]. The reaction mixture (100 µl) contained 20 µg RNA, 100 mM sodium cacodylate, pH 7, 10 mM MgCl2 and 15 mM kethoxal. The reaction was run for 90 min at 25 °C and the RNA was precipitated. Kethoxal-treated RNA could be used directly after precipitation, but methoxyamine-modified RNA required an additional purification. After precipitation, the RNA was redissolved in 50 µl water, applied on DEAE-cellulose paper and electrophoresed at pH 3.5 to remove the excess reagent. The RNA (which had not moved) was eluted with 250 µl triethylamine/potassium hydroxide, pH 10, neutralized with 50 µl acetic acid and precipitated.

Fingerprinting

The oligonucleotides resulting from T1 ribonuclease digestion of modified RNA were separated by minifingerprinting on poly(ethyleneimine) thin-layer plates [23]. In each case, a T1 digestion pattern of the untreated 5'-terminal fragment was used as a reference. The composition of each modified T1 oligonucleotide was determined by digestion with pancreatic RNase A and separation of the products on a two-dimensional chromatography system on poly(ethyleneimine) plates (6.6 cm × 10 cm) as described by Volckaert and Fiers [24] (to obtain a better separation of the cytidines in methoxyamine-modified products, 25% HCOOH instead of 22% was used in the first dimension).

RESULTS

Methoxyamine Modification

The largest 5'-terminal fragment used in the methoxyamine reaction was 117 nucleotides long [20]. After modification, the RNA was digested with T1 RNase and the resulting oligonucleotides were separated by minifingerprinting involving homomixture β in the second dimension of separation [23]. The digestion pattern is shown in Fig. 1B and can be compared with the T1 digestion pattern of the untreated 5'-end fragment shown in Fig. 1A. Modified oligonucleotides run faster in both dimensions and the affected oligonucleotides can be identified on this basis. The spots seen above those containing modified residues are probably due to the minor modification form [7], as analysis shows that the oligonucleotides in a pair of associated spots have exactly the same base composition. Oligonucleotides containing modified residues were further characterized by double digestion with pancreatic RNase A and separation of the products by two-dimensional chromatography [24]. On this system the modified cytidine-containing products formed a separate series, moving in the first dimension slightly slower than the uridine series. Fig. 2 shows the separation of the double-digestion products derived from oligonucleotide 452, C-U-C*-A-A-C*-U-U-C-C-U-G (where C* indicates a modified cytidine). It should be noted that if there were two or more identical cytidine-containing double-digestion products, we would not be able to determine the exact position of the modified residue, because the activity of the modified products would be too low to allow partial digestion. The composition of all the T1 oligonucleotides in Fig. 1B and the identification of modified bases are summarized in Table 1; the location of the modified bases in the RNA fragment is shown in Fig. 5.
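The double digestion described above can be illustrated with a minimal sketch. It takes the sequence of oligonucleotide 452 from Table 1 (C-U-C-A-A-C-U-U-C-C-U-G) and assumes that pancreatic RNase A still cleaves after a methoxyamine-modified cytidine, which the modified products listed for 452 suggest but which is an assumption here. The function names are hypothetical and chosen for illustration only.

```python
from collections import Counter


def rnase_a_digest(residues):
    """Split a list of residues after every pyrimidine (C, C*, U).

    Assumption: a modified cytidine (C*) is still a cleavage site.
    """
    products, current = [], []
    for r in residues:
        current.append(r)
        if r.rstrip("*") in ("C", "U"):
            products.append("-".join(current))
            current = []
    if current:  # 3'-terminal remainder, e.g. the G left by T1 RNase
        products.append("-".join(current))
    return products


def ambiguous_cytidine_products(residues):
    """Cytidine-containing products that occur more than once in the
    digest of the unmodified oligonucleotide. A modified copy of such a
    product cannot be assigned to one position by composition alone."""
    unmodified = [r.rstrip("*") for r in residues]
    counts = Counter(p for p in rnase_a_digest(unmodified) if "C" in p)
    return {p: n for p, n in counts.items() if n > 1}


oligo_452 = "C-U-C*-A-A-C*-U-U-C-C-U-G".split("-")
products = rnase_a_digest(oligo_452)
print(products)
print(ambiguous_cytidine_products(oligo_452))
```

In this sketch the product A-A-C* is unique, so its modified cytidine can be placed unambiguously, whereas free C arises from several positions, which is the situation the caveat in the text refers to.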

The high degree of modification of oligonucleotides 351 (residues 8-17) and 221 (residues 84-89) may be explained by the occurrence of C16 and C88, respectively, within a hairpin loop (Fig. 5). Also, the 3'-terminal oligonucleotide, 544 (residues 104-117), is easily modified. This is probably due to the modification of C114. However, because the end of the stem occurs next to a loop, it is possible that the secondary structure at base pair G74 · C113 is breathing, which would result in the modification of C113. There are two cytidines, C28 and C31, in oligonucleotide 452 (residues 26-37) which are not involved in secondary structure interactions but which are modified only to a limited extent. According to the model proposed in Fig. 5, these residues would be tucked in at a bend in the secondary structure and perhaps would not be readily accessible to methoxyamine; alternatively, they may be partially involved in tertiary structure interactions. In shorter fragments where the same oligonucleotide 452 occurs in a single-stranded region, the modification is higher, viz. 63%. The lower yield of oligonucleotide 232 (residues 75-82) in the distribution shown in Table 1 is probably due to cyclization of the terminal phosphate: no modified spot corresponding to 232 was found. Furthermore, in other modification experiments on RNA fragments of the same length, 232 did appear in molar quantities. However, oligonucleotide 232 is extensively modified when an RNA fragment containing 89 nucleotides (or fewer) is treated; according to the model (Fig. 5), it would indeed occur in a single-stranded region in such fragments. Oligonucleotides 212 (residues 43-48) and 221 (residues 84-89) run together on the separation system, so that the decreased recovery could solely be due to modification of 221. Analysis of the modified product showed indeed that it corresponded only to 221. The modified form of 020 (residues 100-102) was not found; it probably reached the wick during separation and hence could not be determined.

In most experiments, C49 and C62 were not (or only slightly) accessible to modification. According to the rules of Tinoco et al. [25] for determining secondary structure by free-energy considerations, these residues should occur within an interior loop between bases 47-49 and 61-63. However, it seems that the base pair G48 · C62 is stable and that the destabilizing effect of the U47 ○ U63 and C49 ○ U61 combinations is negligible (the symbol ○ between two bases indicates that they occur opposite each other in the helix but that according to current ideas about base pair interactions the union does not add to the stability of the structure, i.e. ΔG = 0). This situation is analogous to the zero free energy of a C ○ C combination flanked by two G · C base pairs [26]. Alternatively, the residues in this interior loop could be unavailable because of their involvement in tertiary structure interactions. These results are in good accord with the partial digestion data for the 5'-terminal fragment, which indicate that this region is relatively inaccessible to nuclease digestion [20].

Fig. 1. T1 RNase digestion pattern of the unmodified (A) and methoxyamine-modified (B) 5'-end fragment (n = 117) of MS2 RNA. 32P-labeled MS2 RNA was partially digested with T1 RNase and the 5'-end fragment was isolated and treated with methoxyamine as described in the text. The modified RNA was digested with T1 RNase [20] and the oligonucleotides were separated on poly(ethyleneimine) thin-layer plates [23]. Modified products are indicated with an asterisk and cyclic products are marked with an exclamation point. The nucleotide composition of each oligonucleotide was determined as described in the legend to Fig. 2; these results are summarized in Table 1.

Fig. 2. Double-digestion analysis of modified oligonucleotide 452 from Fig. 1B. Modified 452 was digested with pancreatic RNase A and the products were separated on a poly(ethyleneimine) thin-layer plate (6.6 cm × 10 cm) in two dimensions [24]; the eluent in the first dimension was 25% formic acid and the eluent in the second dimension was 1 M formic acid adjusted to pH 4.3 with pyridine. The products are identified in the right panel; modified products are denoted with an asterisk. Owing to streaking in the second dimension, probably caused by the two modification forms [7], C* is difficult to see, but its place on the homomixture β fingerprint proves undoubtedly that 452 is doubly modified. The weak spot below C is an A-C contamination, presumably due to some streaking of oligonucleotide 351 C*! (see Fig. 1B).

Table 1. Quantification and identification of oligonucleotides resulting from T1 RNase digestion of the methoxyamine-modified 5'-terminal fragment (n = 117) of MS2 RNA. Each oligonucleotide corresponds to a spot in Fig. 1B. Our standard oligonucleotide nomenclature is according to a code of three digits, indicating the number of uridine, cytidine, and adenosine residues, respectively [20]. Modified cytidines are marked with an asterisk. The percentage modification was calculated on the basis of decreased recovery of the oligonucleotide relative to the control (calculation on the basis of the modified product was not feasible, because it occurs in more than one spot [7]).

Oligonucleotide (a)   Composition                    Amount (mol)                              Modification (%)
                                                     calculated   control   after modification
000                   G                              10           9         10.5               -
001                   A-G                            2            2.5       3.2                -
010                   C-G                            1            1.3       1.3                0
011                   A-C-G                          1            1         1                  0
020                   C-C-G                          1            1.2       1                  17
100                   U-G                            1            1         1.4                -
101                   U-A-G                          2            2.4       2.2                -
110                   U-C-G, C-U-G                   2            2.3       2.0                13 (b)
212                   C-U-A-A-U-G                    1            1.1       0.9                18
220                   U-C-C-U-G                      1            0.8       0.8                0
221                   C-U-A-U-C-G                    1            1.5       0.5                67
221* (C*)                                                                   0.7
232                   C-U-A-C-C-A-U-G                1            1         0.6                40 (c)
351                   A-C-C-C-C-U-U-U-C-G            1            0.4       0.1                76
351* (C*)                                                                   0.9
411                   U-C-U-U-U-A-G                  1            1.1       1.2                0
452                   C-U-C-A-A-C-U-U-C-C-U-G        1            0.8       0.5                38
452* (C*, A2C*)                                                             0.3
544                   A-A-U-U-C-C-A-U-U-C-C-U-A-G    1            1.1       0.3                73
544* (C*)                                                                   0.3
623                   C-C-A-U-U-U-U-U-A-A-U-G        1            1         1                  0
623* (C*)                                                                   0.2

(a) The modified product in the double digest is also indicated.
(b) Modification percentages below 15% were considered too low to be reliable.
(c) Oligonucleotide 232 was not modified; the recovery of 232 was low because part of it had remained in the cyclic form.
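The percentage modification reported in Table 1 follows from the decreased recovery of each oligonucleotide relative to the control. The sketch below restates that arithmetic under stated assumptions: the function name is hypothetical, the 15% reliability threshold is taken from the table footnote, and the example amounts are the values as read from Table 1.

```python
def percent_modification(control, after, threshold=15.0):
    """Percentage modification from decreased recovery.

    control, after: molar amounts of the unmodified oligonucleotide in
    the control digest and after modification.
    Returns (rounded percentage, whether it reaches the threshold).
    """
    if control <= 0:
        raise ValueError("control amount must be positive")
    # An increased recovery is treated as no modification.
    pct = max(0.0, (control - after) / control * 100.0)
    return round(pct), pct >= threshold


# Amounts (control, after modification) as read from Table 1.
examples = {"221": (1.5, 0.5), "544": (1.1, 0.3), "110": (2.3, 2.0), "411": (1.1, 1.2)}
for code, (control, after) in examples.items():
    print(code, percent_modification(control, after))
```

With these inputs, 221 and 544 come out as extensively modified, while the value for 110 falls below the threshold and 411 shows no loss at all.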

Kethoxal Modification

The 5'-terminal fragment available for kethoxal modification was 82 nucleotides long. The longer fragment (117 nucleotides) was not used in these experiments as it was a rather rare product under the applied conditions of partial ribonuclease digestion of total viral RNA, and, furthermore, because the additional information expected from kethoxal modification of the longer segment was limited.

Ribonuclease T1 does not cleave after kethoxal-modified guanosine residues, so that in a minifingerprint of T1-digested kethoxal-treated RNA, new spots appear which are formed by linkage of two or more T1 oligonucleotides. These new products migrate as if the modified guanosine was replaced by a uridine. The T1 minifingerprints of the unmodified and modified RNA fragment are compared in Fig. 3. Analysis of oligonucleotides and identification of modified residues are summarized in Table 2. Modified spots were characterized by two-dimensional chromatography on poly(ethyleneimine) thin-layer plates (10 cm × 10 cm). Demodification occurs when these spots are eluted from the minifingerprint thin-layer plate with triethylamine bicarbonate, pH 10. The mobility of the cytidine, uridine and unmodified guanosine series in this system is comparable to their mobility on double-digestion plates. The odd spot which moves differently is the product containing the previously modified guanosine. Since all these spots behave as in a normal pancreatic digest, their composition could be deduced from their position on the plate. Fig. 4 shows the analysis of the modified oligonucleotide 011*-232 from Fig. 3B.

The occurrence of fusion oligonucleotide 220*-452 (Fig. 3B and Table 2) indicates that G25 is altered; this result rules out an alternative secondary structure involving U4 · G25 and G5 · U24 base pairs. According to the molarity of oligonucleotides 110 and 001 found in the distribution, G40 probably also is modified, although the new spot corresponding to the putative fusion product 110*-001 was not observed in the digestion pattern; this product is expected to migrate to the same position as oligonucleotide 220 and therefore might have been masked. In some experiments we found a spot containing the fusion product 220*-452*-110*-001. These links can be explained by breathing of the nucleotides forming the base of the stem, which exposes residue G37 to modification. It seems therefore very likely that residue G25 is indeed modified. Whether G69 is modified is not certain, but a modified G69 would give rise to the fusion oligonucleotide 010*-001, which would constitute the segment opposite 110*-001 in the interior loop. 010*-001 should migrate as C-U-A-G, an oligonucleotide that was present in the digestion pattern as an impurity. If G69 is modified, the reaction rate is too low to permit identification of the linked product. Residue G74 occurs in the kethoxal-treated fragment in a single-stranded region and is therefore modified. The mobility shift exhibited by oligonucleotide 351 (residues 8-17) is undoubtedly due to cyclization of this oligonucleotide; in other experiments on RNA fragments of comparable length, modification of 351 was not observed. Possible linkages of pppG could not be traced, because it was not possible to detect pppG in our system. According to the model, we expected a fused product pppG-G-G-U-G after kethoxal treatment; however, such a product would run off the plate during separation in the first dimension. As oligonucleotides 212 (residues 43-48), 623 (residues 49-60) and 411 (residues 61-67) were not found to be linked, G48 and G60 are not modified. This result is in accordance with the lack of methoxyamine modification of C49 and C62, as discussed above.

Fig. 3. T1 RNase digestion products of the unmodified (A) and kethoxal-modified (B) 5'-end fragment (n = 82) of MS2 RNA. The 32P-labeled 5'-end fragment was isolated, treated with kethoxal, and digested with T1 RNase. The digestion products were separated as described in the legend to Fig. 1. Fused oligonucleotides, indicating kethoxal modification, are marked with an asterisk. Cyclic products are denoted with an exclamation point. The composition of each oligonucleotide was determined as described in the legend to Fig. 4; the quantitative results are summarized in Table 2.

Table 2. Quantification and identification of T1 oligonucleotides of the kethoxal-modified 5'-end fragment (n = 82) of MS2 RNA. Each oligonucleotide corresponds to a spot in Fig. 3B. Linked oligonucleotides, indicating guanosine modification, are marked with an asterisk. The percentage modification is calculated on the basis of the recovery of each oligonucleotide relative to the control.

Oligonucleotide   Composition                           Amount (mol)                              Modification (%)
                                                        calculated   control   after modification
000               G                                     I 4 1 5 9 0
001               A-G                                   2            2.5       2.2                12
010               C-G                                   1            1.6       1.2                25
011               A-C-G                                 1            1.1       0.9                18
100               U-G                                   1            1.3       0.9                31
110               U-C-G                                 1            1.7       1.0                41
212               C-U-A-A-U-G                           1            1.1       1.4                0
220               U-C-C-U-G                             1            0.9       0.7                22
232               C-U-A-C-C-A-U-G                       1            0.9       0.7                22
351               A-C-C-C-C-U-U-U-C-G                   1            0.8       0.5                38
411               U-C-U-U-U-A-G                         1            1         1.3                0
452               C-U-C-A-A-C-U-U-C-C-U-G               1            1         0.7                30
623               C-C-A-U-U-U-U-U-A-A-U-G               1            1         1.1                0
011*-232          A-C-G*-C-U-A-C-C-A-U-G                                       0.2
220*-452          U-C-C-U-G*-C-U-C-A-A-C-U-U-C-C-U-G                           0.2
                  A-C-G! + G                                                   0.4

Fig. 4. Double-digestion analysis of kethoxal-modified T1 product 011*-232 from Fig. 3B. 011*-232 was digested with pancreatic RNase A and the products were separated as described previously [24]. The eluent in the first dimension of the two-dimensional separation system was 22% formic acid plus 5 M urea and the eluent in the second dimension was a buffer adjusted to pH 4.3 with pyridine and containing 1.1 M formic acid and 4 M urea. The products are identified on the right panel.
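The reasoning used above to read kethoxal fingerprints can be sketched in a few lines: T1 RNase cleaves after every unmodified guanosine, a modified guanosine (G*) blocks cleavage and so fuses neighbouring oligonucleotides, and a fused product is expected to migrate as if G* were a uridine. The sketch is illustrative only. All function names are hypothetical, and the test sequence (residues 21-42) is an assumption assembled from the compositions and residue ranges of oligonucleotides 220, 452, 110 and 001 given in the text and in Table 2.

```python
def uca_code(residues):
    """Three-digit oligonucleotide code: numbers of U, C and A residues."""
    bases = [r.rstrip("*") for r in residues]
    return f"{bases.count('U')}{bases.count('C')}{bases.count('A')}"


def t1_digest(residues):
    """Cleave after every unmodified G; a modified 'G*' blocks cleavage."""
    fragments, current = [], []
    for r in residues:
        current.append(r)
        if r == "G":
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


def fused_name(fragment):
    """Name a digestion product as linked T1 oligonucleotides, e.g. '220*-452'."""
    parts, current = [], []
    for r in fragment:
        current.append(r)
        if r in ("G", "G*"):
            parts.append(uca_code(current) + ("*" if r == "G*" else ""))
            current = []
    if current:
        parts.append(uca_code(current))
    return "-".join(parts)


def migrates_as(fragment):
    """Apparent sequence on the fingerprint: G* behaves like a uridine."""
    return "-".join("U" if r == "G*" else r for r in fragment)


# Residues 21-42 (assumed from Table 2): 220, 452, 110, 001 in order.
segment = "U-C-C-U-G-C-U-C-A-A-C-U-U-C-C-U-G-U-C-G-A-G".split("-")
segment[4] = "G*"  # G25 modified by kethoxal
names = [fused_name(f) for f in t1_digest(segment)]
print(names)
print(migrates_as("C-G*-A-G".split("-")))  # putative 010*-001
```

Marking G25 as modified reproduces the fusion product 220*-452 alongside free 110 and 001, and the putative 010*-001 product is predicted to run as C-U-A-G, in line with the text.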

On the basis of these results and their agreement with the cleaving points by single-strand-specific nucleases in the partial digestions, we conclude that the proposed model for the secondary structure of the 5' end of MS2 RNA is largely correct, not only for the isolated 5'-end fragment but also for this region in the complete RNA.

DISCUSSION

Improved Secondary Structure Model for the 5' End of MS2 RNA

In previous publications [17,20], tentative models for the secondary structure of the MS2 RNA 5'-terminal fragment were proposed which were based on T1 and Cm-ribonuclease cleavage points and estimates of thermodynamic stability [25]. Now chemical modification studies on isolated fragments have made it possible to test and to improve on these models.

Fig. 5. Proposed secondary structure of the 5' end of the bacteriophage MS2 RNA genome. The nucleotide sequence is taken from De Wachter et al. [20]. A broken circle around a cytidine indicates partial reactivity (< 50%) with methoxyamine and unbroken circles indicate extensive reactivity (> 50%). Guanosines which react with kethoxal are boxed. G74 modification is not indicated, because it was observed in the fragment of 82 nucleotides, where it occurs in a single-stranded region. Solid arrows are directed at T1 RNase cutting points under conditions of partial digestion [20]; dashed arrows point to sites cleaved by Cm-RNase [21]. The number of feathers on each arrow indicates an increasing susceptibility of the site to cleavage [18]. The symbol ○ between two bases indicates that the union does not add to the stability of the secondary structure (i.e. ΔG = 0) (cf. text).

It is reasonable to assume that the secondary structure in a fragment will be largely the same as the corresponding domain in the macromolecule (local hairpin formation is strongly favored over inter- action with a faraway segment). Of course, in the total viral RNA additional tertiary interactions may be present. It may a priori seem more logical to treat the whole viral RNA with a modification reagent, but this is not technically feasible. Indeed, upon partial enzymatic digestion of MS2 RNA, many hundreds of fragments are generated, and painstaking techniques have been worked out to resolve these. Upon partial chemical modification, the composition of the enzymatic digest would be not only different but also far more complex.

The improved secondary structure model based on the chemical modification data on the isolated 5'-terminal fragment is shown in Fig. 5. The fact that it is in complete agreement with the partial enzymatic digestion data for total MS2 RNA [17] further validates our experimental approach. The results confirm the presence of helices A and B (Fig. 5), although the presence of the kethoxal fusion oligonucleotide 220*-452 indicates that the guanosine residue in oligonucleotide 220 (residues 21-25) is situated in a bulge and is not protected by a G · U interaction as proposed earlier [17,20]. In major contrast to the structure proposed by De Wachter et al. [20], the tRNA-like extra loop just beyond helix B no longer appears in our present model (Fig. 5). This follows from our finding that oligonucleotide 232 (residues 75-82), which is situated within this extra loop, is not accessible to the methoxyamine reagent. In addition, oligonucleotide 221 (residues 84-89), protected by secondary structure in the previous model, shows a rather high degree of methoxyamine modification, indicating that it is situated within an interior loop. These observations led to the present model with one large helix from residue 74 to residue 113. Indeed, the new model provides an explanation for methoxyamine modification of oligonucleotide 020 (residues 100-102) and also accounts for the absence of the kethoxal fusion oligonucleotides 011*-232 (residues 72-82) and 221*-110*-101 (residues 84-95), which by the previous model would be expected to occur. As mentioned above, our improved model based on the chemical modification data remains in good agreement with the RNase cleavage data obtained not only for the isolated fragment [20], but also with the T1 and Cm-RNase cleavage points in the 5'-end fragment obtained by ribonuclease treatment of the total RNA ([2]; cf. Fig. 5). This confirms our assumption that the structure of the isolated 5'-end fragment is essentially the same as the conformation of this region in the total structure of the viral RNA. Moreover, all the improvements in structural folding indicated by the present study result in a higher thermodynamic stability. We can therefore conclude that chemical modification is indeed a valid and useful tool for secondary structure analysis.

Fig. 6. Nucleotide sequence of the 5'-end region of bacteriophage Qβ RNA and MS2 RNA. The Qβ sequence was reported by Weissmann et al. [29]. Homology between the sequences is indicated in boxes. If the position of a 'mutation' in the MS2 sequence was uncertain, it was added to the longest homologous sequence. Sequence homology in the regions spanning nucleotides 1-21 and nucleotides 100-121 of Qβ was first reported by Adams and Cory [30].

Fig. 7. Secondary structure model for sequences in the 5'-terminal region of Qβ RNA. There is a high degree of sequence homology (Fig. 6) and secondary structure homology (Fig. 5) between the 5'-terminal regions of MS2 RNA and Qβ RNA; structures (A), (B) and (C) can be compared with the three helices shown in Fig. 5. The boxed sequences occur in both models in single-stranded regions. Structure (A) has already been proposed by Billeter et al. [35].

Conservation of Structure; Comparison with Qβ RNA

As far as the untranslated 5'-terminal nucleotide sequence is known in all three cases, the genomes of bacteriophages MS2, R17 and f2 have an identical 5'-end primary structure [28], resulting in the same secondary structure model. Moreover, comparison of the 5'-terminal sequence of these group I phage genomes with that of bacteriophage Qβ [29] reveals clusters of sequence homology [30] separated by long, unique insertions (Fig. 6). All these insertions are situated either at the base (between residues 29-30, 37-38 and 67-68) or at the hairpin loop (between residues 85-86 and 89-90) of helices in our model (Fig. 5). Even the shorter insertions in helices A and B (Fig. 5) are mostly located in regions of weak secondary structure. The structure models that can be constructed with the homologous sequences in Qβ RNA (Fig. 7) are comparable to the MS2 helices, notwithstanding the mutations and insertions. This is especially true for the hairpin structure shown in Fig. 7A, which is nearly identical to the first MS2 hairpin (Fig. 5A). The other helices (Fig. 7B and C), while still similar to the corresponding structures in MS2 RNA, show more differences. In the Qβ model shown in Fig. 7B, the structure is shifted three bases in comparison to the MS2 helix but the U-U-A sequence at the hairpin loop is retained as is the U ○ U interaction in the top part of the helix. The structure in Fig. 7C is shifted over one base and the hairpin loop is replaced by an interior loop containing the same A-U-C sequence. From these structure comparisons we can conclude that not only is there a clearly recognizable relationship between secondary structure and sequence conservation, but that, in the case of the 5'-terminal region, maintenance of secondary structure seems to be an important factor in sequence conservation. When sequence changes are necessary (as is the case in helices 7B and 7C for the shift from non-coding to coding function) they occur in regions where they have little effect on the secondary structure. The cumulative effect of multiple sequence changes can result in rearrangement of the secondary structure, but the general characteristics of the individual helices are conserved. It may be noted that point mutations between MS2, R17 and f2 RNA are also restricted by the secondary structure [28].

Biological Function of the Secondary Structure

It is clear that the strong resistance to variation in the 5'-terminal structure of group I phage RNA implies an important biological role. One possibility is the recognition of the initiation signal of the A protein cistron. Indeed, the ribosome binding site starts at C113 [31], which is situated at the end of the proposed structure model. A long-distance interaction between residues 114-127 and a region of the A protein gene has been proposed [5]. The 5'-terminal secondary structure forms almost instantaneously during synthesis of the RNA by the replicase [32,33]. In contrast, the ribosome binding site remains single-stranded until the RNA chain is at least 885 nucleotides long. This situation is analogous to the one in Qβ [33]. In this model the A protein is readily translated on nascent chains, as suggested by Robertson [34], using the 5' end as a 'signal structure' for ribosome binding. Once the interaction between the ribosome binding site and the distal region in the A protein gene is formed, initiation of translation can only occur when the structure is breathing. This can happen without destroying the 5'-terminal structure and without affecting the ribosome binding signal.

The secondary structure is also important for recognition by the replicase; indeed, the same model can be constructed for the 3' end of the minus (complementary) strand. This structure is quite different from the tRNA-like conformation at the 3' end of the viral RNA [5], which suggests that the 3'-terminal structures may play a role in the relative distribution of plus and minus strands. The very stable first helix at the 5' end or its equivalent in the negative strand may have a role in the initiation of RNA-dependent RNA synthesis. The biological function of helices B and C (Fig. 5) in regard to their conservation in Qβ is not clear. They could, for example, be important in folding the molecule into the exact tertiary structure or have a function in the encapsidation of the RNA.
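The remark that the same model can be constructed for the 3' end of the minus strand follows from base-pair complementarity: the reverse complement of a Watson-Crick hairpin is again a hairpin with a stem of equal length. The minimal Python sketch below illustrates this reasoning. The 16-nucleotide sequence and the helper names are hypothetical and chosen only for illustration; they are not taken from the MS2 sequence. The equivalence is exact only for Watson-Crick pairs, since a G·U pair in the plus strand becomes a non-pairing C·A opposition in the minus strand.

```python
# Toy illustration only: TOY_PLUS is a hypothetical hairpin, not MS2 RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# 5' arm GGGUGC, loop UUCG, 3' arm GCACCC (written 5' to 3').
TOY_PLUS = "GGGUGCUUCGGCACCC"


def reverse_complement(rna: str) -> str:
    """Return the complementary (minus) strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def stem_length(rna: str, loop_size: int) -> int:
    """Count consecutive Watson-Crick pairs from the ends inward.

    The sequence is treated as a single hairpin whose central
    `loop_size` bases are left unpaired.
    """
    max_pairs = (len(rna) - loop_size) // 2
    pairs = 0
    for i in range(max_pairs):
        if (rna[i], rna[-1 - i]) in WATSON_CRICK:
            pairs += 1
        else:
            break
    return pairs


if __name__ == "__main__":
    minus = reverse_complement(TOY_PLUS)
    print("plus  strand:", TOY_PLUS, "stem pairs:", stem_length(TOY_PLUS, 4))
    print("minus strand:", minus, "stem pairs:", stem_length(minus, 4))
```

In this toy case the hairpin at the 5' end of the plus strand and its counterpart at the 3' end of the minus strand have the same stem length; only the loop sequence differs, being replaced by its reverse complement.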

This research was supported by a grant from the Fonds voor Kollektief Fundamenteel Onderzoek and by the Geconcerteerde Akties of the Belgian Ministry of Science. One of us (D.I.) thanks the I.W.O.N.L. for a fellowship.

REFERENCES

1. Clark, B. F. C. (1975) Biochem. Soc. Trans. 3, 645-649.
2. Fox, G. E. & Woese, C. R. (1975) J. Mol. Evol. 6, 61-76.
3. Bobst, A. M., Bobst, E. W. & Philips, D. J. (1976) J. Gen. Virol. 32, 177-188.
4. Wong, D. & Paranchych, W. (1976) Virology 73, 476-488.
5. Fiers, W. (1979) in Comprehensive Virology (Fraenkel-Conrat, H. & Wagner, R., eds) vol. 13, pp. 69-204, Plenum, New York.
6. Mizutani, T. (1971) J. Biochem. (Tokyo) 69, 641-650.
7. Cashmore, A. R., Brown, D. M. & Smith, J. D. (1971) J. Mol. Biol. 59, 359-373.
8. Chang, S. E., Cashmore, A. R. & Brown, D. M. (1972) J. Mol. Biol. 68, 455-464.
9. Kim, S. H., Quigley, G., Suddath, F. L. & Rich, A. (1971) Proc. Natl Acad. Sci. U.S.A. 68, 841-845.
10. Ladner, J. E., Finch, J. T., Klug, A. & Clark, B. F. C. (1972) J. Mol. Biol. 72, 99-101.
11. Ladner, J. E., Jack, A., Robertus, J. D., Brown, R. S., Rhodes, D., Clark, B. F. C. & Klug, A. (1975) Proc. Natl Acad. Sci. U.S.A. 72, 4414-4418.
12. Bellemare, G., Jordan, B. R. & Monier, R. (1972) Biochimie (Paris) 54, 1453-1466.
13. Aubert, M., Bellemare, G. & Monier, R. (1973) Biochimie (Paris) 55, 135-142.
14. Litt, M. (1971) Biochemistry 10, 2223-2227.
15. Robertus, J. D., Ladner, J. E., Finch, J. T., Rhodes, D., Brown, R. S., Clark, B. F. C. & Klug, A. (1974) Nucleic Acids Res. 1, 927-932.
16. Min Jou, W., Haegeman, G., Ysebaert, M. & Fiers, W. (1972) Nature (Lond.) 237, 82-88.
17. Fiers, W., Contreras, R., Duerinck, F., Haegeman, G., Merregaert, J., Min Jou, W., Raeymaekers, A., Volckaert, G., Ysebaert, M., Vandekerckhove, J., Nolf, F. & Van Montagu, M. (1975) Nature (Lond.) 256, 273-278.
18. Fiers, W., Contreras, R., Duerinck, F., Haegeman, G., Iserentant, D., Merregaert, J., Min Jou, W., Molemans, F., Raeymaekers, A., Vandenberghe, A., Volckaert, G. & Ysebaert, M. (1976) Nature (Lond.) 260, 500-507.
19. Hjeds, H. (1965) Acta Chem. Scand. 19, 1764-1765.
20. De Wachter, R., Merregaert, J., Vandenberghe, A., Contreras, R. & Fiers, W. (1971) Eur. J. Biochem. 22, 400-414.
21. Contreras, R. & Fiers, W. (1971) FEBS Lett. 16, 281-283.
22. De Wachter, R. & Fiers, W. (1972) Anal. Biochem. 49, 184-197.
23. Volckaert, G., Min Jou, W. & Fiers, W. (1976) Anal. Biochem. 72, 433-446.
24. Volckaert, G. & Fiers, W. (1977) Anal. Biochem. 83, 228-239.
25. Tinoco, I., Borer, P. N., Dengler, B., Levine, M. D., Uhlenbeck, O. C., Crothers, D. M. & Gralla, J. (1973) Nat. New Biol. 246, 40-41.
26. Gralla, J. & Crothers, D. M. (1973) J. Mol. Biol. 78, 301-319.
27. Reference deleted.
28. Min Jou, W. & Fiers, W. (1976) J. Mol. Biol. 106, 1047-1060.
29. Weissmann, C., Billeter, M. A., Goodman, H. M., Hindley, J. & Weber, H. (1973) Annu. Rev. Biochem. 42, 303-328.
30. Adams, J. M. & Cory, S. (1970) Nature (Lond.) 227, 570-574.
31. Argetsinger-Steitz, J. (1969) Nature (Lond.) 224, 957-964.
32. Weissmann, C., Feix, G. & Slor, H. (1968) Cold Spring Harbor Symp. Quant. Biol. 33, 83-100.
33. Staples, D. H., Hindley, J., Billeter, M. A. & Weissmann, C. (1971) Nat. New Biol. 234, 202-204.
34. Robertson, H. D. (1975) in RNA Phages (Zinder, N., ed.) pp. 113-146, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
35. Billeter, M. A., Dahlberg, J. E., Goodman, H. M., Hindley, J. & Weissmann, C. (1969) Nature (Lond.) 224, 1083-1086.

D. Iserentant and W. Fiers, Laboratorium voor Moleculaire Biologie, Fakulteit der Wetenschappen, Rijksuniversiteit te Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium