Crystal Structure of the Moloney Murine Leukemia Virus RNase H Domain



JOURNAL OF VIROLOGY, Sept. 2006, p. 8379–8389 Vol. 80, No. 17
0022-538X/06/$08.00+0 doi:10.1128/JVI.00750-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Crystal Structure of the Moloney Murine Leukemia Virus RNase H Domain

David Lim,1 G. Glenn Gregorio,2,3 Craig Bingman,2,3 Erik Martinez-Hackert,3 Wayne A. Hendrickson,2,3 and Stephen P. Goff2,3*

Integrated Program in Cellular, Molecular and Biophysical Studies,1 Howard Hughes Medical Institute,2 and Department of Biochemistry and Molecular Biophysics,3 College of Physicians and Surgeons, Columbia University, New York, New York 10032

Received 12 April 2006/Accepted 17 May 2006

A crystallographic study of the Moloney murine leukemia virus (Mo-MLV) RNase H domain was performed to provide information about its structure and mechanism of action. These efforts resulted in the crystallization of a mutant Mo-MLV RNase H lacking the putative helix C (ΔC). The 1.6-Å resolution structure resembles the known structures of the human immunodeficiency virus type 1 (HIV-1) and Escherichia coli RNase H. The structure revealed the coordination of a magnesium ion within the catalytic core comprised of the highly conserved acidic residues D524, E562, and D583. Surface charge mapping of the Mo-MLV structure revealed a high density of basic charges on one side of the enzyme. Using a model of the Mo-MLV structure superimposed upon a structure of HIV-1 reverse transcriptase bound to an RNA/DNA hybrid substrate, Mo-MLV RNase H secondary structures and individual amino acids were examined for their potential roles in binding substrate. Identified regions included Mo-MLV RNase H β1-β2, αA, and αB and residues from αB to αD and its following loop. Most of the identified substrate-binding residues corresponded with residues directly binding nucleotides in an RNase H from Bacillus halodurans as observed in a cocrystal structure with RNA/DNA. Finally, superimposition of RNases H of Mo-MLV, E. coli, and HIV-1 revealed that a loop of the HIV-1 connection domain resides within the same region of the Mo-MLV and E. coli C-helix. The HIV-1 connection domain may serve to recognize and bind the RNA/DNA substrate major groove.

Reverse transcriptase (RT) of retroviruses synthesizes a double-stranded DNA copy of the single-stranded viral RNA genome (2, 49). RT contains two enzymatic domains: a DNA polymerase domain that can use either RNA or DNA as a template and a ribonuclease H (RNase H) domain that is required for degradation of genomic RNA in RNA/DNA hybrids. The RT of the Moloney murine leukemia virus (Mo-MuLV) is a monomeric enzyme, and the isolated domains can be expressed separately with retention of their respective activities. Mutations that disrupt the functions of either domain render the virus incapable of replication (45, 48). RNase H plays critical roles at several stages of reverse transcription. During the course of reverse transcription, the minus-strand DNA is primed by a host tRNA annealed to the primer binding site, while the plus-strand DNA is primed by the polypurine tract (PPT), a fragment of the genome. RNase H is required for removal of the tRNA primer, for formation and removal of the PPT, and for removal of the RNA genome to allow plus-strand DNA synthesis. RNase H action is also required for DNA translocations of both minus- and plus-strand DNA intermediates.

The structures of several RNases H have been determined, including those from Escherichia coli, human immunodeficiency virus type 1 (HIV-1; both alone as a subdomain and also in the context of the RT holoenzyme), Thermus thermophilus HB8, archaeal RNase HII, and, more recently, Bacillus halodurans RNase H bound to an RNA/DNA hybrid (10, 21, 26, 27, 30, 32, 37, 50). The structures of these enzymes are quite similar to each other. Moreover, the crystal structures of other enzymes that possess nuclease activity show tertiary folding that is very similar to that of the RNases H. These include the catalytic domain of HIV-1 integrase, phage Mu transposase, RuvC (an endonuclease that cleaves Holliday junctions), E. coli exonuclease I, and the exonuclease domain of E. coli DNA polymerase I (1, 3, 8, 12, 39). A high-resolution Mo-MLV RNase H structure should extend the basis for comparisons among the family members and provide a physical explanation for the distinct characteristics of the Mo-MuLV polymerase and RNase H activities.

All of the RNases H and the aforementioned nucleases contain a conserved catalytic triad of acidic residues that coordinate one or two Mg2+ or Mn2+ cations. RNases H contain four conserved acidic residues that coordinate divalent cation binding, though only the first three seem to be required for activity (25). Both a one-metal-ion and a two-metal-ion mechanism of transesterification have been postulated for the RNases H (7, 10, 15, 28, 35, 38, 44, 51). The cocrystal structure reported by Nowotny et al. reveals two magnesium ions bound within the active site (37). These authors proposed a two-metal-ion-dependent mechanism of action, with one magnesium to activate a nucleophile and the other to stabilize the transition state.

Retroviral RNases H from different viruses have broadly similar activities, although there are important differences, including their recognition of substrate structure and sequence (14). Amino acid sequence alignments show that Mo-MLV RNase H and E. coli RNase H both contain a positively charged α-helix (the C-helix) and loop that are absent in HIV-1 and the avian sarcoma-leukosis virus RNases H (Fig. 2) (10, 22, 23, 26, 50). Modeling with the E. coli enzyme suggests that the C-helix facilitates contacts with the RNA/DNA substrate, and functional studies confirm that the C-helix contributes to nucleic acid binding (24). Whereas Mo-MLV RT ΔC retains low in vitro RNase H activity, viruses encoding the Mo-MLV RT ΔC mutant enzyme do not replicate within cells (6, 46). Such results suggest that the C-helix has a specific in vivo role other than simple nonspecific RNA/DNA substrate binding. Mutational analysis of the C-helix in Mo-MLV RT has identified specific residues that are important for both polymerase and RNase H activity (33). That study demonstrated the importance of the C-helix in the efficiency and specificity of PPT recognition and cleavage.

* Corresponding author. Mailing address: 701 West 168th St., HHSC 1310c, College of Physicians and Surgeons, Columbia University, New York, NY 10032. Phone: (212) 305-3794. Fax: (212) 305-5106. E-mail: [email protected].

Downloaded from http://jvi.asm.org/ on February 10, 2016 by guest

The isolated Mo-MLV RNase H domain, expressed independently of the DNA polymerase domain, retains high levels of nuclease activity (20, 42, 45), but the independently expressed Mo-MLV RNase H lacking the C-helix (ΔC) lacks detectable activity (46). The independently expressed HIV-1 RNase H domain, which naturally lacks the C-helix, also lacks enzymatic activity. Insertion of the E. coli C-helix into the single-domain HIV-1 RNase H, remarkably, restores its activity (29, 43). Reconstitution of nuclease activity can also occur by the addition of the entire p51 subunit or the thumb and connection subdomains in trans (20, 41). The presence of fusion partners or epitope tags during expression can also permit the preparation of separate HIV-1 RNase H domains with enzymatic activity in some cases (13, 16). These observations suggest that the C-helix may stabilize the isolated RNase H domain or its interaction with substrate.

The study undertaken here was initially aimed at crystallizing the wild-type Mo-MLV RNase H. These efforts failed but did result in the successful crystallization and structure determination of Mo-MLV RNase H ΔC. An analysis of this structure and a comparison with the known structures of HIV-1 RT and other RNases H are presented here. This structure has already been important in studies with the complete Mo-MuLV RT. A previously determined full-length Mo-MLV RT structure exhibited a high degree of disorder, especially in the region of the RNase H domain (D. Das and M. M. Georgiadis, unpublished data). The full-length Mo-MLV RT structure, however, was ultimately resolved with the use of the high-resolution RNase H ΔC structure described here (9).

MATERIALS AND METHODS

Plasmid construction. Many different Mo-MLV RNase H constructs were synthesized. The constructs ranged from N-terminally truncated RNases H to longer RNases H that extended through the entire connection domain. Constructs of Mo-MLV RNase H were PCR amplified from pNCS, a proviral DNA clone of Mo-MLV. Amplified DNAs of the final construct (residues 498 to 671) were treated with restriction enzymes, resulting in a BamHI-to-EcoRI fragment of 522 bp (restriction sites were added into the DNA primers). This construct was identical to the most stable RNase H domain as previously defined (45, 46). The RNase H construct was reiterated by using pNCS ΔC as a template, resulting in a 489-bp fragment. pNCS ΔC contains a deletion of the C-helix of the Mo-MLV RNase H (deletion of residues I593 through L603); synthesis of this clone was previously described (46). The fragments were ligated into the bacterial expression vector pGEX-3x, which contains glutathione S-transferase (GST) and the BamHI-SmaI-EcoRI cloning site 3′ of GST to form fusion proteins. Clones were sequenced to verify the fidelity of polymerase amplification and cloning.
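The two fragment sizes quoted above are consistent with the residue ranges given. The short check below is an illustrative sketch only, assuming that the quoted sizes count three nucleotides per codon over the stated residues; the function and variable names are ours and are not taken from the original protocol.

```python
def coding_length_bp(first_residue: int, last_residue: int) -> int:
    """Coding length in base pairs for an inclusive residue range (3 nt per codon)."""
    n_residues = last_residue - first_residue + 1
    return 3 * n_residues


# Final construct: Mo-MLV RT residues 498 to 671.
full_length_bp = coding_length_bp(498, 671)

# C-helix deletion: residues I593 through L603.
deleted_bp = coding_length_bp(593, 603)

# Fragment amplified from the pNCS deletion template.
delta_c_length_bp = full_length_bp - deleted_bp

print(full_length_bp, delta_c_length_bp)
```

Under that assumption the wild-type fragment spans 174 codons and the deletion removes 11 codons, which accounts for the 33-bp difference between the two fragments.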

Oligonucleotides. The 5′ primer for PCR amplification of the final construct was 5′-CGGGATCCTGGCCGAAGCCCACGGAACCCGA-3′. The 3′ primer used for both constructs was 5′-CGGAATTCTCTAGAGGAGGGTAGAGGTGTCTGG-3′. Oligonucleotides were synthesized at the Howard Hughes Protein Chemistry Core Facility.
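The restriction sites added through the primers can be located by a simple string search. The sketch below assumes the standard recognition sequences GGATCC (BamHI) and GAATTC (EcoRI); the helper name `find_sites` is hypothetical and serves only to illustrate the check.

```python
# Standard recognition sequences for the two enzymes named in the text.
RECOGNITION_SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as given in the text, written 5' to 3'.
FIVE_PRIME_PRIMER = "CGGGATCCTGGCCGAAGCCCACGGAACCCGA"
THREE_PRIME_PRIMER = "CGGAATTCTCTAGAGGAGGGTAGAGGTGTCTGG"


def find_sites(primer: str) -> dict:
    """Map enzyme name to the 0-based start index of its first site in the primer."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


five_prime_sites = find_sites(FIVE_PRIME_PRIMER)
three_prime_sites = find_sites(THREE_PRIME_PRIMER)
print(five_prime_sites, three_prime_sites)
```

Each primer carries one of the two sites near its 5′ end, preceded by two extra nucleotides, which matches the BamHI-to-EcoRI orientation of the cloned fragment.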

Protein purification. Relevant bacterial expression plasmids included a pGEX-3x plasmid that contained our most stable and enzymatically active Mo-MLV RNase H (45, 46) and a second plasmid that contained the RNase H ΔC construct. Each was used to transform a BL-21-CodonPlus(DE3)-RIL or -RIL-X strain of E. coli (-RIL-X is methionine auxotrophic). Standard growth conditions were utilized for nonselenomethionyl protein preparations, whereas selenomethionyl protein preparations utilized a nonauxotrophic protocol (even though the cell strains were auxotrophic) (11). Cells were induced with 0.1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), harvested, and resuspended in 200 mM NaCl–50 mM Tris-HCl (pH 8)–1 mM EDTA–5 mM dithiothreitol (DTT). Protease inhibitors were added (aprotinin, 1 μg/ml; leupeptin, 1 μg/ml; pepstatin, 1 μg/ml; phenylmethylsulfonyl fluoride, 100 μg/ml), and the cells were then lysed with lysozyme and sonication. The suspension was next centrifuged to remove cellular debris. Clarified supernatants were collected and rocked in a 50% slurry of glutathione-Sepharose beads for 30 min at 4°C. Beads were centrifuged and washed three times with resuspension buffer and three times with Factor Xa buffer (250 mM NaCl, 50 mM Tris-HCl [pH 7.5], 1 mM CaCl2). For each milliliter of glutathione-Sepharose (100% bed volume), 50 μg of Factor Xa (Boehringer Mannheim) was added, and the beads were incubated at 4°C for 16 h. Beads were then centrifuged, and supernatants were collected and modified by the addition of 100 μg of phenylmethylsulfonyl fluoride/ml, 5 mM DTT, and 2 mM EDTA.

The protein solution was dialyzed into 100 mM NaCl–10 mM PIPES (pH 6.5)–0.1 mM EDTA–5 mM DTT, and the proteins were then purified over a Mono-S cation-exchange chromatography column (Pharmacia). Protein solutions were then concentrated and run through a Superdex 200 gel filtration column (Pharmacia). Proteins were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and Coomassie blue staining and were then concentrated and quantitated by the Bradford method. Protein solutions were >95% pure, and small samples were submitted for mass spectrometry and limited N-terminal sequencing to verify mass and protein identity. Cloning into pGEX-3x resulted in the addition of glycine to the N terminus of the viral RNase H sequence after Factor Xa treatment. Mass spectrometry of selenomethionyl proteins revealed >99% selenomethionine incorporation.

Gel filtration resulted in the isolation of one fraction for wild-type RNase H but, unexpectedly, two fractions for RNase H ΔC (data not shown). Wild-type RNase H eluted just under the 20-kDa range, which was consistent with its expected size of 19 kDa. The larger ΔC fraction eluted in the 35-kDa range, which was consistent with an RNase H ΔC dimer. The smaller ΔC fraction eluted under the 20-kDa range, which was closer to the expected 17.7-kDa size for a ΔC monomer. About 10 to 15% of the RNase H ΔC protein consistently eluted in the supposed dimer form. Further investigation of possible Mo-MLV RNase H ΔC dimerization has been hampered thus far by the poor quality of its X-ray diffraction data (explained below).

Crystallization and data collection. Proteins were concentrated to 7- to 15-mg/ml solutions depending on the yield and used in crystal screens (Hampton Research) by hanging-drop vapor diffusion at 20°C. Crystallization attempts with concentrated wild-type Mo-MLV RNase H resulted in no crystals. Crystals, however, were obtained for RNase H ΔC.

Crude crystals for both ΔC monomers and dimers were initially isolated in drops equilibrated against a reservoir solution of 30% PEG 4000 and 0.2 M ammonium sulfate. Starting drops contained equal volumes of stock protein solution (7 to 10 mg/ml; 10 mM HEPES [pH 7], 150 mM NaCl, 5 mM DTT, 0.1 mM EDTA) and reservoir solution. Single, large crystals were regularly grown for ΔC dimers with a reservoir solution that contained buffer between pH 5.2 and 6.0, NaCl, and polyethylene glycol (PEG; molecular weight of 1,450, 3,000, or 4,000). Crystallization occurred either with or without the presence of ammonium sulfate. The concentrations of protein, PEG, and ammonium sulfate were all correlated: the higher the protein concentration, the less PEG and/or ammonium sulfate required. Small crystals were visible within a few hours and grew to terminal size by 5 days. Seeding was not required for single crystal growth of ΔC dimers. Typical crystals were rhomboidal, with dimensions of 0.4 by 0.15 by 0.08 mm. Similar-sized crystals of selenomethionyl RNase H ΔC dimers were obtained under the same conditions.

Crystals were diffracted at X4A at the National Synchrotron Light Source at Brookhaven National Laboratory. The ΔC dimer crystals had unit cell parameters of a = b = 40.60 Å, c = 158.03 Å, α = β = 90°, and γ = 120°. The solvent content was 40.5%, and the unit cell volume was 225,585 Å³. The number of monomers per asymmetric unit was 2 but unconfirmed, and the space group of the crystals was either P3₁ or P3₂. Although morphologically very beautiful, the ΔC dimer crystals were not good candidates for structural work because of their weak diffraction. A full four-wavelength data set was collected on these crystals, but the data were weak and possessed high Rsym values. Statistically, the crystals were unsatisfactory for MAD (multiwavelength anomalous diffraction) phasing beyond 3.7 Å and refinement to 3 Å.
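The reported unit cell volumes can be checked against the cell parameters with the general (triclinic) volume formula, V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The sketch below is illustrative only; the function name is ours, and the second call uses the P1 cell parameters reported for the ΔC monomer crystal later in this section.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic angstroms) of a general unit cell from edges (angstroms) and angles (degrees)."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    term = 1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    return a * b * c * math.sqrt(term)


# Dimer crystal form: a = b = 40.60, c = 158.03, alpha = beta = 90, gamma = 120.
dimer_volume = unit_cell_volume(40.60, 40.60, 158.03, 90.0, 90.0, 120.0)

# Monomer crystal form (space group P1), parameters as reported in the text.
monomer_volume = unit_cell_volume(32.19, 33.96, 34.40, 78.11, 69.77, 64.80)

print(round(dimer_volume), round(monomer_volume))
```

Both results agree with the published values (225,585 Å³ and 31,852 Å³) to within a few cubic angstroms, a difference attributable to rounding of the cell parameters.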

The ΔC monomer crystallization failed to yield single, large crystals. Instead, crystallization was limited to needle bundles, with the largest needles occurring with a reservoir solution that contained pH 5 buffer composed of NaCl, zinc sulfate, PEG 1500, PEG MME 550 (MME referring to monomethyl ether), ammonium sulfate, 2-propanol, and 2-morpholinoethanesulfonic acid (MES). Small nucleations were visible within 1 day and grew to terminal size by 1 week. Seeding did not result in single crystal formation or any improvement in needle size. Similar crystallization results were obtained with a selenomethionyl derivative of RNase H ΔC monomers. The needle bundles were not amenable to crystallographic studies, but individual needles broken from the clusters proved to diffract exceptionally well.

The ΔC monomer crystal was of space group P1 and had unit cell parameters of a = 32.19 Å, b = 33.96 Å, c = 34.40 Å, α = 78.11°, β = 69.77°, and γ = 64.80°. The solvent content was 30%, and the unit cell volume was 31,852 Å³. The number of monomers per asymmetric unit was 1. A complete four-wavelength MAD data set was collected from one exceptional ΔC monomer crystal needle.
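Solvent content is commonly estimated from the Matthews coefficient, V_M = V / (Z × M), as 1 − 1.23 / V_M. The sketch below applies that estimate to both crystal forms. It rests on several assumptions that are ours rather than the authors': the constant 1.23 (which presumes a typical protein density), a monomer mass of 17.7 kDa (the expected ΔC size quoted for gel filtration), and Z taken as the number of symmetry operations (1 for P1, 3 for P3₁ or P3₂) times the number of monomers per asymmetric unit. The method the authors actually used is not stated.

```python
def solvent_fraction(cell_volume_a3, n_symmetry_ops, copies_per_asu, mass_da):
    """Estimate the solvent fraction from the Matthews coefficient.

    V_M = V / (Z * M), with Z = symmetry operations * copies per asymmetric unit.
    The constant 1.23 assumes a typical protein partial specific volume.
    """
    z = n_symmetry_ops * copies_per_asu
    matthews_coefficient = cell_volume_a3 / (z * mass_da)
    return 1.0 - 1.23 / matthews_coefficient


MONOMER_MASS_DA = 17_700  # expected size of the deletion-mutant monomer (assumed here)

# P1 form: one symmetry operation, one monomer per asymmetric unit.
monomer_solvent = solvent_fraction(31_852, 1, 1, MONOMER_MASS_DA)

# Trigonal form: three symmetry operations, two monomers per asymmetric unit.
dimer_solvent = solvent_fraction(225_585, 3, 2, MONOMER_MASS_DA)

print(f"{monomer_solvent:.3f} {dimer_solvent:.3f}")
```

With these inputs the estimates fall within about two percentage points of the reported 30% and 40.5%; the remaining difference plausibly reflects the exact molecular mass and density constant used.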

For cryo experiments during X-ray diffraction, ΔC monomer crystals were soaked in 15% PEG 4000–20% PEG 400–0.1 M ammonium sulfate–150 mM NaCl–1% 2-propanol–1.25% PEG MME 550–5 mM MES (pH 6.5)–0.5 mM zinc sulfate for 5 min. Crystals were then flash frozen in liquid nitrogen and maintained at 100 K in a nitrogen Oxford Cryosystem. The MAD data set was collected at the NSLS Beamline X4A on a CCD Quantum 4 detector.

Protein structure accession numbers. Structure coordinates have been deposited in the RCSB Protein Data Bank and assigned RCSB identification code rcsb038154 and PDB identification code 2HB5.

RESULTS AND DISCUSSION

Structure determination. Early work showed that the isolated wild-type Mo-MLV RNase H domain could be readily expressed in bacteria and exhibited high RNase H activity (45, 46). Crystallization efforts with a variety of polypeptides identical or closely related to these constructs were not successful. Similar efforts with ΔC variants of the domain, however, resulted in the preparation of large and useful crystals (see Materials and Methods). The most stable RNase H ΔC domain isolated for crystallization was identical to one previously constructed (46). The selenomethionyl variant of the Mo-MLV RNase H ΔC was expressed as a fusion protein with GST in methionine auxotrophic bacteria and isolated by using glutathione-linked Sepharose beads, Factor Xa cleavage release, and further purification through cation-exchange and gel filtration columns. Successful RNase H ΔC crystallization was achieved by standard crystallization screening methods. Diffraction data were collected at four wavelengths and phased by MAD. Data processing and structure determination were performed by using standard crystallography software programs. The structure was refined to a resolution of 1.6 Å. Pertinent statistics for structure determination are provided in Table 1.

Polypeptide folding. The Mo-MLV RNase H ΔC structure reveals a central five-stranded, mixed β-sheet (four parallel and one antiparallel) surrounded by four α-helices (Fig. 1A to C). Despite sharing only 26% amino acid identity with E. coli RNase H and 18.5% identity with HIV-1 RNase H, Mo-MLV RNase H ΔC has an overall structure very similar to that of both enzymes (Fig. 2 and 3). Several gross differences between the RNase H structures are apparent. Both the sequence alignment and the ribbon diagrams show that Mo-MLV RNase H ΔC has a longer β-strand 1 and α-helix E and a shorter β-strand 3 compared to the other two RNases H. The longer β-strand 1 actually bends toward α-helix E, making the loop between β1 and β2 lie prominently on the side of the entire five-stranded β-sheet where α-helix E resides alone. In E. coli RNase H the loop between β1 and β2 actually bends away from α-helix E, while in HIV-1 RNase H, the loop bends only slightly toward α-helix E. In Mo-MLV RNase H ΔC, the β1-to-β2 transition region is held by hydrogen bonding between the backbone amide and carbonyl oxygen of Q530 to the backbone carbonyl oxygen and amide of Q533, respectively. It is likely that this loop and surrounding region closely interact with the RNA/DNA hybrid minor groove based on the E. coli RNase H structure modeled with the substrate (50). The same bending toward α-helix E is observed for the transition region between Mo-MLV RNase H ΔC β2 and β3. In this region, the side chain hydroxyl group of T541 forms a hydrogen bond with the side chain hydroxyl group of T542, causing a sharp bend in β2 toward α-helix E. No functional assignment has yet been determined for this region through mutational analysis.

FIG. 1. Ribbon diagrams of the Mo-MLV RNase H ΔC in three different perspectives. The structure shows a central five-stranded, mixed β-sheet surrounded by four α-helices. The N terminus (NH3-) to the C terminus (-COOH) is color coordinated from dark blue, green, yellow, orange, and finally red. The magenta-shaded loop represents the unsolved His-loop between β5 (darker yellow) and αE (red), residues P636 to K640. Strictly for ribbon diagram purposes only, we have represented this loop as a best fit from the E. coli RNase H structure. The gray sphere represents magnesium. The figure was made by using MOLSCRIPT and Raster 3D (31, 34). (A) Mo-MLV RNase H is oriented as if the fingers, palm, and thumb of the polymerase domain are above the RNase H (top of page) and opened in right-handed fashion with the fingers on the left, the palm in the middle, and the thumb on the right. (B) Mo-MLV RNase H is rotated ~145° around the x axis compared to Fig. 1A. This orientation best shows the putative active site and substrate binding face of the protein. (C) Mo-MLV RNase H is rotated ~45° around the y axis compared to Fig. 1B. This orientation best displays the magnesium ion coordination site. (D) Close-up of magnesium ion coordination in the same perspective as depicted in Fig. 1C. Conserved residues aspartate 524 and 583, glutamate 562, and two water moiety oxygens (labeled W1 WAT and W2 WAT [represented in red]) interact with the magnesium ion. The magnesium ion is coordinated in tetragonal bipyramidal fashion (the additional water molecule not shown here).

TABLE 1. Diffraction data and structure refinement

Parameter^a                               Data value^b
                                          Low remote    Inflection    Peak          High remote
λ (Å)                                     0.98793       0.97938       0.97908       0.96866
Bragg spacings (Å)                        20-1.6        20-1.6        20-1.6        20-1.6
Reflections
  Measured                                49,195        58,838        59,200        59,307
  Unique                                  16,018        15,794        15,825        15,861
Completeness (%)                          95.3 (89.5)   95.0 (90.5)   95.2 (91.0)   95.3 (90.2)
I/σ(I)                                    13.7 (5.6)    15.9 (10.3)   15.9 (10.1)   15.6 (9.4)
Rsym (%)                                  4.6 (24.6)    4.3 (12.6)    3.7 (14.1)    3.7 (15.7)

Rwork (%)                                 19.55
Rfree (%)                                 22.09
σ cutoff                                  0
No. of reflections used in refinement     15,825
No. of reflections (work/test)            14,232/1,593
No. of protein atoms                      1,229
No. of waters                             148
No. of metal ions                         1
Average B-factors (Å²)                    19.4072
rmsd from bond length (Å)                 0.004
Ideal stereochemistry, bond angle (°)     1.18
rmsd B-factors (Å²)
  Main chain bond/angles                  1.255/1.941
  Side chain bond/angles                  2.354/3.438
Ramachandran dihedral angles (%)
  Residues in favored regions             97.92
  Residues in allowed regions             100

a rmsd, root mean square deviation. $R_{\mathrm{sym}} = \sum |I_h - \langle I_h \rangle| / \sum I_h$, where $\langle I_h \rangle$ is the average intensity over symmetry equivalents. $R_{\mathrm{work}} = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$. $R_{\mathrm{free}}$ is equivalent to $R_{\mathrm{work}}$ but is calculated for a randomly chosen 9.4% of reflections, which were omitted from the refinement process.

b Numbers in parentheses refer to the values for the outer shell (1.66 to 1.60 Å).
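The residuals defined in the footnote to Table 1 reduce to simple sums. The sketch below shows one way to compute them; it is not the software used by the authors, the function names are ours, and the input numbers are invented toy values chosen only to make the arithmetic easy to follow.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired structure factor amplitudes."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_sym(equivalent_groups):
    """Rsym = sum(|I_h - <I_h>|) / sum(I_h), with <I_h> the mean of each symmetry-equivalent group."""
    numerator = 0.0
    denominator = 0.0
    for intensities in equivalent_groups:
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy amplitudes (not from the paper): deviations of 10, 5 and 0 over a total of 175.
toy_r = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])

# Toy intensities (not from the paper): two groups of symmetry-equivalent measurements.
toy_r_sym = r_sym([[10.0, 12.0], [20.0, 20.0]])

print(toy_r, toy_r_sym)
```

Rwork and Rfree use the same `r_factor` expression; they differ only in which reflections are passed in, the working set used in refinement or the test set held out from it.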

The autotrace structure is complete with the exception of 15 amino acids, located in three different regions. These include the N-terminal six amino acids (residues beginning at Mo-MLV RT I498; sequence GILAEA, including an additional glycine by virtue of the Factor Xa cleavage site); the region from P636 to K640 (sequence PGHQK); and the region S668 to L671 (sequence STLL, the C-terminal four amino acids). The five residues P636 to K640 comprise the loop region between β-strand 5 and α-helix E and contain a histidine (H638) conserved in all RNases H. This region is also known as the "His-loop" and was not resolved in the initial crystal structure of the isolated HIV-1 RNase H (10). This loop is considered to be highly flexible, thus impeding structure determination for certain crystals of RNase H. The electron density map for monomeric RNase H ΔC contains clearly defined regions for much of the missing sequences (data not shown). For the strict purpose of providing a continuous ribbon structure, the Mo-MLV ΔC His-loop was manually traced as a best fit from the E. coli RNase H structure for Fig. 1 and 4. This region is represented in magenta in both figures and should not be assumed to portray the true His-loop structure.

Analysis of the E. coli RNase H structure revealed an extensive hydrophobic core mainly centered between α-helices A and D. This core is protected and surrounded by electrostatic interactions at the helical termini, as well as the hydrogen bond networks each from the β-sheet and α-helical interactions (A-B, B-D, and C-D). For Mo-MLV RNase H ΔC, there is a similar hydrophobic core between α-helices A and D, but it is bounded by only one electrostatic interaction on the N-terminal end of the helices. This electrostatic interaction is mediated by the side chain carbonyl oxygen of N613 forming a hydrogen bond with the side chain ε-amino group of R560. The side chain carbonyl oxygen of E616 also forms a hydrogen bond with a terminal side chain amino group of R560. Between the two helices, there is a series of hydrophobic residues that interact with one another. Mo-MLV ΔC also has an extensive hydrophobic core on the other side of α-helix A at the interface with β-strands 1, 2, and 3. Structural analysis shows that neither E. coli nor HIV-1 RNase H possesses such an extensive hydrophobic network between β-strands 1, 2, and 3 and the N-terminal half of α-helix A. This extended hydrophobic core of Mo-MLV RNase H ΔC may be unique to the murine retroviral RNases H. Table 2 summarizes all residues involved with the hydrophobic core region of Mo-MLV RNase H. In addition, the sequence alignment in Fig. 2 highlights in gray all hydrophobic core residues identified in this structure, as well as RNases H from E. coli, HIV-1, and B. halodurans.

FIG. 2. Structural alignment of Mo-MLV, E. coli, HIV-1, and B. halodurans (Bh) RNases H. Each Mo-MLV RNase H secondary structure is marked above in a color scheme of secondary structures similar to that shown in Fig. 1A to C. The putative helix C is marked with a dotted blue line above, and all deleted residues from the Mo-MLV RNase H sequence are underscored in orange. Residues that contact the RNA strand are labeled in blue, while residues contacting DNA are highlighted in red. Residues contacting both RNA and DNA are shown as red on blue. Hydrophobic core residues are highlighted in gray. Highly conserved catalytic aspartate and glutamate residues are highlighted in yellow (Mo-MLV residues 524, 562, 583, and 653). The three conserved regions are boxed.

Magnesium ion coordination site. Higher-resolution diffraction revealed a magnesium ion coordinated in a tetragonal bipyramidal fashion within the putative RNase H active site (Fig. 1D). Three water molecule oxygens and the conserved acidic residues of the catalytic core (D524, E562, and D583) interact with the magnesium ion in its observed location. These conserved acidic residues and a conserved aspartate in α-helix E are highlighted in yellow in Fig. 2. Of note, this region of the structure is accessible to surrounding solvent and faces toward the putative substrate-binding side of RNase H (Fig. 1, 4, and 5). More recent data from Nowotny et al. revealed that RNase H crystallized in the presence of RNA/DNA substrate contained two magnesium ions within the catalytic core (37). The second magnesium ion (not observed in our structure) is coordinated in position largely by the substrate itself within the cocrystal structure. The substrate-guided coordination of this second magnesium ion is postulated to determine the specificity of the RNases H for cleaving RNA/DNA hybrid structures. The magnesium ion observed in our structure corresponds to the nonactivating magnesium (metal ion B as designated by Nowotny et al.). This magnesium ion is believed to stabilize the transition state intermediate during catalysis.
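The six-ligand geometry described above (three water oxygens plus the three acidic residues) can be checked numerically. The following Python sketch counts the ligand-metal-ligand angles near 90° and near 180° for an idealized arrangement. The coordinates are hypothetical and are not taken from the deposited structure; the 2.1 Å metal-oxygen distance and the 15° tolerance are assumptions chosen only for illustration.

```python
import math
from itertools import combinations


def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def ligand_angles(metal, ligands):
    """All ligand-metal-ligand angles for the given ligand positions."""
    vectors = [tuple(l[i] - metal[i] for i in range(3)) for l in ligands]
    return [angle_deg(u, v) for u, v in combinations(vectors, 2)]


def count_cis_trans(angles, tol=15.0):
    """Count angles within `tol` degrees of 90 (cis) and of 180 (trans)."""
    cis = sum(1 for a in angles if abs(a - 90.0) <= tol)
    trans = sum(1 for a in angles if abs(a - 180.0) <= tol)
    return cis, trans


# Hypothetical idealized site: metal at the origin, six oxygens on the axes.
METAL = (0.0, 0.0, 0.0)
D = 2.1  # assumed metal-oxygen distance in angstroms
IDEAL_LIGANDS = [(D, 0, 0), (-D, 0, 0), (0, D, 0), (0, -D, 0), (0, 0, D), (0, 0, -D)]

cis, trans = count_cis_trans(ligand_angles(METAL, IDEAL_LIGANDS))
print(cis, trans)  # 12 cis and 3 trans angles for an ideal six-coordinate site
```

With real coordinates, a six-coordinate site would be expected to give counts close to 12 and 3, and large deviations would suggest a distorted or incomplete coordination shell.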

Substrate binding regions are highly basic in the surface model of RNase H ΔC. A molecular surface model of Mo-MLV RNase H ΔC was rendered and analyzed with respect to surface charges that might interact with substrate (Fig. 4) (36). The surface reflects the solvent-accessible regions of the protein and is labeled blue in basic regions, red in acidic regions, and white where there is no net electrostatic potential

FIG. 3. Structure comparisons. The figure was made by using MOLSCRIPT and Raster 3D (31, 34). (A) Stereo images of superimposition of Mo-MLV RNase H (blue) with crystal structure of E. coli RNase H (green). Note the gap in the Mo-MLV structure for residues P636 to K640, which account for the unsolved His-loop. The gray sphere represents the magnesium ion coordinated within the active site. The orientation of structures is as depicted in Fig. 1B to best display the active site and substrate binding face. (B) Superimposition of Mo-MLV RNase H (blue) with crystal structure of HIV-1 RNase H (red).

charge. The catalytic site that contains the four conserved carboxylate groups is clearly acidic and resides slightly within the interior of the protein. Here the two magnesium cations bind to mediate nuclease activity (magnesium not shown). The surface that is proposed to interact with the RNA/DNA hybrid is highly basic, with many arginines and lysines responsible for the electrostatic charge. The predominantly positively charged binding pocket is consistent with the model of RNase H primarily binding negatively charged DNA and RNA backbone phosphates as a basic mode of substrate recognition. The rest of the shown surface of Mo-MLV RNase H ΔC in Fig. 4 is fairly neutral, whereas the back surface (not shown) contains small regions of basic charge, but none as large and as dense as the putative substrate binding region.

The most apparent surface residues composing the positively charged surface were identified as R534, Q559, R560, R585, K609, K612, N613, and K614. Most of these residues correspond to substrate-binding residues in E. coli, HIV-1, and B. halodurans previously identified by crystal structure or biochemical analysis (Fig. 2) (4, 5, 17, 24, 25, 37, 47). To further characterize these surface residues, a substrate-binding model was constructed as described below.
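A rough tally of formal side-chain charge illustrates why this residue set produces a basic surface patch. The sketch below is a simplification and is not how the GRASP potential surface was computed: it assumes +1 for arginine and lysine, -1 for aspartate and glutamate, and 0 for all other residues (including histidine), and it ignores the local environment.

```python
# Assumed formal side-chain charges near neutral pH (simplified; histidine
# is treated as neutral here, which is an assumption of this sketch).
FORMAL_CHARGE = {"R": 1, "K": 1, "D": -1, "E": -1}

# Surface residues listed in the text for the putative binding face.
SURFACE_RESIDUES = ["R534", "Q559", "R560", "R585", "K609", "K612", "N613", "K614"]


def residue_charge(residue):
    """Formal charge from the one-letter code that prefixes the label."""
    return FORMAL_CHARGE.get(residue[0], 0)


def net_formal_charge(residues):
    """Sum of formal side-chain charges over a list of residue labels."""
    return sum(residue_charge(r) for r in residues)


def split_by_charge(residues):
    """Separate charged residues from polar, uncharged ones."""
    charged = [r for r in residues if residue_charge(r) != 0]
    neutral = [r for r in residues if residue_charge(r) == 0]
    return charged, neutral


charged, neutral = split_by_charge(SURFACE_RESIDUES)
print(net_formal_charge(SURFACE_RESIDUES))  # 6
print(neutral)  # ['Q559', 'N613']
```

Under these assumptions, six of the eight listed residues carry a formal positive charge, while Q559 and N613 contribute as polar, uncharged side chains.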

Superimposed model of Mo-MLV RNase H ΔC complexed with RNA/DNA. Since there was strong structural homology between Mo-MLV RNase H ΔC and the RNases H from E. coli and HIV-1, structural coordinates were submitted to the Dali Server of the European Bioinformatics Institute of EMBL (www.ebi.ac.uk/dali/) to obtain structural alignments of the RNases H (18, 19). Rotation coordinates were obtained to superimpose the E. coli RNase H and Mo-MLV RNase H ΔC structures upon a base structure of HIV-1 RT bound to an RNA/DNA hybrid composed of sequences from the PPT (40). This allowed the construction of structural models of E. coli RNase H and Mo-MLV RNase H ΔC complexed with an RNA/DNA substrate. The superimposed models showed that many of the identified regions and residues with known functions in HIV-1 RNase H had conserved or homologous residues in the E. coli and Mo-MLV structures correctly aligned in space (Fig. 2, 3, and 5). This allowed the identification and

FIG. 4. (A) Electrostatic potential surface computed with GRASP. Blue represents positive potential, red represents negative potential, and white represents neutral potential. The catalytic or active site is electrostatically negative, whereas the putative substrate binding region is positive. The locations of the surface amino acids implicated in substrate interactions are labeled accordingly. Two different orientations differing by roughly 50° rotation around the y axis are shown. (B) Worm diagrams of RNase H ΔC in the same orientations as those shown in the corresponding diagrams above in panel A. The worm color (white) does not reflect electrostatic potential. All figures were made by using GRASP (36).

TABLE 2. Secondary structures and residues of hydrophobic core of Mo-MLV RNase H ΔC

Secondary structure    Residues^a
β-Strand 1             H519, W521, T523, S527
β-Strand 2             A536, A538, V540
β-Strand 3             I547, W548, K550, L552
α-Helix A + loop       A558, A561, L563, I564, A565, L566, A569, L570, A573
β-Strand 4             L578, V580
α-Helix B              A587, A591
α-Helix D              I617, L620, L624
β-Strand 5             L630, I632

^a Residues H519, T523, S527, and K550 utilize side-chain methyl groups in hydrophobic core interactions.

characterization of potential binding domains and residues within the Mo-MLV ΔC structure with a high degree of confidence. These findings, however, must be interpreted with caution since they are not actual cocrystal structures. In addition, since the Mo-MLV RNase H ΔC is only a single-domain structure, the superimposed model constructed here may not reflect how wild-type Mo-MLV RNase H binds to its substrate in the context of full-length RT. Finally, the actual PPT substrate crystallized by Sarafianos et al. displayed a number of weakly paired, unpaired, and mismatched bases. Such phenomena may be a result of the inherent structure within the sequences of the PPT and the manner in which it binds HIV-1 RT. Thus, the data from that study may not accurately reflect how random-sequence RNA/DNA hybrid substrates actually bind RT and become cleaved by the RNase H domain.
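The superimposition step described above amounts to applying a rotation matrix and a translation vector to one set of coordinates and then comparing equivalent atoms with the reference. The Python sketch below shows only that arithmetic. The function names, the toy coordinates, and the example transform are hypothetical; they are not values returned by the Dali Server or taken from any deposited structure.

```python
import math


def apply_transform(coords, rotation, translation):
    """Apply x' = R x + t to every (x, y, z) coordinate."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z + translation[i]
            for i in range(3)
        ))
    return moved


def pair_distances(a, b):
    """Distances between equivalent atoms of two coordinate lists."""
    return [math.dist(p, q) for p, q in zip(a, b)]


def rmsd(a, b):
    """Root mean square deviation over equivalent atoms."""
    d = pair_distances(a, b)
    return math.sqrt(sum(x * x for x in d) / len(d))


def all_within(a, b, cutoff):
    """True if every equivalent atom pair lies within `cutoff` angstroms."""
    return all(d <= cutoff for d in pair_distances(a, b))


# Hypothetical example: three reference atoms and a mobile copy that is
# related to them by a 90 degree rotation about z plus a shift.
REFERENCE = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
MOBILE = [(2.0, 4.0, -1.0), (4.0, 5.0, -1.0), (2.0, 5.0, 2.0)]
ROTATION = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
TRANSLATION = (5.0, -2.0, 1.0)

superimposed = apply_transform(MOBILE, ROTATION, TRANSLATION)
print(rmsd(superimposed, REFERENCE))             # 0.0 for this constructed case
print(all_within(superimposed, REFERENCE, 2.5))  # True
```

Because a good fit of the aligned core does not guarantee that side chains or loops outside it are positioned as they would be in a true complex, a low deviation here is consistent with, but does not replace, the caution expressed above.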

Figure 5 illustrates the overall structural model of Mo-MLV RNase H ΔC bound to substrate. The perspective shown here best illustrates how the RNA/DNA substrate interacts with the structure and how the RNA strand fits into the active site for cleavage. Closer examination of the active site residues shows that the structural alignments have superimposed all three structures such that the catalytic site carbonyl oxygen atoms are all within 2.5 Å of their homologous atom in each of the other two structures (data not shown). It is clear from Fig. 5 that the major portion of Mo-MLV RNase H preferentially binds the minor groove of the RNA/DNA substrate. Binding seems to take place in two overall locations. First, DNA is relatively close and accessible to α-helices A and B and the loop preceding α-helix D. The second binding region appears to interact with the RNA strand. RNA is close and accessible to β-strand 1 and α-helices A and B.

Substrate binding determinants of Mo-MLV RNase H ΔC. An analysis of the superimposed model of Mo-MLV RNase H ΔC was performed to identify those regions and their residues that may be important for RNA/DNA interactions. Four different regions from the primary structure were identified as potential binding sites for Mo-MLV RNase H ΔC. These include the end of β-strand 1 to the beginning of β-strand 2, the first few residues of α-helix A, the first few residues of α-helix B, and the loop region prior to the beginning of α-helix D (summarized in Table 3). In the region from β-strand 1 to β-strand 2, two residues directly face into the minor groove of the RNA/DNA hybrid. L529 resides on β-strand 1 and faces into the base pairs of the hybrid substrate. It is the only hydrophobic residue that is directly exposed to the substrate. The terminal side chain amino group of R534 resides at a distance of 3.25 Å from a phosphate group of the DNA backbone. For the purposes of identification, this phosphate will be labeled phosphate (�4). DNA nucleotide (�4) lies 4 nucleotides 3′ from the base pair that contains the “scissile phosphate” (the 5′ phosphate that is retained by the RNA nucleotide after cleavage).

In the N-terminal region of α-helix A, three residues make potential contacts with a DNA phosphate group that is located just 3′ of phosphate (�4). This phosphate will thus be labeled the DNA phosphate (�5). S557, Q559, and R560 all cluster close to one another along the DNA strand. The side chain hydroxyl of S557, the side chain carbonyl oxygen of Q559, and the terminal side chain amino group of R560 are only 3.46, 3.59, and 5.55 Å, respectively, from the DNA phosphate group

FIG. 5. Model of Mo-MLV RNase H ΔC binding RNA/DNA substrate based upon superimposed modeling upon the HIV-1 RT cocrystal structure. DNA is represented in red, while RNA is represented in blue. The color scheme, gray sphere, and magenta-shaded His-loop are depicted as described in Fig. 1. The model is oriented to best display the interface between RNase H and RNA/DNA substrate. The figure was made by using MOLSCRIPT and Raster 3D (31, 34).

TABLE 3. Amino acids potentially involved in substrate binding for Mo-MLV RNase H ΔC

Amino acid  Structure location  Substrate binding (location)^a                        Corresponding HIV-1 RT p66 residue
L529        β1-β2               Possible hydrophobic interaction with minor groove    R448
R534        β1-β2               DNA phosphate (�4 bp)                                 –^b
S557        αA                  DNA phosphate (�5 bp)                                 T473
Q559        αA                  DNA phosphate (�5 bp)                                 Q475
R560        αA                  DNA phosphate (�5 bp)                                 K476
R585        αB                  2′-OH group of RNA (�2 or �3 bp)                      Q500
Y586        αB                  DNA phosphate (�6 bp)                                 Y501
K609        αB-D loop           Possible interaction with RNA backbone phosphates     –^b
K612        αB-D loop           DNA phosphate (�6 bp)                                 –
N613        αD                  DNA phosphate (�5 bp)                                 –^b

^a Based upon a model of the structure of Mo-MLV RNase H ΔC superimposed upon HIV-1 RT bound to RNA/DNA substrate (40). The base pair with the scissile phosphate is designated �1; the base pairs located 3′ of the scissile phosphate in DNA are �1, �2, �3, etc. The base pairs located 5′ of the scissile phosphate are �2, �3, �4, etc.

^b –, The corresponding HIV-1 residue did not contact the substrate or no corresponding HIV-1 residue was identified.

(�5) (data not shown). All of these polar groups potentially form electrostatic interactions with the DNA phosphate moiety.

The N-terminal region of α-helix B contains R585 and Y586, which may interact with the RNA and DNA strands, respectively. The terminal side chain amino groups of R585 lie close to the 2′-hydroxyl of the RNA ribose sugar. This RNA nucleotide complements the DNA nucleotide just 5′ of the DNA nucleotide (�4). The side chain -OH group of Y586 lies close to a DNA phosphate group just 3′ of DNA phosphate (�5). This phosphate will thus be referred to as DNA phosphate (�6). The distance between this tyrosyl hydroxyl group and the DNA phosphate (�6) oxygen is 3.25 Å. Interestingly, Blain and Goff have previously synthesized and studied a Mo-MLV RT mutation that substituted phenylalanine for Y586 (Y586F) (4, 5). Mutant Y586F resulted in a nonreplicative virus with the mutant RT having only 5% of the wild-type RNase H activity in an in situ RNase H assay. In another study, Zhang et al. determined that the Mo-MLV Y586F mutant RT resulted in a 17-fold-higher rate of substitution mutations during polymerization, especially along adenine-thymine tracts (52). It is postulated that Y586 not only binds to substrate but also conforms and bends RNA/DNA substrate to optimize DNA polymerase fidelity. Given that RT has no proofreading abilities, RNase H domain binding determinants must have had important influences in the accurate genomic replication and survival of retroviruses as a whole.

For α-helix D and its preceding loop, K612 lies close to DNA phosphate (�6), whereas N613 actually lies closer to DNA phosphate (�5). N613 is the first residue of the D-helix and lies closer to the residues of α-helix B identified around DNA phosphate (�5) (see Fig. 4 [GRASP diagram]). The terminal side chain amino group of K612 resides 2.86 Å from DNA phosphate (�6), whereas the terminal side chain amino group of N613 resides 3.59 Å from DNA phosphate (�5).

One last residue of note is K609, which resides in the loop region between α-helices B and D. The terminal side chain amino group points toward the RNA strand phosphate groups at a distance of 6.5 to 7.5 Å. Although this is a fairly large distance, one must take into consideration the absence of the C-helix. Overlay of the E. coli RNase H in this same region shows that the homologous loop between α-helices C and D and the homologous residue (K96) reside a few angstroms closer to the RNA backbone phosphate (data not shown). E. coli residue K99, however, overlaps right over its homologue, Mo-MLV K612, suggesting that the only differences between the two are the displacement of the loop and the missing C-helix. K609, therefore, may have a role in RNA backbone phosphate contacts.
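The distances reported in this section can be collected and ranked to make the relative proximity of each candidate residue explicit. In the Python sketch below, the distances are those quoted in the text, with the lower bound of the 6.5 to 7.5 Å range used for K609. The 4.0 Å cutoff is a hypothetical threshold chosen only to illustrate the comparison and is not a criterion used in this study.

```python
# (residue, substrate group, reported distance in angstroms)
REPORTED_DISTANCES = [
    ("R534", "DNA phosphate", 3.25),
    ("S557", "DNA phosphate", 3.46),
    ("Q559", "DNA phosphate", 3.59),
    ("R560", "DNA phosphate", 5.55),
    ("Y586", "DNA phosphate", 3.25),
    ("K612", "DNA phosphate", 2.86),
    ("N613", "DNA phosphate", 3.59),
    ("K609", "RNA phosphate", 6.5),  # lower bound of the reported 6.5 to 7.5 range
]

CUTOFF = 4.0  # hypothetical cutoff in angstroms, for illustration only


def rank_contacts(records):
    """Sort records from the shortest to the longest distance."""
    return sorted(records, key=lambda record: record[2])


def beyond_cutoff(records, cutoff=CUTOFF):
    """Residues whose reported distance exceeds the cutoff."""
    return [name for name, _, distance in records if distance > cutoff]


ranked = rank_contacts(REPORTED_DISTANCES)
for name, group, distance in ranked:
    label = "within" if distance <= CUTOFF else "beyond"
    print(f"{name}  {group}  {distance:.2f} A  ({label} {CUTOFF} A)")
```

With this assumed cutoff, K612 is the closest residue, and only R560 and K609 fall beyond the threshold, which matches the more tentative language used for those two residues.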

The residues of Mo-MLV RNase H ΔC identified here as potentially involved in binding substrate are highlighted in red (for DNA interaction) and in blue (for RNA interaction) in Fig. 2. Figure 2 also highlights in similar fashion residues identified to bind substrate for RNases H of E. coli, HIV-1, and B. halodurans. Most of the Mo-MLV RNase H residues identified by our model are in concordance with the previous data for the various species of RNases H. This suggests that our model accurately represents Mo-MLV RNase H binding to substrate. Such a model may be used for future mutagenesis, biochemical analysis, and viral replication studies of Mo-MLV. We have tabulated the identified Mo-MLV RNase H residues, their respective substrate binding locations, and the corresponding HIV-1 RT residues in Table 3. In similar fashion, Table 4 lists the identified HIV-1 p66 connection and RNase H domain residues, their respective substrate binding locations, and the corresponding Mo-MLV RT residue. Both RNase H structures of Mo-MLV and HIV-1 possess the appropriate or homologous residue for each specified substrate interaction. Each sequence contained a few substrate-binding residues that had no obvious corresponding residue, or a nonbinding corresponding residue, in the other species. Such differences in binding determinants may partially be explained by the role of the HIV-1 p66 connection domain.
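The residue correspondences listed in Tables 3 and 4 can be checked for reciprocity, that is, whether each Mo-MLV residue maps to an HIV-1 residue that maps back to it. The Python sketch below encodes only the pairs given in the two tables; entries with no corresponding residue are omitted, and the function name is a hypothetical helper for this illustration.

```python
# Mo-MLV residue -> corresponding HIV-1 RT p66 residue (Table 3).
MLV_TO_HIV = {
    "L529": "R448",
    "S557": "T473",
    "Q559": "Q475",
    "R560": "K476",
    "R585": "Q500",
    "Y586": "Y501",
}

# HIV-1 RT p66 residue -> corresponding Mo-MLV RT residue (Table 4).
HIV_TO_MLV = {
    "R448": "L529",
    "T473": "S557",
    "Q475": "Q559",
    "K476": "R560",
    "Q500": "R585",
    "Y501": "Y586",
}


def non_reciprocal_pairs(forward, backward):
    """Pairs (a, b) in `forward` for which `backward` does not map b to a."""
    return [(a, b) for a, b in forward.items() if backward.get(b) != a]


mismatches = non_reciprocal_pairs(MLV_TO_HIV, HIV_TO_MLV)
print(mismatches)  # [] when the two tables agree
```

An empty result indicates that the six paired residues are listed consistently in both tables; it says nothing about the residues that have no counterpart in the other enzyme.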

The connection domain of HIV-1 RT p66 structurally replaces the missing C-helix of the RNase H domain. Previously, a functional analysis was conducted of Mo-MLV RNase H focusing on the role of the C-helix in viral reverse transcription (33). In particular, four point mutations in this region (H594A, I597A, R601A, and G602A) resulted in mutant viruses with reduced or no replication competency. Isolation of reverse-transcribed DNA from cells infected with these mutant viruses revealed either a reduction or a loss of reverse transcription ability. Sequence analysis of viral DNA ends produced by replicating C-helix mutant viruses (H594A, I597A, and G602A) showed defects that resulted from the loss of RNase H cleavage specificity. The loss of in vivo RT activity combined with the inability to form the correct viral DNA ends may be a result of the loss of substrate binding capabilities normally mediated by the RNase H C-helix. Mutations of the homologous region in E. coli substantiate this hypothesis (24).

To see how HIV-1 RT compensates for the functional capacities of the Mo-MLV RNase H C-helix, the HIV-1 RT p66 connection domain was added to the superimposed structures of E. coli, Mo-MLV, and HIV-1 RNases H bound to substrate. Only this domain was added since all other domains of HIV-1

TABLE 4. Amino acids involved in substrate binding for p66 HIV-1 connection and RNase H domains^a

Amino acid  Structure location^b  Substrate binding (location)^c                Corresponding Mo-MLV RT residue
G359        p66 connection        DNA phosphate (�7 bp)                         –^d
A360        p66 connection        DNA phosphate (�7 bp)                         –^d
H361        p66 connection        DNA phosphate (�6 bp)                         –^d
R448        β1                    RNA base and sugar groups (�1 and �2 bp)      L529
T473        αA                    DNA phosphate (�5 bp)                         S557
N474        αA                    RNA phosphate bond (�1 bp)                    –^d
Q475        αA                    DNA sugar group (�4 bp), also RNA base        Q559
                                  (�2 bp), also RNA sugar group and 2′-OH
                                  (�1 bp)
K476        αA                    DNA phosphate (�5 bp)                         R560
Q500        αB                    RNA phosphate bond (�2 bp)                    R585
Y501        αB                    DNA phosphate bond (�5 bp)                    Y586
I505        αB                    DNA phosphate (�6 bp)                         –^d
H539        β5-αE loop            RNA phosphate (�1 bp)                         –^d

^a Based upon structure of HIV-1 RT bound to RNA/DNA substrate (40).

^b Secondary structures refer to the HIV-1 RNase H domain.

^c The base pair with the scissile phosphate is designated �1. Base pairs located 3′ of scissile phosphate in DNA are �1, �2, �3, etc. Base pairs located 5′ are �2, �3, �4, etc.

^d –, The corresponding Mo-MLV residue did not contact the substrate or no corresponding Mo-MLV residue was identified.

RT were too distant to be of any relevance for this particular analysis. HIV-1 RT p66 residues K353 to T365 comprise a loop within the connection domain. This loop is structurally located in the exact region where the E. coli RNase H C-helix resides (also the assumed Mo-MLV RNase H C-helix location) (Fig. 6). Closer analysis of the sequence shows a cluster of three positively charged residues (K353, R356, and R358) that are similar to Mo-MLV RNase H residues R599, R600, and R601 (data not shown). The positively charged residues of HIV-1 RT, however, are more widely spaced, lie closer to the substrate, and approach the major groove from a different angle. HIV-1 RT R358 lies the closest to E. coli RNase H R88 (which corresponds to Mo-MLV RT R601). Unexpectedly, HIV-1 RT H364 and T365 almost exactly superimpose on E. coli RNase H W81 and N84 (which correspond to Mo-MLV RT H594 and I597, respectively). As described previously, threonine was a functional residue when substituted for Mo-MLV RT residue I597 (33). The potential three-dimensional alignment of the HIV-1 connection domain residues R358, H364, and T365 to the predicted Mo-MLV RNase H C-helix residues R601, H594, and I597 strongly suggests that this connection domain region of HIV-1 RT has replaced the substrate binding abilities of the missing HIV-1 RNase H C-helix. HIV-1 RT connection domain residues G359, A360, and H361 have also been identified as binding DNA phosphates (�7) and (�6) (Table 4) (40). These residues, however, did not overlay in space as well with the E. coli RNase H C-helix. The substitution of HIV-1 RT p66 connection domain residues K353 to T365 for the RNase H C-helix explains why an inactive, single-domain HIV-1 RNase H regains nuclease activity with the addition of the connection domain either in cis or in trans (41). Mutational analysis of this region of the HIV-1 RT p66 connection domain may reveal its role in nucleotide binding.

Conclusion. The high-resolution crystal structure of the Mo-MLV RNase H presented here has provided another RNase H variant for global comparison. This reported structure has already aided the final resolution of the full-length Mo-MLV RT structure (9). The surface charge mapping and superimposed modeling have allowed identification of the regions and residues most likely to bind RNA/DNA substrate. Most of these residues correlate with previously reported substrate-binding determinants in other species of RNase H (17, 24, 25, 37, 40). A full understanding of the various retroviral RTs and their substrate interactions may help explain the detailed roles of these enzymes in retroviral replication.

ACKNOWLEDGMENTS

This study was supported by PHS grant R37 CA 30488 from the National Cancer Institute. G.G.G. and C.B. were Associates and W.A.H. and S.P.G. are Investigators of the Howard Hughes Medical Institute.

REFERENCES

1. Ariyoshi, M., D. G. Vassylyev, H. Iwasaki, H. Nakamura, H. Shinagawa, and K. Morikawa. 1994. Atomic structure of the RuvC resolvase: a Holliday junction-specific endonuclease from Escherichia coli. Cell 78:1063–1072.

2. Baltimore, D. 1970. RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature 226:1209–1211.

3. Beese, L. S., and T. A. Steitz. 1991. Structural basis for the 3′-5′ exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism. EMBO J. 10:25–33.

4. Blain, S. W., and S. P. Goff. 1995. Effects on DNA synthesis and translocation caused by mutations in the RNase H domain of Moloney murine leukemia virus reverse transcriptase. J. Virol. 69:4440–4452.

5. Blain, S. W., and S. P. Goff. 1993. Nuclease activities of Moloney murine leukemia virus reverse transcriptase: mutants with altered substrate specificities. J. Biol. Chem. 268:23585–23592.

6. Boyer, P. L., H. Q. Gao, P. Frank, P. K. Clark, and S. H. Hughes. 2001. The basic loop of the RNase H domain of MLV RT is important both for RNase H and for polymerase activity. Virology 282:206–213.

7. Brautigam, C. A., and T. A. Steitz. 1998. Structural principles for the inhibition of the 3′-5′ exonuclease activity of Escherichia coli DNA polymerase I by phosphorothioates. J. Mol. Biol. 277:363–377.

8. Breyer, W. A., and B. W. Matthews. 2000. Structure of Escherichia coli exonuclease I suggests how processivity is achieved. Nat. Struct. Biol. 7:1125–1128.

9. Das, D., and M. M. Georgiadis. 2004. The crystal structure of the monomeric reverse transcriptase from Moloney murine leukemia virus. Structure 12:819–829.

10. Davies, J. F., II, Z. Hostomska, Z. Hostomsky, S. R. Jordan, and D. A. Matthews. 1991. Crystal structure of the ribonuclease H domain of HIV-1 reverse transcriptase. Science 252:88–95.

11. Doublie, S. 1997. Preparation of selenomethionyl proteins for phase determination. Methods Enzymol. 276:523–530.

12. Dyda, F., A. B. Hickman, T. M. Jenkins, A. Engelman, R. Craigie, and D. R. Davies. 1994. Crystal structure of the catalytic domain of HIV-1 integrase: similarity to other polynucleotidyl transferases. Science 266:1981–1986.

13. Evans, D. B., K. Brawn, M. R. Deibel, Jr., W. G. Tarpley, and S. K. Sharma. 1991. A recombinant ribonuclease H domain of HIV-1 reverse transcriptase that is enzymatically active. J. Biol. Chem. 266:20583–20585.

FIG. 6. Stereo-superimposed structures of the RNase H C-helical regions of Mo-MLV ΔC (in blue), E. coli (in green with the C-helix residues W81 to G89 in orange-red), and HIV-1 RNase H (in yellow) and the p66 connection K353 to T365 (in magenta). The RNA/DNA substrate is on the left (with DNA in red and RNA in blue). The figure was made by using MOLSCRIPT and Raster 3D (31, 34).

14. Gao, H. Q., S. G. Sarafianos, E. Arnold, and S. H. Hughes. 1999. Similarities and differences in the RNase H activities of human immunodeficiency virus type 1 reverse transcriptase and Moloney murine leukemia virus reverse transcriptase. J. Mol. Biol. 294:1097–1113.

15. Goedken, E. R., and S. Marqusee. 2001. Co-crystal of Escherichia coli RNase HI with Mn2+ ions reveals two divalent metals bound in the active site. J. Biol. Chem. 276:7266–7271.

16. Hang, J. Q., S. Rajendran, Y. Yang, Y. Li, P. W. In, H. Overton, K. E. Parkes, N. Cammack, J. A. Martin, and K. Klumpp. 2004. Activity of the isolated HIV RNase H domain and specific inhibition by N-hydroxyimides. Biochem. Biophys. Res. Commun. 317:321–329.

17. Hizi, A., S. H. Hughes, and M. Shaharabany. 1990. Mutational analysis of the ribonuclease H activity of human immunodeficiency virus 1 reverse transcriptase. Virology 175:575–580.

18. Holm, L., and C. Sander. 1995. Dali: a network tool for protein structure comparison. Trends Biochem. Sci. 20:478–480.

19. Holm, L., and C. Sander. 1993. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233:123–138.

20. Hostomsky, Z., Z. Hostomska, G. O. Hudson, E. W. Moomaw, and B. R. Nodes. 1991. Reconstitution in vitro of RNase H activity by using purified N-terminal and C-terminal domains of human immunodeficiency virus type 1 reverse transcriptase. Proc. Natl. Acad. Sci. USA 88:1148–1152.

21. Ishikawa, K., M. Okumura, K. Katayanagi, S. Kimura, S. Kanaya, H. Nakamura, and K. Morikawa. 1993. Crystal structure of ribonuclease H from Thermus thermophilus HB8 refined at 2.8 Å resolution. J. Mol. Biol. 230:529–542.

22. Jacobo-Molina, A., J. Ding, R. G. Nanni, A. D. Clark, Jr., X. Lu, C. Tantillo, R. L. Williams, G. Kamer, A. L. Ferris, P. Clark, et al. 1993. Crystal structure of human immunodeficiency virus type 1 reverse transcriptase complexed with double-stranded DNA at 3.0 Å resolution shows bent DNA. Proc. Natl. Acad. Sci. USA 90:6320–6324.

23. Johnson, M. S., M. A. McClure, D. F. Feng, J. Gray, and R. F. Doolittle. 1986. Computer analysis of retroviral pol genes: assignment of enzymatic functions to specific sequences and homologies with nonviral enzymes. Proc. Natl. Acad. Sci. USA 83:7648–7652.

24. Kanaya, S., C. Katsuda-Nakai, and M. Ikehara. 1991. Importance of the positive charge cluster in Escherichia coli ribonuclease HI for the effective binding of the substrate. J. Biol. Chem. 266:11621–11627.

25. Kanaya, S., A. Kohara, Y. Miura, A. Sekiguchi, S. Iwai, H. Inoue, E. Ohtsuka, and M. Ikehara. 1990. Identification of the amino acid residues involved in an active site of Escherichia coli ribonuclease H by site-directed mutagenesis. J. Biol. Chem. 265:4615–4621.

26. Katayanagi, K., M. Miyagawa, M. Matsushima, M. Ishikawa, S. Kanaya, M. Ikehara, T. Matsuzaki, and K. Morikawa. 1990. Three-dimensional structure of ribonuclease H from Escherichia coli. Nature 347:306–309.

27. Katayanagi, K., M. Miyagawa, M. Matsushima, M. Ishikawa, S. Kanaya, H. Nakamura, M. Ikehara, T. Matsuzaki, and K. Morikawa. 1992. Structural details of ribonuclease H from Escherichia coli as refined to an atomic resolution. J. Mol. Biol. 223:1029–1052.

28. Katayanagi, K., M. Okumura, and K. Morikawa. 1993. Crystal structure of Escherichia coli RNase HI in complex with Mg2+ at 2.8 Å resolution: proof for a single Mg2+-binding site. Proteins 17:337–346.

29. Keck, J. L., and S. Marqusee. 1995. Substitution of a highly basic helix/loop sequence into the RNase H domain of human immunodeficiency virus reverse transcriptase restores its Mn2+-dependent RNase H activity. Proc. Natl. Acad. Sci. USA 92:2740–2744.

30. Kohlstaedt, L. A., J. Wang, J. M. Friedman, P. A. Rice, and T. A. Steitz. 1992. Crystal structure at 3.5 Å resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science 256:1783–1790.

31. Kraulis, P. J. 1991. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24:946–950.

32. Lai, L., H. Yokota, L. W. Hung, R. Kim, and S. H. Kim. 2000. Crystal structure of archaeal RNase HII: a homologue of human major RNase H. Struct. Fold Des. 8:897–904.

33. Lim, D., M. Orlova, and S. P. Goff. 2002. Mutations of the RNase H C helix of the Moloney murine leukemia virus reverse transcriptase reveal defects in polypurine tract recognition. J. Virol. 76:8360–8373.

34. Merritt, E. A., and D. J. Bacon. 1997. Raster3D: photorealistic molecular graphics. Methods Enzymol. 277:505–524.

35. Nakamura, H., Y. Oda, S. Iwai, H. Inoue, E. Ohtsuka, S. Kanaya, S. Kimura, C. Katsuda, K. Katayanagi, K. Morikawa, et al. 1991. How does RNase H recognize a DNA · RNA hybrid? Proc. Natl. Acad. Sci. USA 88:11535–11539.

36. Nicholls, A., K. A. Sharp, and B. Honig. 1991. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11:281–296.

37. Nowotny, M., S. A. Gaidamakov, R. J. Crouch, and W. Yang. 2005. Crystal structures of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal-dependent catalysis. Cell 121:1005–1016.

38. Oda, Y., M. Yoshida, and S. Kanaya. 1993. Role of histidine 124 in the catalytic function of ribonuclease HI from Escherichia coli. J. Biol. Chem. 268:88–92.

39. Rice, P., and K. Mizuuchi. 1995. Structure of the bacteriophage Mu transposase core: a common structural motif for DNA transposition and retroviral integration. Cell 82:209–220.

40. Sarafianos, S. G., K. Das, C. Tantillo, A. D. Clark, Jr., J. Ding, J. M. Whitcomb, P. L. Boyer, S. H. Hughes, and E. Arnold. 2001. Crystal structure of HIV-1 reverse transcriptase in complex with a polypurine tract RNA:DNA. EMBO J. 20:1449–1461.

41. Smith, J. S., K. Gritsman, and M. J. Roth. 1994. Contributions of DNA polymerase subdomains to the RNase H activity of human immunodeficiency virus type 1 reverse transcriptase. J. Virol. 68:5721–5729.

42. Smith, J. S., and M. J. Roth. 1993. Purification and characterization of an active human immunodeficiency virus type 1 RNase H domain. J. Virol. 67:4037–4049.

43. Stahl, S. J., J. D. Kaufman, S. Vikic-Topic, R. J. Crouch, and P. T. Wingfield. 1994. Construction of an enzymatically active ribonuclease H domain of human immunodeficiency virus type 1 reverse transcriptase. Protein Eng. 7:1103–1108.

44. Steitz, T. A., and J. A. Steitz. 1993. A general two-metal-ion mechanism for catalytic RNA. Proc. Natl. Acad. Sci. USA 90:6498–6502.

45. Tanese, N., and S. P. Goff. 1988. Domain structure of the Moloney murine leukemia virus reverse transcriptase: mutational analysis and separate expression of the DNA polymerase and RNase H activities. Proc. Natl. Acad. Sci. USA 85:1777–1781.

46. Telesnitsky, A., S. W. Blain, and S. P. Goff. 1992. Defects in Moloney murine leukemia virus replication caused by a reverse transcriptase mutation modeled on the structure of Escherichia coli RNase H. J. Virol. 66:615–622.

47. Telesnitsky, A., and S. P. Goff. 1993. RNase H domain mutations affect the interaction between Moloney murine leukemia virus reverse transcriptase and its primer-template. Proc. Natl. Acad. Sci. USA 90:1276–1280.

48. Telesnitsky, A., and S. P. Goff. 1993. Two defective forms of reverse transcriptase can complement to restore retroviral infectivity. EMBO J. 12:4433–4438.

49. Temin, H. M., and S. Mizutani. 1970. RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature 226:1211–1213.

50. Yang, W., W. A. Hendrickson, R. J. Crouch, and Y. Satow. 1990. Structure of ribonuclease H phased at 2 Å resolution by MAD analysis of the selenomethionyl protein. Science 249:1398–1405.

51. Yang, W., and T. A. Steitz. 1995. Recombining the structures of HIV integrase, RuvC and RNase H. Structure 3:131–134.

52. Zhang, W. H., E. S. Svarovskaia, R. Barr, and V. K. Pathak. 2002. Y586F mutation in murine leukemia virus reverse transcriptase decreases fidelity of DNA synthesis in regions associated with adenine-thymine tracts. Proc. Natl. Acad. Sci. USA 99:10090–10095.
