A two-stranded template-based approach to G·(CA) triad formation: designing novel structural...


doi:10.1006/jmbi.2000.3932 available online at http://www.idealibrary.com on J. Mol. Biol. (2000) 301, 129-146

A Two-stranded Template-based Approach to G·(C-A) Triad Formation: Designing Novel Structural Elements into an Existing DNA Framework

Abdelali Kettani, Gautam Basu, Andrey Gorin, Ananya Majumdar, Eugene Skripkin and Dinshaw J. Patel*

Cellular Biochemistry and Biophysics Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA

Present address: G. Basu, Department of Biophysics, Bose Institute, Calcutta, 700 054, India.

E-mail address of the corresponding author: pateld@mskcc.org

0022-2836/00/010129-18 $35.00/0

We have designed a DNA sequence, d(G-G-G-T-T-C-A-G-G), which dimerizes to form a 2-fold symmetric G-quadruplex in which G(syn)·G(anti)·G(syn)·G(anti) tetrads are sandwiched between all trans G·(C-A) triads. The NMR-based solution structural analysis was greatly aided by monitoring hydrogen bond alignments across N-H···N and N-H···O=C hydrogen bonds within the triad and tetrad, in a uniformly (13C,15N)-labeled sample of the d(G-G-G-T-T-C-A-G-G) sequence. The solution structure establishes that the guanine base-pairs with the cytosine through Watson-Crick G·C pair formation and with adenine through sheared G·A mismatch formation within the G·(C-A) triad. A model of triad DNA was constructed that contains the experimentally determined G·(C-A) triad alignment as the repeating stacked unit.

© 2000 Academic Press

Keywords: G·(C-A) base triad; G-quadruplex; minor groove recognition of G·C pair; uniform 13C,15N-labeled DNA

*Corresponding author

Introduction

The anti-parallel Watson-Crick-paired structure of DNA has played a key role in modern biology as a repository of genetic information and as a sequence-specific target for proteins involved in gene packaging and regulation. Variants of this architecture include the recently elucidated structure of stretched and over-wound DNA fibers (Allemand et al., 1998) where the phosphate groups are in the interior and the bases are splayed out, and an earlier structure of the dinucleotide repeat Z-DNA (Wang et al., 1979) with its unanticipated left-handed zig-zag repeat. It should be noted that inside-out DNA, characteristic of the stretched and over-wound state, had been previously proposed for the architecture of filamentous bacteriophage pf1 DNA bound to coat protein (Liu & Day, 1994) and the recently solved structure of Z-DNA bound to adenine deaminase (Schwartz et al., 1999) has implications for transient left-handed DNA architectures in actively transcribing genes (Herbert & Rich, 1996). These two examples reinforce the premise that unusual DNA architectures may contribute in important ways to gene function and regulation.

An equally interesting situation arises when one considers potential DNA architectures containing non-Watson-Crick pairing alignments. Such structural motifs could arise if individual strands of duplex DNA undergo higher order folding involving unanticipated multi-stranded pairing alignments stabilized by mismatches, triples, triads and tetrads (reviewed by Neidle, 1999). These architectures span the range from two-stranded zipper (Sheppard et al., 1998) and arrowhead (Kettani et al., 1999) motifs, to three-stranded triplexes (reviewed by Radhakrishnan & Patel, 1994; Sun et al., 1996; Wang & Feigon, 1999) and four-stranded quadruplex (reviewed by Rhodes & Giraldo, 1995; Patel et al., 1999) and i-motifs (Gehring et al., 1993; Chen et al., 1994). Even higher order pairing alignments involving pentads (Chaput & Switzer, 1999) and hexads (Kettani et al., 2000) have been reported recently. Many of these multi-stranded architectures are potential candidates for adaptation by DNA oligomers that contain elements of telomeric, centromeric and triplet repeat disease sequences (reviewed by Patel et al., 1999).


130 G·(C-A) Base Triad-containing DNA Architecture

Our laboratory has for some time focused considerable effort towards the identification of base triads, a motif first proposed by Kuryavyi & Jovin (1995, 1996), and postulated by them as building blocks for architectural models of triplet-repeat disease sequences. Both base triads and base triples are formed through co-planar alignment of three bases, except that the former is a two-stranded structure, while the latter is a three-stranded structure. In essence, two adjacent bases from one strand adopt a co-planar platform arrangement and pair with a third base from the partner strand to form a base triad (Kuryavyi & Jovin, 1995, 1996).

The identification of new pairing alignments, be they triads or other motifs, generally involves the screening of scores of sequences for those rare candidates that give NMR spectra containing unusual features and which preferably adopt a single stable conformation with narrow resonances. A rational design-based screening strategy for new alignment motifs could have advantages in both reducing the number of sequences evaluated and the potential generality of inbuilt designed structural principles associated with the successful candidates. We report below on the identification of the G·(C-A) triad (Figure 1(a)) within the folded architecture of a two-stranded d(G-G-G-T-T-C-A-G-G) quadruplex.

The solution structure of a four-stranded quadruplex containing A·(T-A) triads (Figure 1(c)) formed by the truncated Bombyx mori telomeric d(T-A-G-G) sequence reported from our laboratory (Kettani et al., 1997) served as the starting point for the design of the G·(C-A) triad. This novel architecture (Figure 1(d)) contained a pair of stacked G·G·G·G tetrads sandwiched between flanking A·(T-A) triads. This four-stranded quadruplex approach for generating base triads stabilized on a G-tetrad platform has limitations, in that the potential triads are restricted to X·(Y-X) pairing alignments for a general sequence d(Y-X-G-G). Such an approach would not be applicable to the case where all three bases in the triad are different, such as the G·(C-A) triad of interest here.

Our design strategy for potential formation of the G·(C-A) triad alignment required a shift from a four-stranded to a two-stranded architecture, while retaining the template-based approach where the triads would be stabilized by stacking on G·G·G·G tetrads. We therefore designed the sequence d(G-G-G-T-T-C-A-G-G) with the anticipation that, following dimerization, it would adopt the folding topology outlined in Figure 1(b). This topology involves a 2-fold symmetric architecture where a pair of G·G·G·G tetrads are sandwiched between G·(C-A) triads. Interestingly, this approach was successful without any further adjustment of the sequence and validated the concept, at least for the current example, of design-based approaches for the identification of new DNA motifs characterized by novel pairing alignments.

Results

NMR spectrum

The proton NMR spectrum (6.5 to 12.5 ppm) of d(G-G-G-T-T-C-A-G-G) in 100 mM NaCl, 2 mM phosphate, H2O, pH 6.6 at 0 °C is plotted in Figure 2. We observe narrow and well-resolved exchangeable imino (labeled peaks between 10.5 and 12.5 ppm) and amino (labeled peaks between 8.0 and 10.5 ppm) resonances associated with a single conformation. The structural characterization of the folded architecture of the d(G-G-G-T-T-C-A-G-G) oligomer was greatly facilitated by the excellent quality of the NMR spectrum.

Strand stoichiometry

We have monitored the strand stoichiometry of the d(G-G-G-T-T-C-A-G-G) oligomer in 100 mM NaCl, 2 mM phosphate by undertaking a systematic study of the concentration dependence of this folded form relative to the unstructured single-stranded form detectable under NMR slow-exchange conditions at lower concentrations. A log-log plot of the multimer versus single strand concentrations for the d(G-G-G-T-T-C-A-G-G) sample equilibrated for six weeks at ambient temperature is shown in Figure 3 and yields a strand stoichiometry of 1.8 ± 0.3. These data support a two-stranded folded architecture for d(G-G-G-T-T-C-A-G-G) in 100 mM NaCl solution.
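The reasoning behind the log-log plot can be illustrated with a short sketch. For an equilibrium of n single strands S with one multimer M, [M] = K[S]^n, so log[M] = n log[S] + log K and the slope of the log-log plot estimates n. The concentrations and the constant K below are hypothetical values generated for an ideal dimer, not the measured data of Figure 3, and the function name is chosen only for illustration.

```python
import math


def loglog_slope(single, multimer):
    """Least-squares slope of log10(multimer) plotted against log10(single)."""
    xs = [math.log10(s) for s in single]
    ys = [math.log10(m) for m in multimer]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical, noise-free concentrations (mM) for an ideal dimer,
# [M] = K * [S]**2 with an assumed K = 0.5 mM^-1 (not measured values).
K = 0.5
single = [0.2, 0.4, 0.8, 1.6]
multimer = [K * s ** 2 for s in single]

slope = loglog_slope(single, multimer)
print(f"estimated strand stoichiometry: {slope:.2f}")
```

With real data the points scatter about the line, which is why the fitted slope carries an uncertainty such as the ± 0.3 quoted above.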

Proton and phosphorus assignments

We have used a combination of through spaceand through bond methods on unlabeled anduniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G)in 100 mM NaCl solution to complete the base andsugar proton and backbone phosphorus assign-ments.

An expanded NOESY contour plot of d(G-G-G-T-T-C-A-G-G) in 100 mM NaCl solution, H2O buffer at 0 °C is plotted in Figure 4. The observed NOEs involving the exchangeable imino and amino protons have been assigned and are tabulated in the caption to Figure 4 and the exchangeable proton chemical shifts are listed in Supplementary Table S1 (on-line version). We observe pairs of resolved resonances for the amino protons of G2 (chemical shift separation Δδ = 3.12 ppm), G8 (Δδ = 2.36 ppm) and G9 (Δδ = 3.56 ppm), as well as for A7 (Δδ = 2.14 ppm). Such large separations (>2 ppm) are characteristic of pairing alignments involving one hydrogen bonded and one exposed amino proton for G2, A7, G8 and G9 within the folded d(G-G-G-T-T-C-A-G-G) architecture. Interestingly, both amino protons of G3 are downfield shifted to 7.95 and 8.92 ppm (Δδ = 0.97 ppm), suggestive of both amino protons being involved in hydrogen bond formation.
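The separation criterion used above can be written out as a small sketch. The Δδ values are those quoted in the text; the 2 ppm cut-off is the one stated there, while the dictionary and function names are assumptions made for illustration only.

```python
# Amino proton chemical shift separations (ppm) quoted in the text.
# The G3 value is derived from its two reported shifts, 7.95 and 8.92 ppm.
AMINO_SEPARATION_PPM = {
    "G2": 3.12,
    "G8": 2.36,
    "G9": 3.56,
    "A7": 2.14,
    "G3": round(8.92 - 7.95, 2),
}


def has_large_separation(separation_ppm, threshold_ppm=2.0):
    """True when the two amino protons are separated by more than the cut-off,
    the pattern the text associates with one bonded and one exposed proton."""
    return separation_ppm > threshold_ppm


large = sorted(r for r, d in AMINO_SEPARATION_PPM.items() if has_large_separation(d))
small = sorted(r for r, d in AMINO_SEPARATION_PPM.items() if not has_large_separation(d))

print("one bonded, one exposed amino proton:", large)
print("small separation (see text for G3):", small)
```

A small separation alone does not show that both protons are hydrogen bonded; for G3 that conclusion also rests on both resonances being shifted downfield.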

A new HNNH-LR experiment defined by the pulse sequence shown in Figure 5(a) was designed

Figure 1. (a) Schematic of the G3·(C6-A7) triad observed in the structure of the d(G-G-G-T-T-C-A-G-G) quadruplex. (b) Schematic showing the topology of the two-stranded G·(C-A) triad-containing G-quadruplex formed by the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 2 mM phosphate buffer (pH 6.6). The two strands are related by a two-fold axis of symmetry. (c) Schematic of the A·(T-A) triad observed in the structure of the d(T-A-G-G) quadruplex (Kettani et al., 1997). (d) Schematic showing the topology of the four-stranded A·(T-A) triad-containing G-quadruplex formed by the B. mori d(T-A-G-G) sequence (Kettani et al., 1997). Note that one pair of symmetry-related strands has a distinct topology from a second pair of symmetry-related strands. The backbone tracings of individual strands in (b) and (d) are shown in thick lines, the triads are shown as shaded triangles and the G·G·G·G tetrads are shown as "dashed" rectangles. Anti and syn guanine bases are represented by open and shaded symbols.


to correlate exchangeable imino protons with their non-exchangeable H-8 counterparts within individual guanine rings of uniformly (13C,15N)-labeled oligomer samples. The HNNH-LR experiment involves through bond correlation of both imino and H-8 protons to the N9 and N3 nitrogen atoms of individual guanine bases via long-range couplings. Such individual guanine imino to H-8 proton correlations through their intervening nitrogen atoms are plotted in Figure 5(b) for the uniformly (13C,15N)-labeled sample of d(G-G-G-T-T-C-A-G-G). The new HNNH-LR experiment (Figure 5) gives an improved level of sensitivity relative to earlier correlation experiments which directly linked imino and H8 protons within individual guanine rings (Fiala et al., 1996; Simorre et al., 1996; Sklenar et al., 1996).

An expanded NOESY contour plot of d(G-G-G-T-T-C-A-G-G) in 100 mM NaCl solution, 2H2O buffer at 0 °C is plotted in Figure 6(a). This Figure traces the sequential NOEs between base protons and their own and 5′-flanking sugar H1′

Figure 2. Proton NMR spectrum (6.5 to 13.0 ppm) of the d(G-G-G-T-T-C-A-G-G) sequence (5 mM in single strands) in 100 mM NaCl, 2 mM phosphate, H2O buffer (pH 6.6) at 0 °C. The Figure labels assignments of imino protons resonating between 10.5 and 12.5 ppm and amino protons resonating between 7.9 and 10.3 ppm.


protons along the sequence. The G1 and G8 residues adopt syn glycosidic torsion angles based on the strong NOEs between the base and their own sugar H1′ protons (Patel et al., 1982) in the short 50 ms mixing time NOESY contour plot shown in Figure 6(b). The base and sugar non-exchangeable proton chemical shifts are listed in Supplementary Table S1. The chemical shifts of the H2′ protons of C6 (0.47 ppm) and A7 (0.65 ppm) are to high-field while the H2′ proton of G8 (3.18 ppm) is to low-field of the remaining H2′ protons, which are centered about 2.2 ppm.

Individual backbone phosphorus atoms were correlated to their assigned flanking H3′ and H4′ protons located in opposing directions along the backbone. The phosphorus chemical shifts are listed in Supplementary Table S1. The chemical shifts of the phosphorus resonances for the G3-T4 (−3.56 ppm) and A7-G8 (−3.22 ppm) steps are to low-field of the remaining phosphorus resonances, which resonate between −4.2 and −5.2 ppm.

Identification of G1·G2·G8·G9 tetrads

We have identified formation of symmetry related G1·G2·G8·G9 tetrad alignments within the folded architecture of d(G-G-G-T-T-C-A-G-G) based on a combination of through space NOE connectivities and through bond coupling connectivities across N-H···N hydrogen bonds (Dingley & Grzesiek, 1998; Pervushin et al., 1998; Majumdar et al., 1999a,b). Such an approach identifies the Watson-Crick to Hoogsteen edge directionalities between adjacent guanines around G-tetrads. Thus, pairing of the Watson-Crick edge of G8 with the Hoogsteen edge of G9 around the G1·G2·G8·G9 tetrad is defined by NOEs between the amino protons of G8 and the H-8 of G9 (peaks g and g′, Figure 4) in a NOESY experiment and by coupling connectivities between the amino N2 nitrogen of G8 and the H8 of G9 (Figure 7(a)) in a four-bond H(CN)N(H) correlation experiment (Majumdar et al., 1999b). Related NOE (Figure 4) and coupling (Figure 7(a)) connectivities define the remaining Watson-Crick to Hoogsteen edge directionalities across N-H···N hydrogen bonds as G1 to G2, G2 to G8 and G9 to G1 around the G1·G2·G8·G9 tetrad.

We have also recently developed an NMR approach for identification of N-H···O=C hydrogen bonds in uniformly (13C,15N)-labeled nucleic acids (Liu et al., 2000). This approach,

Figure 3. A plot of the concentration of the d(G-G-G-T-T-C-A-G-G) multimer versus the concentration of the d(G-G-G-T-T-C-A-G-G) single strand as monitored by the average of the two most resolved peaks (G1(H-8) and G2(H-8) protons) in the multimeric state and the two most resolved protons in the single-stranded state. The samples were equilibrated for six weeks at ambient temperature prior to recording the spectra at 25 °C. The slope of the curve is 1.8 ± 0.3 based on the best fit to the data.


based on identification of 4JNN couplings in an HN(N)-TOCSY experiment, permitted the identification of N-H···O=C hydrogen bonds around the G1·G2·G8·G9 tetrad on a uniformly 13C,15N-labeled sample of the d(G-G-G-T-T-C-A-G-G) oligomer (Liu et al., 2000).

A G(syn)·G(anti)·G(syn)·G(anti) tetrad alignment is characteristic of a G-quadruplex where adjacent strands are anti-parallel to each other around the quadruplex (Kang et al., 1992; Macaya et al., 1993; Wang et al., 1993; Kettani et al., 1995), while a G(syn)·G(syn)·G(anti)·G(anti) tetrad alignment is characteristic of a G-quadruplex where each strand has a parallel and an anti-parallel neighbor (Smith & Feigon, 1992; Wang & Patel, 1993). The strong NOEs between the base H-8 and their own sugar H-1′ protons for G1 and G8 at short mixing times (Figure 6(b)), characteristic of syn guanine bases, establish formation of G1(syn)·G2(anti)·G8(syn)·G9(anti) tetrads for the folded architecture of d(G-G-G-T-T-C-A-G-G). This requires that the d(G-G-G-T-T-C-A-G-G) quadruplex forms through dimerization of a pair of hairpins involving edgewise (shown schematically in Figure 1(b)) rather than diagonal loops.
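The two tetrad patterns distinguished above can be told apart by walking around the tetrad and counting neighbors that share a glycosidic conformation. The sketch below encodes only the two cases named in the text; the function name and return labels are assumptions made for illustration, and the cyclic order G1, G2, G8, G9 follows the Watson-Crick to Hoogsteen directionalities given in the previous section.

```python
def tetrad_pattern(conformations):
    """Classify a G-tetrad from the syn/anti states listed in cyclic order."""
    n = len(conformations)
    # Count adjacent (cyclic) pairs that share the same glycosidic conformation.
    same = sum(conformations[i] == conformations[(i + 1) % n] for i in range(n))
    if same == 0:
        return "alternating: adjacent strands anti-parallel"
    if same == 2:
        return "paired: each strand has a parallel and an anti-parallel neighbor"
    return "other"


# Cyclic order G1 -> G2 -> G8 -> G9, with G1 and G8 syn as established by NOEs.
observed = {"G1": "syn", "G2": "anti", "G8": "syn", "G9": "anti"}
order = ["G1", "G2", "G8", "G9"]

print(tetrad_pattern([observed[g] for g in order]))
```

Patterns not discussed in the text, such as an all-anti tetrad, fall through to "other" rather than being assigned a topology.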

Identification of G3·(C6-A7) triads

We have identified formation of symmetry related G3·(C6-A7) triad alignments within the d(G-G-G-T-T-C-A-G-G) quadruplex based on a combination of through space NOE connectivities and through bond coupling connectivities across N-H···N hydrogen bonds. We can establish formation of a Watson-Crick G3·C6 base-pair based on the observed NOEs between the imino proton of G3 and the amino protons of C6 (peaks k and k′, Figure 4) and the N-H···N coupling connectivities between the donor imino proton of G3 and the acceptor N3 of C6 (boxed peak in Supplementary Figure S1). We can also establish formation of a sheared G3·A7 mismatch pair where the minor groove edge of G3 pairs with the major groove edge of A7 as shown in Figure 1(a). Specifically, we observe NOEs between the minor groove amino protons of G3 and the major groove amino protons of A7 (peaks i, i′, j and j′, Figure 4) and between the major groove amino protons of A7 and the sugar H1′ (peak s, Figure 4) and H4′ (peak t, Figure 4) protons of G3. In addition, we observe coupling connectivities across the N-H···N hydrogen bonds between the donor NH2 protons of A7 and the acceptor N3 of G3 (boxed peaks in Figure 7(b)) in an HNN-COSY correlation experiment (Dingley & Grzesiek, 1998; Pervushin et al., 1998; Majumdar et al., 1999a,b) and between the amino N2 nitrogen atom of G3 and the H8 proton of A7 (Figure 7(a)) in an H(CN)N(H) correlation experiment (Majumdar et al., 1999b). Note that both amino protons of G3 are hydrogen bonded in the established G3·(C6-A7) triad alignment (Figure 1(a)), consistent with the observed downfield shift of both amino protons as mentioned in an earlier section.

Distance restraints and molecular dynamics calculations

Distance restraints associated with exchangeable protons (total of 82) were qualitatively deduced from NOESY experiments in H2O at two mixing times, while those associated with their non-exchangeable proton counterparts (total of 366) were quantified from NOE buildup curves in 2H2O at five mixing times as outlined in Materials and Methods. The observation of a single set of narrow resonances for the d(G-G-G-T-T-C-A-G-G) sequence at temperatures down to 0 °C was consistent with formation of a folded architecture containing two strands related by a 2-fold symmetry axis. Therefore, non-crystallographic symmetry restraints were used during the computations. All distance restraints were classified as ambiguous during the distance-restrained molecular dynamics computation, since we have not distinguished between intra-strand and inter-strand restraints between proton pairs within the folded architecture. Experimentally defined hydrogen bonding

Figure 4. An expanded NOESY (200 ms mixing time) contour plot correlating NOEs between imino, amino and non-exchangeable protons in the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 2 mM phosphate, H2O buffer (pH 6.6) at 0 °C. The cross peaks a to x are assigned as follows: a, G3(NH-1)-G1(NH-1); b, G2(NH-1)-G9(NH-1); c and c′, G9(NH-1)-G9(NH2); d and d′, G8(NH-1)-G8(NH2); e and e′, G2(NH-1)-G2(NH2); f and f′, G3(NH-1)-G3(NH2); g and g′, G8(NH2)-G9(H-8); h and h′, G9(NH2)-G1(H-8); i, i′, j and j′, G3(NH2)-A7(NH2); k and k′, G3(NH-1)-C6(NH2); l and l′, G1(NH-1)-G3(NH2); m and n, G1(NH-1)-C6(NH2); o, G1(NH-1)-G2(H-8); p, G2(NH-1)-G8(H-8); q, G9(NH-1)-G1(H-8); r and r′, C6(NH2)-T5(H-1′); s, A7(NH2)-G3(H-1′); t, A7(NH2)-G3(H-4′); u and u′, G8(NH2)-A7(H-3′); v, G8(NH2)-A7(H-5′*); w, G8(NH-1)-A7(H-2′); x, G8(NH2)-A7(H-2′).


Figure 5. (a) Pulse sequence for the HNNH-LR experiment correlating (H8,NH) (f2) with (N1,N3,N9) resonances (f1) within individual guanine rings. The spectrum was recorded on a 500 MHz (1H) Varian INOVA spectrometer, equipped with a triple-resonance z-gradient probe. Number of complex points, spectral width and tmax along f1 and f2 were: 66, 2200 Hz, 29.5 ms (f1); 1920, 12,900 Hz, 74 ms (f2). Quadrature detection along f1 was achieved via States-TPPI phase cycling of φ2. Selective pulses on H2O were applied using 4 ms qsneeze (Mz→Mxy) or time-reversed qsneeze (Mxy→Mz) pulses. 96 scans/fid were acquired with a relaxation delay of 1.8 seconds per scan, resulting in a total acquisition time of 7.5 hours. The imino and H8 proton chemical shifts are on the horizontal axis while the N3 and N9 nitrogen chemical shifts are on the vertical axis. The imino protons are correlated with their N3 nitrogen atoms while the H8 protons are correlated with their N3 and N9 nitrogen atoms. Phase cycles: φ1 = x, −x; φ2 = 2(x), 2(−x); φ3 = 4(x), 4(−x); φ4 = 8(x), 8(−x); φR = ABBA where A = x, −x, −x, x; B = −x, x, x, −x. Quadrature detection in the ω1 dimension was achieved by States-TPPI phase cycling of φ2. (b) Contour plot of an HNNH-LR experiment establishing connectivities between imino and H-8 protons within individual guanine (G1, G2, G3, G8 and G9) rings through correlation with their N3 and N9 ring atoms in the uniformly 13C,15N-labeled d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 2 mM phosphate, H2O buffer (pH 6.6) at 0 °C.


Figure 6. (a) An expanded NOESY (250 ms mixing time) contour plot correlating NOEs between the base (7.0 to 8.4 ppm) and the sugar (5.3 to 6.2 ppm) protons in the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 10 mM phosphate, 2H2O buffer (pH 6.6) at 0 °C. The cross peaks a to g are assigned as follows: a, G9(H-8)-C6(H-1′); b, T5(H-6)-C6(H-5); c, G1(H-8)-C6(H-5); d, A7(H-8)-T4(H-1′); e, G1(H-8)-G2(H-1′); f, A7(H-2)-A7(H-1′); g, A7(H-2)-G8(H-1′). (b) An expanded NOESY (50 ms mixing time) stacked plot correlating NOEs between the base (7.0 to 8.4 ppm) and the sugar (5.3 to 6.2 ppm) protons in the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 2 mM phosphate, 2H2O buffer (pH 6.6) at 0 °C. The strong NOE cross peaks are between the H-6 and H-5 protons of C6 (labeled C6*) and between the H-8 and H-1′ protons of G1 and G8 (labeled G1 and G8).


alignments from NOE patterns and scalar couplings discussed above were used to restrain the symmetry related G3·(C6-A7) triads and G1·G2·G8·G9 tetrads, with the folding models retaining these hydrogen bonding alignments during the computations.
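The ambiguous treatment of distance restraints described above can be illustrated with a short sketch. The text does not state which averaging scheme was applied; the r^-6 summation shown here is one common convention in restrained molecular dynamics and is assumed purely for illustration, as are the function name and the example distances.

```python
def effective_distance(distances):
    """Effective distance of an ambiguous restraint by r^-6 summation.

    Every candidate proton pair (for example the intra-strand and the
    inter-strand pair in a symmetric dimer) contributes to one NOE.
    """
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


# Hypothetical candidate distances in angstroms (not taken from the structure).
intra_strand = 3.0
inter_strand = 8.0

d_eff = effective_distance([intra_strand, inter_strand])
print(f"effective distance: {d_eff:.3f} A")
```

Because of the steep r^-6 weighting, the shortest candidate dominates, so the restraint can be satisfied by whichever of the intra-strand or inter-strand pairs is actually close in the folded structure.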

The solution structure of the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl was solved by

Figure 7. Measurement of coupling connectivities across N-H···N hydrogen bonds in the uniformly 13C,15N-labeled d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl, 2 mM phosphate, H2O buffer (pH 6.6). (a) An H(CN)N(H) (Majumdar et al., 1999b) contour plot recorded at 20 °C correlating four bond coupling connectivities between guanine amino N2 nitrogen atoms and guanine/adenine H8 protons across N-H···N hydrogen bonded pairs. The labeled peaks establish G8(N2)·G9(H8), G9(N2)·G1(H8), G1(N2)·G2(H8), G2(N2)·G8(H8) and G3(N2)·A7(H8) coupling connectivities. The H8 proton chemical shifts are on the horizontal scale while the N2 nitrogen chemical shifts are on the vertical scale. (b) An HNN-COSY (Dingley & Grzesiek, 1998; Pervushin et al., 1998; Majumdar et al., 1999a,b) contour plot recorded at 0 °C correlating two bond coupling connectivities across N-H···N hydrogen bonded pairs. The boxed peaks establish coupling connectivities between the A7(N6 amino proton donors)·G3(N3 acceptor) pair. The amino proton chemical shifts are on the horizontal scale while the N3 nitrogen chemical shifts are on the vertical scale.


molecular dynamics computations guided by hydrogen bonding and NOE distance restraints. A total of 60 starting structures were generated for the d(G-G-G-T-T-C-A-G-G) 9-mer segment as sets of pairs of randomized chains separated by space intervals of 50 Å. The protocol outlined in Materials and Methods involved initial torsion space dynamics at 20,000 K followed by Cartesian space dynamics at 300 K. A subset of 18 distance-refined structures of the d(G-G-G-T-T-C-A-G-G) quadruplex were identified based on a combination of low NOE energies and fewest NOE violations.

Intensity restraints and NOE back calculations

The subset of 18 converged distance-refined structures were next refined against the non-exchangeable proton NOE intensities associated with NOESY spectra recorded at five mixing times. These computations utilized a molecular dynamics with back calculation protocol outlined in Materials and Methods. The NOE violations, deviations from covalent geometry and pairwise r.m.s.d. values for the ten lowest energy intensity-refined structures of the d(G-G-G-T-T-C-A-G-G) quadruplex are listed in Table 1.
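A pairwise r.m.s.d. of the kind reported for the refined ensemble can be sketched as follows. The sketch assumes the structures have already been superimposed and share the same atom order; the coordinates are toy values, not the refined structures, and the use of a population standard deviation for the spread is an assumption, since the text does not say how the uncertainty was computed.

```python
import itertools
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed structures."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(structures):
    """Mean and spread of the r.m.s.d. over all unique pairs of structures."""
    values = [rmsd(a, b) for a, b in itertools.combinations(structures, 2)]
    return statistics.mean(values), statistics.pstdev(values)


# Toy ensemble: three "structures" of two atoms each (angstroms, hypothetical).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]

mean_rmsd, spread = pairwise_rmsd(toy)
print(f"pairwise r.m.s.d.: {mean_rmsd:.2f} +/- {spread:.2f} A")
```

For ten structures the same function averages over 45 pairs; a real calculation would first superimpose each pair, a step omitted here for brevity.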

Structural features

A view of the ten superpositioned lowest energy intensity refined structures of the d(G-G-G-T-T-C-A-G-G) quadruplex is shown in stereo in Figure 8(a). The sugar-phosphate backbones of individual symmetry-related strands are colored in orange and green, with the sequentially stacked loop thymine bases, G·(C-A) triads and G·G·G·G tetrads colored in magenta, yellow and cyan, respectively. The directionalities of the individual strands in the quadruplex are shown in Figure 8(b), while a space filling view of this architecture is shown in Figure 8(c).

A stick representation of one symmetric half of the d(G-G-G-T-T-C-A-G-G) quadruplex is shown in Figure 9(a). The pairing alignments of the G3·(C6-A7) triad and the G1·G2·G8·G9 tetrad in this representative structure are shown in Figure 9(b) and (c), respectively.

The stacking geometry of the T4 and T5 bases (in magenta) on the G3·(C6-A7) triad (in yellow) is shown in Figure 10(a), while that between the G3·(C6-A7) triad and the G1·G2·G8·G9 tetrad (cyan) is shown in Figure 10(b). The overlap geometry between adjacent symmetry related G1·G2·G8·G9 tetrads (cyan and gray) is shown in Figure 10(c).

Table 1. NMR restraints and structural statistics for the intensity refined structures of the d(G-G-G-T-T-C-A-G-G) quadruplex

A. NMR restraints
Distance restraints                                  Exchangeable    Non-exchangeable
  Intra-residue (a)                                  8               201
  Sequential (i, i + 1) (a)                          24              118
  Long range ≥ (i, i + 2) (a)                        50              47
Other restraints
  Hydrogen bonding restraints (b)                    104
  Glycosidic torsion angle restraints (c)            18
Intensity restraints
  Non-exchangeable protons (per mixing time)         365
B. Structural statistics in complex following intensity refinement
NOE violations
  Number >0.2 Å                                      10.9 ± 1.7
  Maximum violations                                 0.34 ± 0.04 Å
  r.m.s.d. of violations                             0.051 ± 0.0007 Å
NMR R factor (R1/6)                                  0.061 ± 0.0004
Deviations from ideal covalent geometry
  Bond lengths                                       0.018 ± 0.0004 Å
  Bond angles                                        2.62 ± 0.06°
  Impropers                                          0.51 ± 0.02°
Pairwise all heavy atom r.m.s.d. values (ten refined structures)
  All heavy atoms                                    0.42 ± 0.14 Å

(a) All distance restraints were set as ambiguous between intra and inter-strand contributions.
(b) These hydrogen bonding restraints are based on experimental NOE and coupling constant data.
(c) Residues G2, G3, T4, T5, C6, A7 and G9 were restrained to χ values in the 210(±40)° range, characteristic of experimentally identified anti glycosidic torsion angles, while G1 and G8 were restrained to χ values in the 65° ± 25° range, characteristic of experimentally identified syn glycosidic torsion angles.

Discussion

We have now extended our previous design strategy (Kettani et al., 1997) for the generation of stable base triads through stacking on G·G·G·G tetrad templates. The more cumbersome and limited earlier approach of sandwiching G·G·G·G tetrads between base triads in a four-stranded quadruplex (Figure 1(d)) (Kettani et al., 1997) has been successfully replaced by its simpler two-stranded quadruplex counterpart (Figure 1(b)) (this study), which has the potential for accommodating a greater range of base triads.

Quality of NMR spectra and structures

The robustness of elucidated DNA solution structures is directly dependent on the number and distribution of the input restraints, which in turn are dependent on the quality of the NMR spectra. The proton NMR spectrum of the d(G-G-G-T-T-C-A-G-G) quadruplex in 100 mM NaCl is of exceptional quality (Figure 2). Furthermore, the availability of uniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G) oligomer has both aided the resonance identification and provided unambiguous assignments of hydrogen bond alignments around the G·(C-A) triad and G·G·G·G tetrad (Figure 7). The available distance, torsional and hydrogen-bonding restraints have greatly facilitated the structural characterization and yielded intensity refined structures with low pairwise r.m.s.d. values (Table 1).

Figure 8. (a) Superpositioned stereo views of ten intensity refined structures of the G·(C-A) triad-containing d(G-G-G-T-T-C-A-G-G) quadruplex. The backbones of individual strands are in orange and green with phosphate oxygen atoms removed for clarity. Symmetry related G1·G2·G8·G9 tetrads are cyan, G3·(C6-A7) triads are yellow, T4 and T5 loop residues are magenta. (b) Stick depiction of the backbone and bases and (c) space-filling views of a representative intensity refined structure of the G·(C-A) triad-containing d(G-G-G-T-T-C-A-G-G) quadruplex. Individual strands are in yellow and cyan with bases in white and backbone phosphorus atoms in red.

DNA quadruplex topology

There are key fundamental differences in the folding topology of the architectures involving G·G·G·G tetrads sandwiched between base triads in the four-stranded quadruplex reported previously (Figure 1(d)) and its two-stranded counterpart reported here (Figure 1(b)). The topology of the truncated B. mori d(T-A-G-G) four-stranded quadruplex shown in Figure 1(d) has G(syn)·G(syn)·G(anti)·G(anti) tetrad alignments and individual strands which have both parallel and anti-parallel neighbors. By contrast, the topology of the two-stranded d(G-G-G-T-T-C-A-G-G) quadruplex shown in Figure 1(b) has G(anti)·G(syn)·G(anti)·G(syn) tetrad alignments and adjacent strands that run anti-parallel to each other around the quadruplex. Thus, the concept of triads stabilized through stacking on G·G·G·G tetrads can be accommodated within different G-quadruplex folding architectures.

G·(C-A) triad

The pairing alignment of the G3·(C6-A7) triad is shown in Figure 1(a). Note that G3 interacts with the C6-A7 base platform through formation of a Watson-Crick G·C pair with C6 and a sheared G·A mismatch pair with A7. Thus, all donor and acceptor atoms of the purine ring of G3, except N7, are involved in hydrogen bond formation within the G·(C-A) triad. Note that the same recognition principles and alignments are used to generate the A·(T-A) (Figure 1(c)) and G·(C-A) (Figure 1(a)) triads. In each case, a purine ring interacts with a pyrimidine-purine platform by forming a Watson-Crick pair with the 5′ pyrimidine of the platform and a sheared mismatch pair with the 3′ purine of the platform. The use of a sheared G·A mismatch to form the G·(C-A) triad reinforces the repeated observation of this mismatch alignment in DNA (Li et al., 1991; Chou et al., 1997; Lin & Patel, 1997; Lin et al., 1998) and RNA (Heus & Pardi, 1991; Pley et al., 1994; Szewczak & Moore, 1995; Cai et al., 1998; Legault et al., 1998) motifs.

Figure 9. (a) A stick representation of one symmetric half of a representative intensity refined structure of the G·(C-A) triad-containing d(G-G-G-T-T-C-A-G-G) quadruplex. The color code is the same as outlined in the caption to Figure 8(a). (b) Pairing alignment of the G3·(C6-A7) triad. The sugar ring oxygen atoms are indicated by white balls. Note that G3 pairs with both C6 and A7 of the C6-A7 platform step. (c) Pairing alignment of the G1·G2·G8·G9 tetrad.

The G·(C-A) triad identified here adds to a growing number of base triad alignments (Kuryavyi & Jovin, 1995, 1996) and associated base platforms (Cate et al., 1996) determined recently in DNA (Kettani et al., 1997, 2000; Kuryavyi et al., 2000) and RNA (Zimmerman et al., 1997; Kalurachchi et al., 1997; Conn et al., 1999; Wimberly et al., 1999; Sussman et al., 2000) studies.

Stacking interactions

Two bases, T4 and T5, are involved in chain reversal in the solution structure of the d(G-G-G-T-T-C-A-G-G) quadruplex. These two thymine bases are approximately co-planar but are not hydrogen bonded to each other. The well-defined T4 and T5 bases are positioned over the G·(C-A) triad such that there is partial stacking of T4 over A7 and T5 over G3 (Figure 10(a)).

The stacking of the G3·(C6-A7) triad over the G1·G2·G8·G9 tetrad is shown in Figure 10(b). There is stacking of the purine rings of adjacent G2 and G3 residues and of adjacent A7 and G8 residues, while the pyrimidine ring of C6 is positioned over the G1·G9 mismatch pair. Note that the sugar ring of C6 is positioned over the purine ring of G9 and the sugar ring of A7 is positioned over the purine ring of G8 (Figure 10(b)), which could explain the upfield shift of the H2′ protons of C6 and A7 observed experimentally (Table S1).

Figure 10. Base stacking overlap patterns in a representative intensity-refined structure of the G·(C-A) triad-containing d(G-G-G-T-T-C-A-G-G) quadruplex. (a) Stacking of the T4 and T5 loop bases in magenta over the G3·(C6-A7) triad in yellow. (b) Stacking of the G3·(C6-A7) triad in yellow over the G1·G2·G8·G9 tetrad in cyan. (c) Stacking between adjacent G1·G2·G8·G9 tetrads in cyan and gray.


The symmetry-related adjacent G1·G2·G8·G9 tetrads stack on each other through their five-membered purine rings as shown in Figure 10(c), similar to that reported for G-quadruplexes where adjacent strands run in opposite directions (Kang et al., 1992; Macaya et al., 1993; Wang et al., 1993; Kettani et al., 1995).

Triad DNA model

We have generated a model of triad DNA containing four stacked G·(C-A) triads (Figure 11(a)). For the A-G step, there is primarily cross-strand stacking of the six-membered rings of adenine bases on partner strands (Figure 11(b)). For the G-C step, there is partial cross-strand stacking of the six-membered rings of guanine bases on partner strands (Figure 11(c)). The limited stacking between adjacent G·(C-A) triads at both A-G (Figure 11(b)) and G-C steps (Figure 11(c)) in the model suggests that the potential formation of stable G·(C-A)-containing triad DNAs may require protein-binding partners.

Biological implications

The majority of published research on DNA architectures and their interaction with regulatory proteins has focused on Watson-Crick aligned anti-parallel duplexes. By contrast, there has been limited information on the potential role of non-B-DNA structures in modulating biological events. One exception has been Z-DNA, where accumulating evidence gathered over the 20 years following its discovery in 1979 implicates potential formation of this left-handed architecture at negative super-helical sites behind a transcribing RNA polymerase (reviewed by Herbert & Rich, 1996). A double-stranded RNA adenosine deaminase has also been shown to target Z-DNA, primarily through recognition of its zig-zag left-handed sugar-phosphate backbone (Schwartz et al., 1999).

One goal of our laboratory has been to identify additional non-B-DNA architectures by focusing on sequences that cannot align through Watson-Crick pair formation. Our efforts are based on an underlying anticipation that the resulting novel non-B-DNA folds, with potential unique recognition elements, should provide new insights into nucleic acid architecture and recognition. Much of our effort has focused on purine-rich sequences because of their propensity to form multi-stranded DNA architectures. Such purine-rich segments are associated with telomeric, centromeric and triplet repeat disease sequences and their multi-stranded architectures could be potentially stabilized and functionally modulated by interaction with complementary binding surfaces.

Here, we have addressed the issue of designing sequences that could readily fold to generate multi-stranded DNA architectures involving base triad formation. We set out to stabilize triad alignments through stacking with well-characterized G·G·G·G tetrad alignments. Our current efforts focused on the generation and characterization of the G·(C-A) triad, which has been postulated to be a potential building block for a triad DNA fold for (CAG)n sequences (Kuryavyi & Jovin, 1995, 1996; also see review by Mitas, 1997, on alternate folding topologies for triplet repeat sequence architectures). We have achieved this goal here, on the d(G-G-G-T-T-C-A-G-G) sequence, by establishing formation of isolated G·(C-A) triads through stacking on G·G·G·G tetrad templates, as shown schematically in Figure 1(b). The newly identified G·(C-A) triad alignment served as a building block for the generation of a model of triad DNA containing stacked G·(C-A) triads (Figure 11).

Figure 11. (a) Triad DNA model containing four stacked G·(C-A) triads. Such an anti-parallel two-stranded DNA architecture was modeled for the d(CAG)2 sequence. The sugar ring oxygen atoms are represented by balls. (b) Stacking between adjacent triads (in orange and magenta) at A-G steps emphasizing the cross-strand overlap between the six-membered rings of adenine residues. (c) Stacking between adjacent triads (in magenta and white) at G-C steps emphasizing the partial cross-strand overlap between the six-membered rings of guanine residues.

The pairing alignment of the G·(C-A) triad determined here establishes firstly that the major groove edge of the Watson-Crick G·C base-pair along with the Watson-Crick and minor groove edges of the A residue are available for further recognition by other nucleic acids and/or proteins. Secondly, the potential exists for replacing the G·(C-A) triad by other triad alignments within the context of the folding topology shown in Figure 1(b). Thirdly, opportunities exist within the template-based approach for the design and testing of sequences capable of accommodating two or more stacked triads.

We know a great deal about base triple pairing rules for the recognition of duplex DNA by oligonucleotide third strand alignment within the major groove (reviewed by Radhakrishnan & Patel, 1994; Sun et al., 1996; Wang & Feigon, 1999). By contrast, there is no information on base triple pairing rules for potential oligonucleotide-based targeting of the DNA minor groove. The latter process is of considerable biological interest, since RecA-mediated strand exchange between homologous DNA sequences involves oligonucleotide-based minor groove strand invasion of DNA (reviewed by Takahashi & Norden, 1994; Kurumizaka & Shibata, 1996). The pairing alignments of the G·(C-A) triad (Figure 1(a)) and A·(T-A) triad (Figure 1(c)) establish that A (of the third strand) in an anti orientation and aligned anti-parallel to the purine (of the duplex) can target the minor groove of a Watson-Crick G·C (or A·T) base-pair through sheared G·A (or A·A) mismatch formation.

Materials and Methods

Preparation of unlabeled and uniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G)

The unlabeled d(G-G-G-T-T-C-A-G-G) sequence was synthesized on a 10 μmol scale on an Applied Biosystems 392 DNA synthesizer using solid-phase β-cyanoethylphosphoramidite chemistry and was subsequently purified by high pressure liquid chromatography (HPLC). The DNA oligomers in 5 ml volume were dialyzed against five changes of H2O, followed by dialysis against 100 mM NaCl, 2 mM Na-phosphate buffer (pH 6.6) and final dialysis against 10 mM NaCl, 0.2 mM phosphate buffer (pH 6.6), and were then lyophilized.

A modified version of the Zimmer & Crothers (1995) procedure as described previously (Kettani et al., 1999, 2000) was used for the enzymatic synthesis of uniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G). Uniformly (13C,15N)-labeled dNTPs were prepared in-house as described (Kettani et al., 2000) and used as building blocks during the in vitro polymerization reaction catalyzed by Moloney murine leukemia virus (MMLV) reverse transcriptase (Gibco-BRL). The uniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G) 9-mer was separated from the unlabeled 25-mer template using 22 % denaturing polyacrylamide gel electrophoresis. The DNA 9-mer bands were eluted from the gel and purified as described above for the non-labeled samples.

NMR data collection and processing

NMR data on the d(G-G-G-T-T-C-A-G-G) 9-mer in H2O and 2H2O buffer (100 mM NaCl, 2 mM phosphate (pH 6.6)) were collected on a Varian 600 MHz Unity Inova NMR spectrometer. Proton assignments are based on NOESY, COSY, TOCSY and HNNH-LR experiments. Data sets were processed and analyzed using the FELIX program (Molecular Simulations). Scalar couplings across N-H···N hydrogen bonds in uniformly (13C,15N)-labeled d(G-G-G-T-T-C-A-G-G) in 100 mM NaCl were monitored in HNN-COSY (Dingley & Grzesiek, 1998; Pervushin et al., 1998; Majumdar et al., 1999a) and H(CN)N(H)-COSY (Majumdar et al., 1999b) contour plots using pulse sequences as described.

Distance restraints

The distances between non-exchangeable protons were estimated from the buildup curves of cross-peak intensities in NOESY spectra at five different mixing times (50, 100, 150, 200 and 300 ms) in 2H2O and given bounds of ±30 % with distances referenced relative to the sugar H1′-H2″ distance of 2.20 Å. Exchangeable proton restraints are based on NOESY data sets at two mixing times (60 and 200 ms) in H2O. Cross peaks involving exchangeable protons were classified as strong (strong intensity at 60 ms), medium (weak intensity at 60 ms) and weak (observed only at a mixing time of 200 ms) and proton pairs were then restrained respectively to distances of 3.0(±0.9) Å, 4.0(±1.2) Å and 6.0(±1.8) Å. Since the experimental NMR data are consistent with a two-stranded motif containing a 2-fold symmetry axis, non-crystallographic symmetry restraints were imposed on all heavy atoms.
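The restraint bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names and data layout are hypothetical, and the sixth-root scaling for non-exchangeable protons is the standard isolated spin-pair approximation, assumed here because the text states only that distances were estimated from buildup curves and referenced to the 2.20 Å sugar H1′-H2″ distance.

```python
from dataclasses import dataclass

# Target distances (in Angstrom) for exchangeable-proton cross peaks, from the text.
EXCHANGEABLE_TARGETS = {"strong": 3.0, "medium": 4.0, "weak": 6.0}
FRACTIONAL_BOUND = 0.30  # the +/-30 % bound quoted in the text


@dataclass(frozen=True)
class DistanceRestraint:
    target: float
    lower: float
    upper: float


def classify_exchangeable(intensity_60ms, seen_at_200ms):
    """Classify a cross peak; intensity_60ms is 'strong', 'weak' or None."""
    if intensity_60ms == "strong":
        return "strong"
    if intensity_60ms == "weak":
        return "medium"
    if seen_at_200ms:
        return "weak"
    raise ValueError("cross peak not observed at either mixing time")


def restraint_for(target):
    """Return the target distance with symmetric +/-30 % bounds."""
    delta = FRACTIONAL_BOUND * target
    return DistanceRestraint(target, target - delta, target + delta)


def distance_from_buildup(rate, reference_rate, reference_distance=2.20):
    """Assumed isolated spin-pair scaling: r = r_ref * (rate_ref / rate)**(1/6)."""
    return reference_distance * (reference_rate / rate) ** (1.0 / 6.0)
```

For instance, `restraint_for(4.0)` yields bounds of about 2.8 to 5.2 Å, which agrees with the 4.0(±1.2) Å quoted for medium cross peaks.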

Structure calculations

The structure of the d(G-G-G-T-T-C-A-G-G) sequence in 100 mM NaCl was determined by molecular dynamics (MD)-simulated annealing computations driven by NOE distance and hydrogen bonding restraints using the X-PLOR package, version 3.8 (Brünger, 1992). At the initial stage of the refinement, torsional molecular dynamics was undertaken at high temperature. The molecules were equilibrated at 20,000 K (30,000 steps over 3 ps) and then cooled very slowly to 1000 K (40,000 steps over 20 ps). The potential energy function included a repulsive force field, NOE and hydrogen bond distance restraints, glycosidic bond (χ) dihedral angle restraints and a non-crystallographic symmetry potential. The force constant for NOE distance restraints was maintained at a value of 30 kcal mol−1 Å−2, while for hydrogen bond restraints the value was 50 kcal mol−1 Å−2. All NOE distance restraints were considered as ambiguous and treated with the "sum" averaging option (Nilges et al., 1991; Nilges, 1995). Dihedral angle restraints (220(±50)°, with force constant of 50 kcal mol−1 rad−2) were imposed on glycosidic torsion angles for the residues G2, G3, T4, T5, C6, A7 and G9 shown experimentally to adopt anti conformations. Dihedral angle restraints of 65(±25)° were imposed on glycosidic torsion angles for the residues G1 and G8 shown experimentally to adopt syn conformations. The force constant for non-crystallographic symmetry was maintained at 30 kcal mol−1 Å−2.
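To make the role of the bounds and force constants concrete, the sketch below evaluates a simple flat-bottomed harmonic penalty for a glycosidic torsion and for an NOE distance. It is an assumption-laden illustration: the functional forms actually used by X-PLOR are not given in the text and may differ, and all function names are hypothetical. Only the numerical values (220(±50)° and 65(±25)° windows, 50 kcal mol−1 rad−2, 30 kcal mol−1 Å−2, 30 % bounds) come from the text.

```python
import math


def angular_difference_deg(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def dihedral_energy(chi_deg, target_deg, half_width_deg, k_rad=50.0):
    """Zero inside target +/- half_width, harmonic (kcal/mol) outside it."""
    excess_deg = abs(angular_difference_deg(chi_deg, target_deg)) - half_width_deg
    if excess_deg <= 0.0:
        return 0.0
    return k_rad * math.radians(excess_deg) ** 2


def noe_energy(distance, target, k=30.0, fraction=0.30):
    """Zero within +/-30 % of the target distance, harmonic (kcal/mol) outside."""
    excess = abs(distance - target) - fraction * target
    return k * excess * excess if excess > 0.0 else 0.0


# An anti residue (220 +/- 50 degrees) pays no penalty at chi = -140 degrees,
# because -140 and 220 degrees describe the same torsion.
anti_penalty = dihedral_energy(-140.0, 220.0, 50.0)
# A syn-like torsion of 65 degrees is heavily penalized under the anti restraint.
syn_under_anti_penalty = dihedral_energy(65.0, 220.0, 50.0)
```

The wrap-around in `angular_difference_deg` matters here because the anti window of 170° to 270° straddles the ±180° boundary of the usual torsion convention.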

These computations were followed by lower temperature Cartesian space molecular dynamics guided by the hydrogen bonding and NOE distance restraints with changes in the potential energy function: the repulsive force field was replaced with Lennard-Jones potentials and planarity restraints were included for triad and tetrad planes with low weights of 1.5 kcal mol−1 Å−2 and 3.0 kcal mol−1 Å−2, respectively. During this stage of the dynamics, the structures were further cooled from 1000 K to 300 K (20,000 steps over 10 ps) and minimized until the gradient of energy was less than 1.0 kcal mol−1.

The refinement protocol started from 60 different initial structures. The initial structures were generated as sets of two chains, each nine nucleotides long, randomized for all dihedral angles, and separated by space intervals of 50 Å. The convergence rate was good for dynamics computations guided by hydrogen bonding restraints defining the G3·(C6-A7) triad and G1·G2·G8·G9 tetrad alignments: 18 low energy structures out of 60 emerged with the same fold and pairwise r.m.s.d. values less than 1 Å between members of the group. Non-converged structures were not folded at all and were separated from the converged group by large gaps (in total more than 1500 kcal) in all components of the potential energy (van der Waals, NOE violations, covalent geometry).
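The convergence criterion above (pairwise r.m.s.d. below 1 Å within the group) can be checked with a short script. This is a minimal sketch under a stated assumption: the structures are taken to be already superimposed, each given as a list of (x, y, z) heavy-atom coordinates in the same atom order. The superposition step itself is omitted, and the function names are hypothetical.

```python
import itertools
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate sets."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def pairwise_rmsd_values(structures):
    """All unique pairwise r.m.s.d. values within an ensemble."""
    return [rmsd(a, b) for a, b in itertools.combinations(structures, 2)]


def pairwise_rmsd_stats(structures):
    """Mean and (population) standard deviation of the pairwise values."""
    values = pairwise_rmsd_values(structures)
    return statistics.mean(values), statistics.pstdev(values)


def is_converged(structures, cutoff=1.0):
    """True if every pair in the ensemble lies below the cutoff (in Angstrom)."""
    return all(value < cutoff for value in pairwise_rmsd_values(structures))
```

An ensemble of 18 structures gives 18 × 17 / 2 = 153 unique pairs to compare.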

The 18 converged distance-refined structures corresponding to the folding topology shown schematically in Figure 1(b) were used as the starting point for subsequent X-PLOR based molecular dynamics computations with back-calculation of the NOESY spectra. The relaxation matrix was set up for the non-exchangeable protons, with the exchangeable imino and amino protons exchanged for deuterons. A total of 1830 non-exchangeable intensity values from NOESY data sets at five mixing times in 2H2O buffer (366 non-exchangeable intensities per mixing time) were included with force constant of 200 kcal mol−1. Planarity restraints were lifted at this stage while distance restraints were retained with 30 % bounds and the same masses as before. The molecular dynamics computations were initiated by heating to 1000 K, followed by slow cooling over 8000 steps to 300 K. The NMR R factor (R1/6) improved from the initial value of 11 % to 6 %, while retaining structural convergence and stereochemistry. The pairwise r.m.s.d. among the ten lowest energy intensity-refined structures was 0.42(±0.14) Å.
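The notation R1/6 suggests the sixth-root residual commonly used in relaxation-matrix refinement. The text does not give the formula, so the definition used in the sketch below, the sum of |Iobs^(1/6) − Icalc^(1/6)| divided by the sum of Iobs^(1/6), is an assumption, as is the function name.

```python
def r_sixth_root(observed, calculated):
    """Assumed sixth-root R factor between observed and back-calculated intensities."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("intensity lists must be non-empty and of equal length")
    if min(observed) < 0 or min(calculated) < 0:
        raise ValueError("intensities must be non-negative")
    power = 1.0 / 6.0
    numerator = sum(abs(o ** power - c ** power) for o, c in zip(observed, calculated))
    denominator = sum(o ** power for o in observed)
    return numerator / denominator


# Five mixing times with 366 intensities each give the 1830 values quoted in the text.
n_intensities = 5 * 366
```

Taking the sixth root compresses the dynamic range of the intensities, so that weak cross peaks from long distances contribute on a scale comparable to strong ones.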

Modeling of triad DNA

We have used the identified alignment of the G·(C-A) triad as a building block to model a d(CAGCAG)·d(CAGCAG) triad DNA helix composed of four stacked G·(C-A) triads. The four triads in the initial models were positioned to allow a right-handed stereochemistry for the linking sugar-phosphate backbones. Ten initial models were then refined by two methods: (1) Cartesian molecular dynamics and (2) minimization in helical parameter space. The potential energy function included Lennard-Jones potentials and electrostatic energy with reduced phosphate charges. The regular triad DNA consists of two distinct steps: for A-G triad steps, the sugar-phosphate backbone connects adenine and guanine bases from adjacent triads, while G-C steps connect guanine and cytosine bases from adjacent triads. Using helical parameters we were able to consider only significant degrees of freedom, four for each of two unique steps: helical Twist and Dx, Dy, Dz (three translations along the Ox, Oy and Oz directions).

The definitions of the helical twist and displacement (Dx, Dy and Dz) parameters were similar to those of the CompDNA helical parameters defined for duplex DNA (Gorin et al., 1995; Olson et al., 1998). A reference coordinate frame was obtained for each triad plane with the pivot point at equidistant positions from three C1′ atoms (one from each residue forming the triad), and the Oz axis as an average vector of three normal vectors (one for each triad base). Normal vectors for the bases were selected as collinear, so that the Oz axes of all four triads are collinear. The Oy axis was defined to be parallel to the line connecting the C1′ atoms of guanine and cytosine and pointing from strand 1 towards strand 2. Under such a definition for a right-handed coordinate system, the Ox axis is always directed into the major groove of the G·C pair from the corresponding triad. The twist angle is measured as the angle between the Oy axes of adjacent triads, and the translational parameters Dx, Dy and Dz are the components of the translational vector to the reference frame of the next triad.
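The frame construction described above can be written out explicitly. The following is a minimal sketch, not the CompDNA implementation: it takes the pivot as the circumcenter of the three C1′ atoms, assumes that the guanine lies on strand 1 so that Oy points from the guanine C1′ to the cytosine C1′, removes any component of Oy along Oz to keep the frame orthogonal, and reports a signed twist about Oz. These choices and all function names are assumptions made for illustration.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def scale(u, s):
    return tuple(a * s for a in u)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    return scale(u, 1.0 / math.sqrt(dot(u, u)))


def circumcenter(p1, p2, p3):
    """Point in the plane of p1, p2, p3 that is equidistant from all three."""
    a, b = sub(p1, p3), sub(p2, p3)
    axb = cross(a, b)
    numerator = cross(sub(scale(b, dot(a, a)), scale(a, dot(b, b))), axb)
    return add(p3, scale(numerator, 1.0 / (2.0 * dot(axb, axb))))


def triad_frame(c1_g, c1_c, c1_a, base_normals):
    """Return (origin, Ox, Oy, Oz) for one triad."""
    origin = circumcenter(c1_g, c1_c, c1_a)
    oz = unit(add(add(base_normals[0], base_normals[1]), base_normals[2]))
    raw_y = sub(c1_c, c1_g)                    # assumed strand 1 -> strand 2
    oy = unit(sub(raw_y, scale(oz, dot(raw_y, oz))))
    ox = cross(oy, oz)                         # completes the right-handed frame
    return origin, ox, oy, oz


def step_parameters(frame, next_frame):
    """Twist (degrees) and Dx, Dy, Dz from one triad frame to the next."""
    origin, ox, oy, oz = frame
    next_origin, _, next_oy, _ = next_frame
    twist = math.degrees(math.atan2(dot(cross(oy, next_oy), oz), dot(oy, next_oy)))
    shift = sub(next_origin, origin)
    return twist, dot(shift, ox), dot(shift, oy), dot(shift, oz)
```

Applied to two idealized triads related by a 48° rotation about Oz and a translation of (1.0, 0.8, 3.4) Å, `step_parameters` returns those same four values.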

The lowest energy modeled structure has the following values of the helical parameters. For A-G steps, the twist value was 48°, while the Dx, Dy and Dz values were 1.0, 0.8 and 3.4 Å, respectively. For G-C steps, the twist value was 37°, while the Dx, Dy and Dz values were 0.5, 0.2 and 3.3 Å, respectively.
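As a rough consistency check, the two step types can be combined into the geometry of an idealized regular helix in which A-G and G-C steps alternate. The arithmetic below is an extrapolation from the reported step values and not a result of the study; it treats Dz as the rise along the helix axis and ignores the Dx and Dy displacements. The function and dictionary names are hypothetical.

```python
# Twist (degrees) and Dz (Angstrom) for the two step types, as reported in the text.
STEP_PARAMETERS = {
    "A-G": {"twist": 48.0, "rise": 3.4},
    "G-C": {"twist": 37.0, "rise": 3.3},
}


def repeat_geometry(steps=STEP_PARAMETERS):
    """Idealized geometry of one A-G plus G-C repeat and of a full helical turn."""
    twist = sum(step["twist"] for step in steps.values())
    rise = sum(step["rise"] for step in steps.values())
    repeats_per_turn = 360.0 / twist
    return {
        "twist_per_repeat": twist,                       # degrees
        "rise_per_repeat": rise,                         # Angstrom
        "triads_per_turn": repeats_per_turn * len(steps),
        "pitch": repeats_per_turn * rise,                # Angstrom
    }


geometry = repeat_geometry()
```

Under these assumptions, one A-G plus G-C repeat spans 85° and about 6.7 Å, which corresponds to roughly 8.5 triads and a pitch of about 28 Å per turn.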

Coordinates deposition

Coordinates (accession number: 1fs3) of the d(G-G-G-T-T-C-A-G-G) quadruplex have been deposited in the RCSB Protein Data Bank.

Acknowledgments

This research was supported by NIH grant GM-34504 to D.J.P.

References

Allemand, J. F., Bensimon, D., Lavery, R. & Croquette, V. (1998). Stretched and overwound DNA forms a Pauling-like structure with exposed bases. Proc. Natl Acad. Sci. USA, 95, 14152-14157.

Brünger, A. T. (1992). X-PLOR. A System for X-ray Crystallography and NMR, Yale University Press, New Haven, CT.

Cai, Z., Gorin, A., Frederick, R., Ye, X., Hu, W., Majumdar, A., Kettani, A. & Patel, D. J. (1998). Solution structure of P22 transcriptional antitermination N peptide-boxB RNA complex. Nature Struct. Biol. 5, 203-212.

Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Szewczak, A. A., Kundrot, C. E., Cech, T. R. & Doudna, J. A. (1996). RNA tertiary structure mediation by adenosine platforms. Science, 273, 1696-1699.

Chaput, J. C. & Switzer, C. (1999). A DNA pentaplex incorporating nucleobase quintets. Proc. Natl Acad. Sci. USA, 96, 10614-10619.

Chen, L., Cai, L., Zhang, X. & Rich, A. (1994). Crystal structure of a four-stranded intercalated DNA: d(C4). Biochemistry, 33, 13540-13546.

Chou, S.-H., Zhu, L. & Reid, B. R. (1997). Sheared purine·purine pairing in biology. J. Mol. Biol. 267, 1055-1067.

Conn, G. L., Draper, D. E., Lattmann, E. E. & Gittis, A. G. (1999). Crystal structure of a conserved ribosomal protein-RNA complex. Science, 284, 1171-1174.

Dingley, A. J. & Grzesiek, S. (1998). Direct observation of hydrogen bonds in nucleic acid base-pairs by internucleotide 2JNN couplings. J. Am. Chem. Soc. 120, 8293-8297.

Fiala, R., Jiang, F. & Patel, D. J. (1996). Direct correlation of exchangeable and nonexchangeable protons on purine bases in 13C,15N-labeled RNA using HCCNH-TOCSY experiment. J. Am. Chem. Soc. 118, 689-690.

Gehring, K., Leroy, J. L. & Gueron, M. (1993). A tetrameric DNA structure with protonated cytosine-cytosine base-pairs. Nature, 363, 561-565.

Gorin, A. A., Zhurkin, V. B. & Olson, W. K. (1995). B-DNA twisting correlates with base-pair morphology. J. Mol. Biol. 247, 34-48.

Herbert, A. & Rich, A. (1996). The biology of left-handed Z-DNA. J. Biol. Chem. 271, 11595-11598.


Heus, H. A. & Pardi, A. (1991). Structural features that give rise to unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191-194.

Kalurachchi, K., Uma, K., Zimmermann, R. A. & Nikonowicz, E. P. (1997). Structural features of the binding site for ribosomal protein S8 in E. coli 16 S rRNA defined using NMR spectroscopy. Proc. Natl Acad. Sci. USA, 94, 2139-2144.

Kang, C., Zhang, X., Ratliff, R., Moyzis, R. & Rich, A. (1992). Crystal structure of four-stranded Oxytricha telomeric DNA. Nature, 356, 126-131.

Kettani, A., Bouaziz, S., Wang, W., Jones, R. A. & Patel, D. J. (1997). Bombyx mori single repeat telomeric DNA sequence forms a G-quadruplex capped by base triads. Nature Struct. Biol. 4, 382-389.

Kettani, A., Bouaziz, S., Skripkin, E., Majumdar, A., Wang, W., Jones, R. A. & Patel, D. J. (1999). Interlocked mismatch-aligned arrowhead DNA motifs. Structure, 7, 803-815.

Kettani, A., Gorin, A., Majumdar, A., Hermann, T., Skripkin, E., Zhao, H., Jones, R. & Patel, D. J. (2000). A dimeric DNA interface stabilized by stacked A·(G·G·G·G)·A hexads and bound monovalent cations. J. Mol. Biol. 297, 627-644.

Kettani, A., Kumar, R. A. & Patel, D. J. (1995). Solution structure of a DNA quadruplex containing the fragile X syndrome triplet repeat. J. Mol. Biol. 254, 636-656.

Kurumizaka, H. & Shibata, T. (1996). Homologous recognition by RecA protein using non-equivalent three DNA strand binding sites. J. Biochem. 119, 216-223.

Kuryavyi, V. V. & Jovin, T. M. (1995). Triad DNA: a model for trinucleotide repeats. Nature Genet. 9, 339-341.

Kuryavyi, V. V. & Jovin, T. M. (1996). Triangular complementarity of the triad-DNA duplex. In Proc. Ninth Conversation in Biomol. Struct. Dynam. (Sarma, R. H. & Sarma, M. H., eds), pp. 91-103, Adenine Press, Guilderland, New York.

Kuryavyi, V., Kettani, A., Wang, W., Jones, R. & Patel, D. J. (2000). A diamond-shaped zipper-like DNA architecture containing triads sandwiched between mismatches and tetrads. J. Mol. Biol. 295, 455-469.

Legault, P., Li, J., Mogridge, J., Kay, L. E. & Greenblatt, J. (1998). NMR structure of the bacteriophage λ N peptide/boxB RNA complex: recognition of a GNRA fold by an arginine rich motif. Cell, 93, 289-299.

Li, Y., Zon, G. & Wilson, W. D. (1991). NMR and molecular evidence for a G·A mismatch base pair in a purine-rich DNA duplex. Proc. Natl Acad. Sci. USA, 88, 26-30.

Lin, C. H. & Patel, D. J. (1997). Structural basis of DNA folding and recognition in an AMP-DNA aptamer complex: distinct architectures but common recognition motifs for DNA and RNA aptamers complexed to AMP. Chem. Biol. 4, 817-832.

Lin, C. H., Wang, W., Jones, R. A. & Patel, D. J. (1998). Formation of an amino acid-binding pocket through adaptive zippering-up of a large DNA hairpin loop. Chem. Biol. 5, 555-572.

Liu, A., Majumdar, A., Hu, W., Kettani, A., Skripkin, E. & Patel, D. J. (2000). NMR detection of N-H···O=C hydrogen bonds in 13C,15N-labeled nucleic acids. J. Am. Chem. Soc. 122, 3206-3210.

Liu, D. J. & Day, L. A. (1994). Pf1 virus structure: helical coat protein and DNA with paraxial phosphates. Science, 265, 671-674.

Macaya, R. F., Schultze, P., Smith, F. W., Roe, J. A. & Feigon, J. (1993). Thrombin-binding DNA aptamer forms a unimolecular quadruplex structure in solution. Proc. Natl Acad. Sci. USA, 90, 3745-3749.

Majumdar, A., Kettani, A. & Skripkin, E. (1999a). Observation and measurement of 2JNN coupling constants between 15N nuclei with widely separated chemical shifts. J. Biomol. NMR, 14, 67-70.

Majumdar, A., Kettani, A. & Skripkin, E. (1999b). Observation of internucleotide N-H···N hydrogen bonds in the absence of directly detectable protons. J. Biomol. NMR, 15, 207-211.

Mitas, M. (1997). Trinucleotide repeats associated with human disease. Nucl. Acids Res. 25, 2245-2254.

Neidle, S., (ed.) (1999). Oxford Handbook of Nucleic Acid Structure, Oxford University Press, Oxford.

Nilges, M. (1995). Calculation of protein structures with ambiguous distance restraints. Automated assignment of ambiguous NOE cross peaks and disulfide connectivities. J. Mol. Biol. 245, 645-660.

Nilges, M., Habazettl, J., Brunger, A. T. & Holak, T. A. (1991). Relaxation matrix alignment of the solution structure of squash trypsin inhibitor. J. Mol. Biol. 219, 499-510.

Olson, W. K., Gorin, A. A., Lu, X.-J., Hock, L. M. & Zhurkin, V. B. (1998). DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc. Natl Acad. Sci. USA, 95, 11163-11168.

Patel, D. J., Bouaziz, S., Kettani, A. & Wang, Y. (1999). Structures of guanine-rich and cytosine-rich quadruplexes formed in vitro by telomeric, centromeric and triplet repeat disease sequences. In Oxford Handbook of Nucleic Acid Structure (Neidle, S., ed.), pp. 389-453, Oxford University Press, Oxford.

Patel, D. J., Kozlowski, S. A., Nordheim, A. & Rich, A. (1982). Right-handed and left-handed DNA: studies of B-DNA and Z-DNA by using proton nuclear Overhauser effect and phosphorus NMR. Proc. Natl Acad. Sci. USA, 79, 1413-1417.

Pervushin, K., Ono, A., Fernandez, C., Szyperski, T., Kainosho, M. & Wuthrich, K. (1998). NMR scalar couplings across Watson-Crick base-pair hydrogen bonds in DNA observed by transverse relaxation-optimized spectroscopy. Proc. Natl Acad. Sci. USA, 95, 14147-14151.

Pley, H. W., Flaherty, K. M. & McKay, D. B. (1994). Three-dimensional structure of a hammerhead ribozyme. Nature, 372, 68-74.

Radhakrishnan, I. & Patel, D. J. (1994). DNA triplexes: solution structures, hydration sites, energetics, interactions and function. Biochemistry, 33, 11405-11416.

Rhodes, D. & Giraldo, R. (1995). Telomere structure and function. Curr. Opin. Struct. Biol. 5, 311-312.

Schwartz, T., Rould, M. A., Lowenhaupt, K., Herbert, A. & Rich, A. (1999). Crystal structure of the Zα domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science, 284, 1841-1845.

Sheppard, W., Cruse, W. B., Fourme, R., de la Fortelle, E. & Prange, T. (1999). A zipper-like duplex in DNA: the crystal structure of d(GCGAAAGCT) at 2.1 Å resolution. Structure, 6, 849-861.

Simorre, J. P., Zimmermann, G. R., Mueller, L. & Pardi, A. (1996). Correlation of guanosine exchangeable and nonexchangeable base protons in 13C,15N-labeled RNA with an HNC-TOCSY-CH experiment. J. Biomol. NMR, 7, 153-156.


Sklenar, V., Dieckmann, T., Butcher, S. E. & Feigon, J. (1996). Through bond correlation of imino and aromatic resonances in (13C,15N)-labeled RNA via heteronuclear TOCSY. J. Biomol. NMR, 7, 83-87.

Smith, F. W. & Feigon, J. (1992). Quadruplex structure of Oxytricha telomeric DNA oligonucleotides. Nature, 356, 164-168.

Sun, J. S., Garestier, T. & Helene, C. (1996). Oligonucleotide directed triple helix formation. Curr. Opin. Struct. Biol. 6, 327-333.

Sussman, D., Nix, J. C. & Wilson, C. (2000). The structural basis for molecular recognition by the vitamin B12 RNA aptamer. Nature Struct. Biol. 7, 53-57.

Szewczak, A. A. & Moore, P. B. (1995). The sarcin/ricin loop, a modular RNA. J. Mol. Biol. 247, 81-98.

Takahashi, M. & Norden, B. (1994). Structure of RecA-DNA complex and mechanism of DNA strand exchange reaction in homologous recombination. Advan. Biophys. 30, 1-35.

Wang, A. H. J., Quigley, G. J., Kolpak, F. J., Crawford, J. L., van Boom, J. H., van der Marel, G. & Rich, A. (1979). Molecular structure of a left-handed DNA fragment at atomic resolution. Nature, 282, 680-686.

Wang, E. & Feigon, J. (1999). Structure of nucleic acid triplexes. In Oxford Handbook of Nucleic Acid Structure (Neidle, S., ed.), pp. 355-388, Oxford University Press, Oxford.

Wang, K. Y., Krawczyk, S. H., Bischofberger, N., Swaminathan, S. & Bolton, P. H. (1993). The tertiary structure of a DNA aptamer which binds to and inhibits thrombin determines activity. Biochemistry, 32, 11285-11292.

Wang, Y. & Patel, D. J. (1993). Solution structure of the human telomeric repeat d(AG3[T2AG3]3) G-tetraplex. Structure, 1, 263-282.

Wimberly, B. T., Guymon, R., McCutcheon, J. P., White, S. W. & Ramakrishnan, V. (1999). A detailed view of a ribosomal active site: the structure of the L11-RNA complex. Cell, 97, 491-502.

Zimmer, D. P. & Crothers, D. M. (1995). NMR of enzymatically synthesized uniformly 13C,15N-labeled DNA oligonucleotides. Proc. Natl Acad. Sci. USA, 92, 3091-3095.

Zimmermann, G. R., Jenison, R. D., Wick, C. L., Simorre, J. P. & Pardi, A. (1997). Interlocking structural motifs mediate molecular discrimination by a theophylline-binding RNA aptamer. Nature Struct. Biol. 4, 644-649.

Edited by M. Summers

(Received 3 April 2000; received in revised form 5 June 2000; accepted 5 June 2000)

http://www.academicpress.com/jmb

Supplementary material is available from JMB Online