Eur. J. Biochem. 261, 722–733 (1999) © FEBS 1999

Conformational variation of the central CG site in d(ATGACGTCAT)2 and d(GAAAACGTTTTC)2

An NMR, molecular modelling and 3D-homology investigation

Christine Cordier, Laurence Marcourt, Michel Petitjean and Guy Dodin

Institut de Topologie et de Dynamique des Systèmes, associé au CNRS, Université D. Diderot (Paris 7), France

The determination of the solution structure of two self-complementary oligomers d(ATGACGTCAT)2 (CG10) and d(GAAAACGTTTTC)2 (CG12), both containing the 5′-pur-ACGT-pyr-3′ sequence, is reported. The impact of the base context on the conformation of the central CpG site has been examined by a combined approach of: (a) 2D 1H-NMR and 31P-NMR; (b) molecular mechanics under experimental constraints; (c) back-calculations of NOESY spectra and iterative refinements of distances; and (d) 3D-homology search of the central tetrad ACGT within the complete oligonucleotides. A full NMR study of each fragment is achieved by means of standard 2D experiments: NOESY, 2D homonuclear Hartmann–Hahn spectroscopy, double-quantum-filtered COSY and heteronuclear 1H-31P correlation. Sugar phase angle, ε–ζ difference angle and NOE-derived distances are input as experimental constraints to generate molecular models by energy minimization with the help of JUMNA. The MORASS program is used to iteratively refine the structures obtained. The similarity of the two ACGTs within the whole oligonucleotides is investigated. Both the decamer and the dodecamer adopt a B-like DNA conformation. However, the helical parameters within this conformational type are significantly different in CG12 and CG10. The central CpG step conformation is not locked by its nearest environment (5′A and 3′T) as seen from the structural analysis of ACGT in the two molecules. In CG12, despite the presence of runs of A-T pairs, CpG presents a high twist of 43° and a sugar phase at the guanine of about 180°, previously observed in other ACGT-containing oligomers. Conversely, ACGT in CG10 exhibits strong inclinations, positive rolls, a flat profile of sugar phase, twist and glycosidic angles, as a result of the nucleotide sequence extending beyond the tetrad. The structural specificity of CG10 and its flexibility (as reflected by its energy) are tentatively related to the process of recognition of the cyclic AMP response element by its cognate protein.

Keywords: 2D NMR; modelling; 3D homology; ACGT step; CRE; base sequence effects.

The 5′-CpG-3′ sequence is a preferred site of interaction between nucleic acids and many proteins or ligands. It is where cytosine methylation mainly occurs, an event recognized to play a pivotal role in imprinting and in the regulation of gene transcription [1–5]. Me5CG is also a hot-spot for spontaneous point mutation resulting from C to T transition [6,7]. CpG is present in the cyclic AMP response element (CRE) [8,9], the target of activation factors in the series of leucine zippers, and it has been shown to be a strong activator of the expression of genes involved in the immune response, especially when it is flanked by two 5′ purines and two 3′ pyrimidines [10–13]. In this nucleotide context, it is the major site of alkylation of guanine by ligands in the bipyridiniumaldehyde series [14]. Several structural studies have shown the ability of CpG to adopt various conformations as a result of its intrinsic malleability with respect to its nucleotide surrounding [15–19]. When it is present in the ACGT string, the following structural features for the tetramer are observed: low–high–low twist pattern, high–low–high ε–ζ angle difference in the BI state and alternating low–high sugar phase. This indicates that CpG significantly departs from a strict B geometry. It is therefore expected that CpG-containing sequences may be readily fine-tuned by the 5′ and 3′ environment of the central step to provide specific signals for effector or protein recognition.

In order to assess the influence of the sequences surrounding CpG, we have undertaken the study of the self-complementary decamer d(ATGACGTCAT)2 (CG10), which includes the octameric sequence of CRE, and the dodecamer d(GAAAACGTTTTC)2 (CG12) where the A and T tracts are expected to induce marked local variations in the helical parameters (high values of propeller twists, curvature of the helix axis, narrowing of the minor groove, etc.) [17,20]. The presence of the same central tetrad ACGT in both oligomers will indicate whether the variations in the CG step structure are entirely determined by the immediate environment (here 5′A and 3′T) or also depend on the distal upstream and downstream sequences. In this work, the two oligomers are extensively studied by using standard 2D NMR experiments in solution. Conformers are generated following a two-step procedure, i.e. molecular mechanics with JUMNA under experimental constraints (sugar puckering, ε–ζ difference angle and NOE-derived distances) and back-calculations

Correspondence to G. Dodin, Institut de Topologie et de Dynamique des Systèmes, associé au CNRS, Université D. Diderot (Paris 7), France. Fax: +33 1 44 27 68 14, E-mail: [email protected]

Abbreviations: HOHAHA, 2D homonuclear Hartmann–Hahn spectroscopy; COLOC, X-1H shift correlation by long range coupling; CRE, cyclic AMP response element; DQF-COSY, double-quantum-filtered COSY; SDM, sorted distances matrix; CSR, combined SDM/RMS; TPPI, time-proportional phase incrementation.

(Received 30 November 1998, revised 8 February 1999, accepted 9 February 1999)

of NOESY spectra completed with iterative refinement of interproton distances (MORASS program). Only one refined geometry is obtained with a satisfactory R-factor for each palindrome. A detailed analysis of structural parameters is presented. In order to quantify the structural similarity of the ACGT tetrad, a new approach to 3D-homology search is used. The advantages of this method, which combines the sorted distances matrix (SDM)/RMS algorithms to extract a possible common substructure from two chains of unequal length, are described.

MATERIALS AND METHODS

Sequence numbering is as follows:

Decamer: 5′-A1 T2 G3 A4 C5 G6 T7 C8 A9 T10-3′

Dodecamer: 5′-G1 A2 A3 A4 A5 C6 G7 T8 T9 T10 T11 C12-3′

Materials

Oligonucleotides were synthesized at the Institut Pasteur (Paris) and supplied as NH4+ salts (18 ammonium ions per duplex for CG10 and 22 for CG12).

NMR samples were prepared by dissolving the oligonucleotides (5.5 mg for CG10/9.6 mg for CG12) in 450 µL of phosphate buffer (pH = 7.4, I = 50 mM) containing 1 mM EDTA. In the nonexchangeable proton investigation, samples are first lyophilized from the aqueous buffer then twice from 99.90% D2O and finally dissolved in 400 µL of 99.95% D2O. For the exchangeable proton study, the D2O samples are freeze-dried and redissolved in a H2O/D2O (90/10) solvent mixture.

NMR methods

1H-NMR spectra are recorded on a Bruker AM-500 MHz spectrometer (D2O solvent) and on a Bruker DMX-500 MHz (H2O/D2O mixed solvent). The temperature is set at 298 K (nonexchangeable proton study) or 283 K (amino and imino proton investigation) taking into consideration the Tm measured by NMR (319/316 ± 1 K for CG10/CG12).

1D spectra (D2O solvent) are recorded typically by using a 90° excitation pulse, an acquisition time of 2.0 s and a 2.2-s relaxation delay. The residual HOD signal is suppressed by low-power decoupling. In H2O/D2O experiments, the water resonance is suppressed by a 'water gate' field gradient sequence [21] associated with a 3-9-19 pulse sequence for the selective excitation of water (spectral width = 20 p.p.m., delay = 175 µs).

Nonexchangeable protons are assigned by means of conventional 2D experiments: NOESY at long mixing time (250 ms for the decamer/300 ms for the dodecamer) and 2D homonuclear Hartmann–Hahn spectroscopy (HOHAHA) with a 115-ms spin-lock pulse at 8.0 kHz. Both types of correlation are performed on nonspinning NMR samples. Data are acquired with the transmitter frequency placed at the residual HOD resonance. For phase-sensitive NOESY experiments, a low-power continuous wave irradiation is applied to the residual water signal during the recycle delay (2.0 s) and the mixing time. Data are acquired, in time-proportional phase incrementation (TPPI) mode [22], with 2048 complex data points in t2 and 320/300 data points in t1. They are zero-filled to 2048 × 1024 before Fourier transformation. Data are processed with shifted sinebell window functions in t1 and t2. For the HOHAHA, the spin-lock period is determined by using an MLEV-17 pulse sequence [23,24] with the spectrometer configured in the 'inverse-mode' and the decoupler set to generate low-power 1H 90° pulses of 32/30 µs.

Amino and imino proton resonances are then assigned from phase-sensitive NOESY experiments (TPPI mode) in H2O/D2O (Water Gate and 3-9-19 trim pulse sequence), obtained with 2048 complex data points in t2 and 400 real points in t1. A mixing time of 200 ms and a relaxation delay of 2.0 s between each of the 96 scans per experiment over 10 kHz are used. Free-induction decays are processed as for D2O experiments.

Quantitative NOESY spectra in D2O are run at various mixing times (80, 160, 220 ms for CG10 and 80, 120, 150, 200 ms for CG12) over a single 90-h period without removing the sample from the spectrometer or changing the acquisition parameters. These parameters are those of the long mixing time NOESY experiments, except that the number of increments in t1 is increased to 360.

Pure absorption double-quantum filtered (DQF) COSY spectra [25,26] are obtained in D2O with the standard pulse sequence using the TPPI mode. 600/512 increments in t1 are collected over a spectral width of 3.5/4.0 kHz and 4096 points in t2. The repetition delay is set to 2.2 s and 128/104 scans are collected for each t1 increment. Data are zero-filled to generate a resolution of 3.5/4.0 Hz per pt in the f1 and 1.7/2.0 Hz per pt in the f2 dimension. A shifted sinebell (π/2) window function is applied to both dimensions before Fourier transformation.

The 31P-NMR experiments are recorded in D2O with a Bruker 300 MHz spectrometer using a 5-mm quadra-probe (1H, 13C, 31P, 19F). 1D and 2D spectra are recorded at 301 K on the samples used previously for the 1H experiments. The 31P spectra are referenced to trimethylphosphate at 0.0 p.p.m. The long-range heteronuclear correlation (COLOC) [27–30] is acquired in 300/502 scans for each of the 128 free-induction decays, which contain 2048 data points over 600/530 Hz spectral widths. Zero-filling in f1 produces 2048 × 1024 data matrices. Before 2D Fourier transformation, a sinebell function is applied in f1 and a Gaussian function in f2.

From these experiments, the 31P resonances for the 9/11 phosphate groups are assigned and the ε–ζ parameters (difference between two backbone torsion angles) are calculated according to the empirical relation: ε–ζ = 254.5 + 72.8 δ31P [31].
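As a minimal sketch of the empirical relation above, the following Python function converts a 31P chemical shift (p.p.m., referenced to trimethylphosphate) into the ε–ζ difference in degrees. The function name is a hypothetical choice made for illustration; the three shifts are taken from Table 2 of this paper.

```python
def epsilon_minus_zeta(delta_31p_ppm: float) -> float:
    """Empirical estimate of the epsilon-zeta torsion difference (degrees)
    from a 31P chemical shift (p.p.m. relative to trimethylphosphate)."""
    return 254.5 + 72.8 * delta_31p_ppm


# Shifts for three CG10 junctions, as listed in Table 2.
shifts = {"A1pT2": -4.26, "T2pG3": -3.82, "C5pG6": -3.93}
for junction, delta in shifts.items():
    print(f"{junction}: {epsilon_minus_zeta(delta):.1f} deg")
```

The values obtained this way agree with the ε–ζ column of Table 2 to within rounding of the tabulated shifts.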

NOE distance constraints

The NOEs are quantified by integrating the volume of each resolved crosspeak with the help of the GIFA 4.0 program [32,33]. Crosspeaks are detected by a peak-picking algorithm after a noise-level evaluation. Each crosspeak is integrated by finding a contour spanning the largest extension of the peak (the amoeba). The amoeba is determined by the following four criteria: slope, threshold, radius and the intensity ratio between the largest point and the current evaluated point. The sum of the points under this amoeba is then computed and stored as a volume table. All peaks of interest are integrated at all mixing times. Interproton distances are estimated using the distance extrapolation method [34,35]. Each NOE-derived distance is scaled to the cytosine H5–H6 distance fixed at 2.46 Å [36]. Distances involving methyl groups are ignored because of the motions of these groups.

Molecular modelling

Molecular modelling is performed using the JUMNA 8.5 algorithm (Junction minimization of nucleic acids) [37–39] that was specifically developed to use helicoidal variables describing the structure of DNA fragments during energy minimization. The force field, FLEX, employed within JUMNA is based on Lennard–Jones parameters and atomic charges [40]. A sigmoidal distance-dependent dielectric function [41] (slope = 0.16, plateau = 78) is used as a representation of the H2O effect. Net phosphate charges are reduced to −0.5 e to simulate a physiological pH. JUMNA can also energy-optimize DNA fragments under local constraints and is well-adapted to handling interproton distances and dihedral angles either as fixed values or as upper and lower bounds. Violations of the input constraints are prevented by a simple quadratic penalty term using force constants of 25.1 kJ·mol⁻¹ per Å² for fixed distances, 50.2 kJ·mol⁻¹ per Å² for upper and lower distance bounds and 4.18 MJ·mol⁻¹·rad⁻² for torsional angles. Sugar puckers can be imposed by phase angle and/or amplitude restraints. Each minimized structure is analysed with the CURVES 4.1 program [42], which calculates the optimal helical axis and the complete set of helical, structural and sugar parameters. Other features of the helix are given in terms of helical axis curvature and width/depth of grooves.
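The quadratic penalty on distance bounds can be pictured with the short Python sketch below. It is only an illustration under stated assumptions: the exact functional form used inside the modelling program (for instance, whether a factor of 1/2 is applied) is not given in the text, and the function name and the flat-bottomed shape are hypothetical choices. Only the force constant of 50.2 kJ·mol⁻¹ per Å² comes from the text.

```python
def distance_bound_penalty(d, lower, upper, k=50.2):
    """Flat-bottomed quadratic penalty (kJ/mol) for a distance d (angstrom).

    Assumed form: zero inside [lower, upper], k * violation**2 outside.
    k defaults to the force constant quoted for distance bounds.
    """
    if d < lower:
        return k * (lower - d) ** 2
    if d > upper:
        return k * (d - upper) ** 2
    return 0.0


# A distance inside its bounds costs nothing; a 0.2 angstrom violation is penalized.
print(distance_bound_penalty(2.5, 2.2, 2.8))
print(distance_bound_penalty(3.0, 2.2, 2.8))
```

A fixed-value constraint would correspond to `lower == upper`, for which the text quotes a force constant of 25.1 kJ·mol⁻¹ per Å².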

Generation of initial structures. Stable conformations to be used as starting points are sought with JUMNA by scanning the P angle from 120° to 220° in 10° steps. Input rise and twist parameters are 3.38 Å and 36°, in agreement with canonical parameters of a B-type DNA. For these first calculations, phase constraints are roughly imposed (e.g. a single value for all sugars of a DNA fragment). This methodology has the advantage of giving a general trend for the sugar puckering and is not too restrictive for the whole structure. It has been improved to locate the stable DNA conformations [43]. In the next step, the system is relaxed and the energy re-optimized. Previous modelling work on CG10 [44] and NMR evidence for BI backbone geometry in CG12 [45] make it unnecessary to perform a conformational search with BII-type backbone junctions (BI conformation corresponds to ε–ζ = −90° and BII to +90°). Finally, initial structures are sorted on the basis of energy criteria and are grouped into conformational families according to their structural parameters and rms. The latter is the rms calculated for all the base-pairs except the 3′ and 5′ terminal pairs.

Back-calculations of NOESY spectra and iterative refinement of distances. In order to refine 3D structures, back-calculations of NOESY spectra are computed with the help of the MORASS 2.1 software package [46] by using the iterative relaxation-rate matrix approach. After minimization, data of the 3D structure are fed into MORASS as a Protein Data Bank (PDB) file. First, the program simulates a theoretical back-NOESY spectrum assuming a rigid molecule with a single overall correlation time (τc). A theoretical relaxation-rate matrix is set up taking into account the full multispin system; this leads to a theoretical crosspeak volume determination for a chosen mixing time. Methyl groups and exchangeable protons are ignored. In a second step, simulated and experimental NOE data (input as volumes) are compared; their agreement is quantified by the classical R-factor defined below, where n is the number of peaks of interest used in the refinement procedure.

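$$R = \frac{\sum_{i=1,\ i \neq j}^{n} \left| V_{ij}^{\mathrm{exp}} - V_{ij}^{\mathrm{theo}} \right|}{\sum_{i=1,\ i \neq j}^{n} V_{ij}^{\mathrm{exp}}}$$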

In the third step, a set of new distances is generated by merging the experimental and theoretical volumes according to the protocol described by Lefebvre et al. [16] where $d_{\mathrm{new}} = d_{ij}^{\mathrm{theo}} \left( V_{ij}^{\mathrm{theo}} / V_{ij}^{\mathrm{exp}} \right)^{1/6}$. The R-factor is then minimized by iterative refinement of the distances. For this study, a correlation time (τc) of 7/5 ns is used to simulate the theoretical NOESY at a mixing time of 160/80 ms for CG10/CG12, respectively. These τc values are used because they give the best fit with the MORASS program. For the decamer, an attempt to refine the distances with the data extracted from the 80 ms NOESY as starting point led to a poor R-factor because of the low signal-to-noise ratio for this spectrum.
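The R-factor and the distance update used in the refinement can be sketched in a few lines of Python with NumPy. This is an illustrative sketch and not the refinement program's own code: the function names are hypothetical, the absolute value in the numerator is assumed from the usual definition of this R-factor, and the volumes in the example are made-up numbers.

```python
import numpy as np


def r_factor(v_exp, v_theo):
    """R = sum |V_exp - V_theo| / sum V_exp over the n crosspeaks used."""
    v_exp = np.asarray(v_exp, dtype=float)
    v_theo = np.asarray(v_theo, dtype=float)
    return float(np.abs(v_exp - v_theo).sum() / v_exp.sum())


def refined_distances(d_theo, v_theo, v_exp):
    """d_new = d_theo * (V_theo / V_exp) ** (1/6), applied element-wise."""
    d_theo = np.asarray(d_theo, dtype=float)
    ratio = np.asarray(v_theo, dtype=float) / np.asarray(v_exp, dtype=float)
    return d_theo * ratio ** (1.0 / 6.0)


# Illustrative (made-up) volumes for two crosspeaks.
v_exp = [2.0, 2.0]
v_theo = [1.0, 3.0]
print(r_factor(v_exp, v_theo))
print(refined_distances([3.0, 3.0], v_theo, v_exp))
```

A back-calculated volume that is too large means that the model distance is too short, so the update lengthens it; the sixth root makes the correction mild even for large volume mismatches.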

Refinement of molecular modelling. Starting structures having been generated, NOE-derived constraints are applied following a five-step procedure. For each step, NOESY spectra are back-calculated and distances refined iteratively until the R-factor reaches a minimum. NOE-derived distances are applied so as to gradually tighten the models. The two terminal nucleotides are left freely minimized. First, sugar puckering is controlled by imposing the intranucleotide H1′–H4′ distances as a range with bounds of ± 10%. The ε–ζ parameters [31] are then input with bounds of ± 10°. In the third step, the main intranucleotide distances, namely the five sets: H6/H8–H1′,H2′,H2″ and H1′–H2′,H2″ are submitted [47] to the refinement protocol as ranges. Two types of internucleotide distance (e.g. H6/H8–H1′,H2′) are then supplied. In the fifth and final step, the list of constraints is completed with internucleotide H6/H8–H2″ distances. In total, 80/93 constraints are taken into account.

3D-homology and maximal common 3D substructure search

Full details of the 3D-homology method have been given elsewhere [48]. The advantages of this procedure over the classical rms algorithm available in many modelling software packages are: (a) no one-to-one pairwise correspondence is needed; (b) better alignments are achieved [49]; and (c) molecules of unequal length, such as the DNA fragments in this study, are readily compared. Briefly, after the input of the respective set of atomic positions (x, y, z) for both CG10 and CG12 derived from the NMR experiments, the analysis proceeds by two successive steps. The first is the spatial alignment where one fragment (A) is displaced by rotation and translation in order to get the best spatial superposition with the other (B). The similarity between A and B is evaluated by a numerical criterion, d(AB), which depends on the local coordinate systems of A and B. d(AB) is a functional distance which is subsequently minimized to D(AB), by means of a Newton-like algorithm, with respect to all possible rotations and translations, starting from an initial random superposition. In order to avoid local minima of d(AB), the procedure is initiated from random points. The minimized distance, D(AB), does not depend on the local coordinate systems of A and B and is, in the mathematical sense, a distance in the metric space of the molecule (the triangle inequality is satisfied). d(AB) is the norm (i.e. the length) of a bidimensional vector with two components associated with a positive and a negative charge distribution, respectively [48]. The squares of d(AB) and of D(AB) have therefore the physical dimension of an electric charge. It must be stressed that d(AB) and D(AB) are by no means the geometric distances computed by rms procedures. The percentage of similarity between the fragments is defined as the ratio of D(AB), the lower bound of d(AB), to the upper bound, the latter being reached when fragment A is at infinity. This percentage should not be confused with the percentage of 3D-homology (see below).

The second step is the extraction of the motif common to A and B by means of the SDM algorithm. Once the optimum alignment of fragments A and B has been reached, the SDM procedure displays the common region as follows. The rectangular array of distances between two sets of atoms is sorted out in order of increasing values. The pair of nearest atoms is always included in the common 3D substructure. Then the subsequent atom pairs are included in the common substructure until one atom in the pair has been already encountered. This last pair is then rejected and the procedure stops. Consequently, the common region cannot have more atoms than the shortest fragment and the two subsets in the common region have pairwise associated atoms. The percentage of 3D-homology is the ratio of the number of atom pairs in the common region, Nc, to the mean number of atoms in the two molecules, N(A) and N(B). This percentage may also be computed for any spatial alignment using the rms algorithm. However, in this case the knowledge of the one-to-one pairwise correspondence between atoms in the two fragments is necessary, a requirement not met in many situations. Therefore, the rms procedure is restricted to equal-length fragments. Moreover, the quality of the rms alignment is poor because of the large numerical contribution of the atom pairs outside the common region.
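The SDM extraction step described above can be sketched as follows. This is a minimal illustration written from the description in the text, not the authors' implementation; the function names and the toy coordinates are hypothetical, and the two fragments are assumed to be already spatially aligned.

```python
import numpy as np


def sdm_common_pairs(coords_a, coords_b):
    """Sorted distances matrix (SDM) extraction of a common substructure.

    All A-B interatomic distances are sorted in increasing order; pairs are
    accepted until a pair reuses an atom already encountered, at which point
    that pair is rejected and the procedure stops.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    used_a, used_b, pairs = set(), set(), []
    for flat in np.argsort(dist, axis=None):
        i, j = (int(x) for x in np.unravel_index(flat, dist.shape))
        if i in used_a or j in used_b:
            break  # first repeated atom: reject this pair and stop
        used_a.add(i)
        used_b.add(j)
        pairs.append((i, j))
    return pairs


def homology_percent(n_common, n_a, n_b):
    """Percentage of 3D-homology: Nc over the mean of N(A) and N(B)."""
    return 100.0 * n_common / ((n_a + n_b) / 2.0)


# Toy example: three 'atoms' in A, two in B (coordinates are invented).
frag_a = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
frag_b = [[0.1, 0.0, 0.0], [5.3, 0.0, 0.0]]
common = sdm_common_pairs(frag_a, frag_b)
print(common, homology_percent(len(common), len(frag_a), len(frag_b)))
```

Because the loop stops at the first repeated atom, the number of accepted pairs can never exceed the size of the smaller fragment, as noted in the text.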

3D-homology analysis was also performed using the combined SDM/RMS (CSR) algorithm where the percentage of homology is maximized as follows [50]. Starting from a random spatial alignment, the Nc atom-pairs of the common 3D motif are computed by the SDM algorithm. Then the standard rms alignment between these Nc pairs is performed, thus superposing the whole fragments. Following this, the SDM algorithm is applied again. If the updated Nc value is increased, the process is iterated. When Nc is stationary, the process is iterated if the rms is lowered. Several initial random spatial alignments are selected to ensure that the algorithm is not trapped in a local extremum.
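The CSR iteration can be sketched in Python as below. This is a simplified reading of the description, not the published implementation: the rms alignment is assumed to be a least-squares rigid fit (computed here with the standard Kabsch procedure), the loop starts from whatever alignment the caller supplies rather than drawing several random ones, and all function names and coordinates are hypothetical.

```python
import numpy as np


def sdm_pairs(a, b):
    """SDM step: accept sorted nearest pairs until an atom is reused."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    used_a, used_b, pairs = set(), set(), []
    for flat in np.argsort(dist, axis=None):
        i, j = (int(x) for x in np.unravel_index(flat, dist.shape))
        if i in used_a or j in used_b:
            break
        used_a.add(i)
        used_b.add(j)
        pairs.append((i, j))
    return pairs


def rigid_fit(p, q):
    """Least-squares rotation and translation taking points p onto q (Kabsch)."""
    pc, qc = p.mean(axis=0), q.mean(axis=0)
    u, _, vt = np.linalg.svd((p - pc).T @ (q - qc))
    d = np.sign(np.linalg.det(vt.T @ u.T))  # avoid improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, qc - rot @ pc


def csr(coords_a, coords_b, max_iter=50, tol=1e-9):
    """Alternate SDM and rms fitting while Nc grows or the rms decreases."""
    a = np.asarray(coords_a, dtype=float).copy()
    b = np.asarray(coords_b, dtype=float)
    best_nc, best_rms = 0, np.inf
    for _ in range(max_iter):
        pairs = sdm_pairs(a, b)
        ia = [i for i, _ in pairs]
        ib = [j for _, j in pairs]
        rot, shift = rigid_fit(a[ia], b[ib])
        trial = a @ rot.T + shift  # the whole fragment is moved
        rms = float(np.sqrt(np.mean(np.sum((trial[ia] - b[ib]) ** 2, axis=1))))
        nc = len(pairs)
        if not (nc > best_nc or (nc == best_nc and rms < best_rms - tol)):
            break
        a, best_nc, best_rms = trial, nc, rms
    return a, best_nc, best_rms


# Toy check: B is A after a 5 degree rotation about z plus a small shift.
frag_a = np.array([[0, 0, 0], [3, 0, 0], [0, 4, 0], [0, 0, 5], [3, 4, 0], [3, 0, 5]], float)
t = np.deg2rad(5.0)
rz = np.array([[np.cos(t), -np.sin(t), 0.0], [np.sin(t), np.cos(t), 0.0], [0.0, 0.0, 1.0]])
frag_b = frag_a @ rz.T + np.array([0.1, -0.1, 0.05])
aligned, nc, rms = csr(frag_a, frag_b)
print(nc, rms)
```

In practice the loop would be restarted from several random orientations of one fragment, keeping the run with the largest Nc, which corresponds to the precaution against local extrema mentioned in the text.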

RESULTS AND DISCUSSION

Analysis of NMR data

1D 1H NMR spectra (D2O) show that the protons of homologous nucleotides in the two strands are equivalent. This indicates a dyad symmetry as expected from the self-complementarity of the DNA fragments studied. The features of 2D spectra are in good agreement with B-like DNA structures for CG10 and CG12. For example, a right-handed B-type conformation for oligomers is consistent with the presence of intraresidue base–H2′ and interresidue base–H2″ crosspeaks at a short mixing time (80 ms), whereas intraresidue base–H2″ and interresidue base–H2′ spots are very weak and are strong only at longer mixing times. Moreover, the lack of H3′–H2″ crosspeaks concomitant with the presence of H3′–H2′ correlations in the DQF-COSY spectra confirms that the sugars are in the C2′-endo range. Nonexchangeable proton resonances are assigned in D2O at 298 K using the classical methods for nucleic acids [51–53]: NOESY at long mixing time, HOHAHA and DQF-COSY experiments. Exchangeable proton resonances are identified by their connectivity with aromatic, H1′ and CH3 protons from NOESY experiments recorded in H2O/D2O (90/10) at 283 K as described previously [52,54]. Proton chemical shifts are given in Table 1 for both oligonucleotides. The small phosphorus signal dispersion, less than 1 p.p.m., is associated with right-handed helices. 1H-31P COLOC spectra are run in order to assign the 31P atoms in the phosphodiester bridges [55]. Phosphorus signals are assigned by their correlation with the H3′(n) and H4′(n+1) protons (Table 2).

Decamer. After an unambiguous proton assignment [14,15] a careful examination of DQF-COSY crosspeak fingerprints provides information about the sugar pucker [56]. As is well known, the sugar conformation plays a fundamental role in determining the local structure of double helices [15,57]. The intensity of the H2′–H3′, H2″–H3′, H3′–H4′ COSY multiplets is directly dependent on the magnitude of the scalar coupling constants involved. It allows the sugar pucker to be qualitatively confined into phase ranges. For CG10, all the H3′–H4′ correlations (except for G3 and T10 nucleotides) are weak but quite detectable. Combining this result with the lack of H2″–H3′ couplings puts P in the 140–162° range (²₁T to ²E range, i.e. around C2′-endo). The crosspeak associated with T10 is strong,

Table 1. Proton chemical shifts of d(ATGACGTCAT)2 and d(GAAAACGTTTTC)2 (in p.p.m.). Determined from 2D NOESY and HOHAHA spectra recorded in D2O at 298 K. Phosphate buffer, I = 50 mM, pH = 7.4 + 1 mM EDTA or in H2O/D2O (90/10) at 283 K. NO, not observed.

| Nucleotide | H6/H8 | H1′ | H2′ | H2″ | H3′ | H4′ | H2/H5/CH3 | Imino | Amino |
|---|---|---|---|---|---|---|---|---|---|
| Decamer (CG10) | | | | | | | | | |
| A1 | 8.20 | 6.25 | 2.68 | 2.82 | 4.90 | 4.33 | 8.08 | – | NO |
| T2 | 7.34 | 5.65 | 2.15 | 2.42 | 4.88 | 4.25 | 1.53 | 13.82 | – |
| G3 | 7.93 | 5.59 | 2.73 | 2.81 | 5.04 | 4.42 | – | 12.65 | NO |
| A4 | 8.21 | 6.26 | 2.72 | 2.93 | 5.08 | 4.54 | 7.82 | – | 6.74 |
| C5 | 7.22 | 5.60 | 2.05 | 2.38 | 4.82 | 4.24 | 5.27 | – | 8.18–6.63 |
| G6 | 7.81 | 5.96 | 2.60 | 2.79 | 4.94 | 4.41 | – | 12.77 | NO |
| T7 | 7.26 | 6.03 | 2.11 | 2.50 | 4.87 | 4.27 | 1.41 | 13.82 | – |
| C8 | 7.51 | 5.64 | 2.02 | 2.39 | 4.86 | 4.18 | 5.75 | – | 8.59–7.06 |
| A9 | 8.35 | 6.33 | 2.80 | 2.89 | 5.05 | 4.47 | 7.92 | – | NO |
| T10 | 7.33 | 6.13 | 2.20 | 2.20 | 4.55 | 4.11 | 1.65 | 13.80 | – |
| Dodecamer (CG12) | | | | | | | | | |
| G1 | 7.85 | 5.50 | 2.41 | 2.61 | 4.82 | 4.16 | – | 15.58 | NO |
| A2 | 8.22 | 5.79 | 2.74 | 2.84 | 5.06 | 4.39 | 7.44 | – | NO |
| A3 | 8.12 | 5.85 | 2.63 | 2.84 | 5.06 | 4.45 | 7.20 | – | NO |
| A4 | 8.04 | 5.89 | 2.58 | 2.85 | 5.06 | 4.45 | 7.15 | – | NO |
| A5 | 8.00 | 6.06 | 2.50 | 2.83 | 5.00 | 4.45 | 7.62 | – | NO |
| C6 | 7.09 | 5.55 | 1.89 | 2.31 | 4.77 | 4.15 | 5.06 | – | 7.72–6.25 |
| G7 | 7.79 | 5.98 | 2.61 | 2.81 | 4.95 | 4.38 | – | 12.50 | NO |
| T8 | 7.26 | 6.07 | 2.17 | 2.61 | 4.87 | 4.29 | 1.35 | 13.95 | – |
| T9 | 7.51 | 6.20 | 2.23 | 2.68 | 4.95 | 4.25 | 1.62 | 13.90 | – |
| T10 | 7.53 | 6.20 | 2.22 | 2.68 | 4.95 | 4.25 | 1.71 | 13.83 | – |
| T11 | 7.48 | 6.20 | 2.21 | 2.58 | 4.95 | 4.21 | 1.77 | 13.93 | – |
| C12 | 7.70 | 6.32 | 2.31 | 2.31 | 4.62 | 4.07 | 5.88 | – | 8.20–7.07 |

as is commonly observed for a 3′ terminal residue. Unlike the others, the G3 H3′–H4′ coupling is hardly detectable; this could indicate that the phase value is high.

We have chosen to use the empirical relationship: ε–ζ = 254.5 + 72.8 δ31P [31] to gain access to backbone information from the 31P chemical shifts (Table 2). This straightforward approach is a good alternative to ³J(H3′–P) coupling constant estimation. Indeed, previous work [19] has shown that it is difficult to obtain these coupling constants accurately because of the width of the lines associated with proton-detected heterocorrelation experiments with a selective pulse on the H3′ protons [58]. The fact that the phosphorus signals are broad (500 MHz spectrometer) and the ³J(H3′–P) coupling constants small leads to an overestimation of values of the latter. The ε–ζ values for T2pG3, C5pG6 and C8pA9 junctions, obtained in this way, are remarkable and correspond to the three deshielded peaks (> −4 p.p.m.).

Dodecamer. Sequential assignment is achieved using NOESY experiments performed at 300 ms in D2O. The combination of NOESY base-H2′/H2″ crosspeak intensity, H1′–H2′/H2″ scalar couplings observed in the HOHAHA experiment and multiplet morphology of the DQF-COSY spectrum (H1′–H2′/H2″ region) makes it possible to distinguish H2′ from H2″ protons. Despite the fact that there are four identical nucleotides within the adenine or thymine tracts, all protons except H5′ and H5″ are assigned (Table 1). The H1′/H3′ proton resonances for the thymines 9, 10 and 11 overlap strongly. The exchangeable protons as well as the adenine H2 protons (Fig. 1) are assigned using the classical methods in H2O/D2O for nucleic acids [52,54]. Methyl proton resonances are sufficiently spread out for it to be possible to identify unambiguously thymine imino peaks from the NOESY spectrum. The peak arising from G1 imino is broad and no correlation is observed on the NOESY map, as expected for a terminal residue. As usually found, amino protons of guanines are not detected because of intermediate rotation about the C–N bond. More interestingly, the amino protons of adenines are not observed as a consequence of a conformational effect as a result of the adenine tract [54].

Conformational analysis of sugar from NMR data is not discussed for terminal units. The lack of H2″–H3′ DQF-COSY correlation confirms that the sugar pucker is in the C2′-endo range. The H3′–H4′ crosspeaks can be sorted into three groups: weak (A2 and G7), medium (A3, A4 and A5) and strong (C6 and T8) to which correspond various values of P. The lowest intensities are associated with the highest P-values in this pucker domain. Classification of T9, T10 and T11 is not possible because of substantial overlap of the corresponding multiplets. In contrast, the crosspeaks belonging to the central ACGT are well separated.

All phosphate groups exhibit upfield shifts (< −4 p.p.m., except the terminal T11pC12 junction at −3.9 p.p.m.). The 31P resonances are easily identified from their connectivity with the H4′ protons on the 3′ side, even in the thymine tract.

Comparison of the ACGT step in the two oligonucleotides. The focal point of this study is the ACGT conformation. The main features of this tetramer observed by NMR emphasize the impact of neighbouring nucleotides: (a) while the four H3′–H4′ DQF-COSY crosspeaks exhibit similar low intensities for CG10, these connectivities appear with various intensities for CG12; (b) a deshielding of all the cytosine protons is observed for CG10 relative to CG12 while protons of the guanine remain roughly unchanged (the downfield shift could be associated with a stacking change of the cytosine within the oligomer as a consequence of modified helical parameters); and (c) the 31P signal assigned to the CpG step is strongly deshielded for CG10 but only slightly for CG12, whereas the other junctions of the tetrad are standard according to 31P NMR criteria for both palindromes (Table 2).

Table 2. Phosphorus chemical shifts of d(ATGACGTCAT)2 and d(GAAAACGTTTTC)2 (in p.p.m.). Determined from 2D heteronuclear COLOC spectra recorded in D2O at 301 K. Phosphate buffer, I = 50 mM, pH = 7.4 + 1 mM EDTA. Phosphorus chemical shifts are referred to trimethylphosphate at 0.0 p.p.m. Torsional angle difference ε–ζ (in degrees) is calculated by means of the empirical equation: ε–ζ = 254.5 + 72.8 δ31P [19].

| Decamer (CG10) phosphorus | δ | ε–ζ | Dodecamer (CG12) phosphorus | δ | ε–ζ |
|---|---|---|---|---|---|
| – | – | – | G1pA2 | −4.02 | −38.2 |
| A1pT2 | −4.26 | −55.6 | A2pA3 | −4.13 | −45.8 |
| T2pG3 | −3.82 | −23.6 | A3pA4 | −4.21 | −52.0 |
| G3pA4 | −4.05 | −40.3 | A4pA5 | −4.27 | −56.0 |
| A4pC5 | −4.24 | −54.2 | A5pC6 | −4.19 | −50.2 |
| C5pG6 | −3.93 | −31.6 | C6pG7 | −4.03 | −38.9 |
| G6pT7 | −4.42 | −67.3 | G7pT8 | −4.44 | −68.7 |
| T7pC8 | −4.21 | −52.0 | T8pT9 | −4.38 | −64.4 |
| C8pA9 | −3.87 | −27.2 | T9pT10 | −4.31 | −58.9 |
| A9pT10 | −4.20 | −51.3 | T10pT11 | −4.22 | −52.4 |
| – | – | – | T11pC12 | −3.91 | −30.1 |

Fig. 1. 500 MHz NOESY spectrum of d(GAAAACGTTTTC)2 in H2O/D2O (90/10) at 283 K. F2 region: 5.5–8.5 p.p.m., F1 region: 1.0–8.5 p.p.m.

Molecular modelling

Generation of starting structure. Armed with the NMR observations, initial structures are generated by scanning the P-value only from 120° to 220° in 10° steps (see Materials and methods) and are energy optimized. Three and six minima are obtained for CG10/CG12, respectively. On the basis of structural parameters (helicoidal, sugar pucker, backbone torsion angle) as well as rms calculations, only two and three distinct structures are retained for the respective DNA sequences (termed CG10a,b/CG12a–c, respectively, Table 3). For any two selected starting structures, the rms is greater than 0.4 Å. For the decamer, the deshielded phosphorus atoms could be interpreted as a mark of a BII state within the three junctions T2pG3, C5pG6 and C8pA9. These energetic and conformational aspects of BI–BII transitions have been carefully investigated for this DNA fragment [44]. Results of this study led the authors to conclude that BII conformations are incompatible with the solution structure of the CRE sequence. For the dodecamer, the 31P NMR data undoubtedly prove that all junctions are BI.

Determination of interproton distances from NOESY experiments. Crosspeak volumes of each NOESY experiment are integrated with the help of GIFA, as described in Materials and methods. By means of the distance extrapolation method [34,35] a first set of distances is evaluated. Each distance is determined by scaling the initial NOE build-up rate of the corresponding crosspeak to that found for the cytosine H5–H6 with the distance fixed at 2.46 Å. As input constraints for the refinement procedure, distances are applied as lower and upper bounds corresponding to ± 10% of the distance of interest. Distances involving methyl groups are ignored.
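The scaling to the cytosine H5–H6 reference and the ± 10% bounds can be illustrated with the sketch below. It assumes the usual r⁻⁶ dependence of the initial NOE build-up rate for an isolated spin pair and does not reproduce the extrapolation over mixing times itself; the function names and the build-up rates in the example are hypothetical.

```python
R_REF = 2.46  # angstrom, cytosine H5-H6 reference distance quoted in the text


def noe_distance(rate, rate_ref, r_ref=R_REF):
    """Distance from an initial build-up rate, assuming rate ~ r**-6."""
    return r_ref * (rate_ref / rate) ** (1.0 / 6.0)


def distance_bounds(d, tolerance=0.10):
    """Lower and upper bounds at +/- 10% of the estimated distance."""
    return d * (1.0 - tolerance), d * (1.0 + tolerance)


# Illustrative (made-up) build-up rates in arbitrary units.
d = noe_distance(rate=1.0, rate_ref=8.0)
print(round(d, 2), tuple(round(x, 2) for x in distance_bounds(d)))
```

Because of the sixth root, an eightfold weaker build-up rate corresponds to a distance only about 1.4 times longer than the reference.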

Back-calculation of NOESY spectra and iterative refinement. Back-calculations of NOESY correlations are performed with the help of MORASS as detailed in the Materials and methods section above. The refinement procedure follows a five-step strategy [16]. In the first step, only sugar phase angle constraints are applied, P being the most significant parameter for defining the structural conformation of DNA. Instead of scalar couplings, we have preferred to define the sugar puckering by use of NOE-derived H1′–H4′ distances, which are known to be highly dependent on the sugar phase angle. We have carefully checked that they are consistent with the phase ranges estimated by inspection of H3′–H4′ crosspeak intensity in the DQF-COSY spectra (see NMR results). Moreover, interproton distances provide more accurate constraints than scalar coupling and have the advantage of being refined. In the second stage, the backbone structure is limited by imposing the angle difference ε−ζ with bounds of ±10°. In the final steps, supplementary interproton distances are added to the constraints list and refined to progressively delimit the structure. First, intranucleotide distances, namely H6/H8–H1′/H2′/H2″ and H1′–H2′/H2″, are supplied. The most structurally significant internucleotide distances, e.g. H6/H8–H1′/H2′, are then introduced. H6/H8–H2″ interresidue distances are applied at the final stage of the refinement procedure as they are known to be overestimated.

For the decamer, both initial structures (namely CG10a,b; rms = 1.00 Å) converge after imposition of the ε−ζ constraints. At the end of the refinement protocol, the R-factor calculated for 71 volumes is 0.29 (mixing time = 160 ms, see Materials and methods). The counterpart of this R-factor value is a high cost in energy (159 kJ·mol⁻¹) relative to the initial structures (Table 3).

For the dodecamer, refinement of H1′–H4′ interproton distances leads to the convergence of two of the three initial conformers (all of similar energy, with pairwise rms between 0.38 and 0.58 Å; Table 3). The two resulting conformers converge only at the third step of the procedure (intraresidue distances). The final structure has an R-factor of 0.36. This result is calculated for 82 crosspeak volumes at a mixing time of 80 ms. The energy cost of the refinement is 96 kJ·mol⁻¹.
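The R-factor measures the agreement between experimental and back-calculated crosspeak volumes. Its exact expression is given in Materials and methods and is not restated in this section; the sketch below assumes one commonly used form, the sum of absolute volume differences divided by the sum of experimental volumes, and the function name and the volumes in the usage example are hypothetical.

```python
def r_factor(v_exp, v_calc):
    """Agreement factor between experimental and back-calculated NOESY
    volumes: sum(|V_exp - V_calc|) / sum(|V_exp|).
    This particular form is an assumption, not taken from the text."""
    if len(v_exp) != len(v_calc) or not v_exp:
        raise ValueError("volume lists must be non-empty and of equal size")
    denominator = sum(abs(v) for v in v_exp)
    if denominator == 0:
        raise ValueError("experimental volumes sum to zero")
    return sum(abs(e - c) for e, c in zip(v_exp, v_calc)) / denominator

if __name__ == "__main__":
    # Hypothetical volumes in arbitrary units
    print(r_factor([10.0, 20.0], [8.0, 23.0]))
```

Under this definition a value of 0 means perfect agreement, and the drop from 0.74 or 0.91 to 0.29 for the decamer (Table 3) corresponds to a marked improvement of the fit to the 71 measured volumes.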

Conformational analysis. Structural parameters and a molecular view for each refined structure are presented in Table 4 and Fig. 2.

1-Decamer. All sugar puckers lie in the C2′-endo domain except the terminal T10 sugar (C1′-exo). Puckering amplitudes are greater than the mean value for canonical B-DNA and may be correlated to long intranucleotide H1′–H2′ distances. A few helical parameters depart from those of typical B-type DNA. Large rises (average = 4.11 Å) in conjunction with small twists (33.6°) are observed. The positive sign of the propeller twists is more notable than the high values of this parameter (average 5.3°) except for the central C-G base pairs where they are positive and high (≈10°). Because large negative inclinations (−21.5°) are present all along the double helix, the stacking interaction should not be lost. In contrast, the three negative rolls located at the T2-G3, C5-G6 and C8-A9 steps induce a loss of stacking at these points of the nucleotide chain. The calculated cost of these stacking disruptions is half the classical stacking energies [59]. Remarkably, the three associated junctions, T2pG3, C5pG6 and

Table 3. Calculated R-factor and rms for the starting structures (CG10a,b and CG12a,b,c) and the refined structures (CG10 and CG12). Energy is given in MJ·mol⁻¹.

Conformer    Energy    R-factor    rms
Decamer
CG10a        −1.995    0.74        CG10a/CG10b = 1.00
CG10b        −1.944    0.91
CG10         −1.835    0.29
Dodecamer
CG12a        −2.368    0.53        CG12a/CG12b = 0.38
CG12b        −2.364    0.56        CG12b/CG12c = 0.55
CG12c        −2.365    0.61        CG12c/CG12a = 0.58
CG12         −2.271    0.36

Fig. 2. Refined decamer (CG10) and dodecamer (CG12) structures.


C8pA9, are characterized by ε−ζ values between −30° and −42°. Corresponding twists are higher than the average twist of this fragment (38.5°, 36.1° and 38.5°, respectively) but remain consistent with the canonical 36° twist. These atypical structural parameters are in good agreement with results published previously [44] where it has been shown that while the CRE sequence does not contain any BII junctions, a few steps, particularly the central CpG, easily undergo the BI–BII

Table 4. Structural parameters for the refined structures CG10 and CG12. Values for parameters which are close to 0 have been excluded (stretch, opening, stagger, shear).

Decamer

Nucleotide   Xdisp   Ydisp  Inclin     Tip  Propel Opening   Step        Rise    Tilt    Roll   Twist   Angle
A1-T11        −0.8     0.2   −20.7    −0.8     5.8     0.3   A1/T2        4.0     1.4     4.2    30.3     3.0
T2-A12        −1.0     0.1   −19.5     0.4     0.3     1.8   T2/G3        4.0    −0.6    −8.7    38.5     2.8
G3-C13        −0.9     0.3   −20.6    −5.5     4.3     0.9   G3/A4        4.1    −1.4     8.9    31.7     5.0
A4-T14        −1.1     0.1   −22.6    −1.5     6.2    −0.7   A4/C5        4.3    −1.2     5.3    32.8     2.8
C5-G15        −1.0     0.0   −24.2     1.0     9.9     0.8   C5/G6        4.2     0.0    −4.9    36.1     2.9
G6-C16        −1.0     0.0   −24.2    −1.0     9.9     0.8   G6/T7        4.3     1.2     5.3    32.8     2.8
T7-A17        −1.1    −0.1   −22.6     1.5     6.2    −0.8   T7/C8        4.1     1.4     8.9    31.7     5.0
C8-G18        −0.9    −0.3   −20.6     5.5     4.3     0.9   C8/A9        4.0     0.6    −8.7    38.5     2.8
A9-T19        −1.0    −0.1   −19.5    −0.4     0.3     1.7   A9/T10       4.0    −1.4     4.2    30.3     3.0
T10-A20       −0.8    −0.2   −20.7     0.8     5.8     0.3
Average       −1.0     0.0   −21.5    −0.0     5.3     0.6                4.1     0.0     1.6    33.6    0.5a

Nucleotide   Phase   Ampli  Pucker        Chi   Gamma   Delta   Epsil    Zeta   Alpha    Beta Eps-Zet
A1           155.5    44.3  C2′-endo   −128.8     0.0   140.9  −171.3  −113.8   −66.8  −175.5   −57.5
T2           154.2    48.8  C2′-endo   −117.1    56.9   142.5  −166.7  −133.5   −55.7   168.0   −33.2
G3           158.0    48.6  C2′-endo   −110.7    65.0   145.2  −172.7  −122.5   −58.2   179.3   −50.2
A4           153.2    45.6  C2′-endo   −120.2    51.4   140.0  −176.1  −113.3   −62.8  −178.0   −62.8
C5           152.4    48.3  C2′-endo   −126.3    59.3   140.9  −171.0  −128.9   −55.0   176.9   −42.1
G6           160.0    44.1  C2′-endo   −120.7    59.0   143.7  −176.2  −112.7   −61.9  −173.4   −63.4
T7           154.7    45.2  C2′-endo   −121.1    54.5   140.8  −173.7  −111.7   −63.8  −176.7   −62.1
C8           151.4    47.4  C2′-endo   −122.8    54.6   139.8  −167.4  −137.6   −54.9   172.2   −29.9
A9           167.8    38.9  C2′-endo   −110.5    63.4   144.8  −168.5  −107.2   −62.5   175.0   −61.3
T10          137.7    42.1  C1′-exo    −119.2    52.3   127.5
Average      154.5    45.3  C2′-endo   −119.7    57.4   142.1  −171.5  −120.1   −60.2   178.6   −51.4

Dodecamer

Nucleotide   Xdisp   Ydisp  Inclin     Tip  Propel Opening   Step        Rise    Tilt    Roll   Twist   Angle
G1-C13        −1.3     0.3    −5.0     0.9    11.8     0.0   G1/A2        3.7     8.4    −4.5    40.2     5.2
A2-T14        −1.1     0.2     0.3     0.6   −11.3     3.8   A2/A3        3.3     0.3    −1.1    40.0     3.2
A3-T15        −1.3     0.4     3.7     0.4   −25.3     0.9   A3/A4        3.3     0.3     6.4    34.2     4.6
A4-T16        −1.2     0.4     4.4     2.1   −28.8    −0.6   A4/A5        3.2    −2.7     5.8    36.1     4.5
A5-T17        −1.5     0.2     2.5     3.5   −22.6    −3.8   A5/C6        3.4    −0.1    −0.2    31.1     0.9
C6-G18        −1.2    −0.2     2.4     2.4   −14.3    −0.5   C6/G7        3.7     0.0    −7.6    43.8     2.8
G7-C19        −1.2     0.2     2.4    −2.4   −14.3    −0.5   G7/T8        3.3     0.1    −0.3    31.1     0.9
T8-A20        −1.5    −0.2     2.5    −3.5   −22.5    −3.7   T8/T9        3.2     2.7     5.9    36.0     4.6
T9-A21        −1.2    −0.4     4.4    −2.1   −28.8    −0.6   T9/T10       3.3    −0.2     6.4    34.2     4.6
T10-A22       −1.3    −0.4     3.8    −0.4   −25.4     0.9   T10/T11      3.2    −0.4    −1.1    40.0     3.1
T11-A23       −1.1    −0.2     0.3    −0.6   −11.2     3.8   T11/C12      3.7    −8.4    −4.5    40.2     5.2
C12-G24       −1.3    −0.3    −5.0    −1.0    11.9     0.0
Average       −1.3     0.0     1.4     0.0   −15.1    −0.0                3.4     0.0     0.5    37.0   14.8a

Nucleotide   Phase   Ampli  Pucker        Chi   Gamma   Delta   Epsil    Zeta   Alpha    Beta Eps-Zet
G1           157.1    41.4  C2′-endo   −131.3     0.0   140.3  −163.7  −115.6   −67.9  −176.5   −48.1
A2           171.7    40.7  C2′-endo   −102.2    56.3   148.1  −169.4  −133.4   −66.2   173.2   −36.0
A3           179.2    31.7  C2′-endo   −103.8    62.0   143.8  −164.1  −101.8   −64.9   173.2   −62.3
A4           159.1    45.0  C2′-endo    −94.5    49.0   143.7  −172.0  −126.1   −68.0   177.6   −45.9
A5           166.5    40.3  C2′-endo   −104.0    58.0   145.2  −167.3  −107.1   −67.1   167.4   −60.2
C6           129.4    41.6  C1′-exo    −122.9    60.2   121.5  −166.9  −126.7   −62.9   172.1   −40.1
G7           178.9    35.4  C2′-endo   −104.3    65.1   146.9  −177.5  −104.7   −61.2   172.8   −72.9
T8           134.9    38.5  C1′-exo    −120.0    58.0   124.6  −168.7   −94.8   −66.3   170.7   −73.9
T9           117.3    39.2  C1′-exo    −119.2    58.6   113.0  −171.2  −102.1   −62.1   169.9   −69.1
T10          146.4    40.8  C2′-endo   −111.4    59.1   133.0  −170.8  −108.7   −66.1   167.0   −62.0
T11          152.2    45.9  C2′-endo   −111.9    64.5   139.5  −169.7  −129.6   −61.1   171.4   −40.0
C12          161.3    36.2  C2′-endo   −128.7    58.2   139.6
Average      154.5    39.7  C2′-endo   −112.9    59.0   136.6  −169.2  −113.7   −64.9   172.6   −55.5

a Given value corresponds to the global curvature of the helical axis.


transition. No curvature of the helical axis is observed. DNA axis curvature is one of the structural deformations associated with a fixed BII state [44]. The successive rotations of all bases, namely inclinations, along the helix involve a narrowing of the minor groove (decreased width and increased depth).

This CG10 conformation departs strikingly from results published previously [15] on a dodecamer including the CRE motif. In their structural study, Mauffret et al. first generated two initial geometries. Then, in order to obtain a molecular model in agreement with experimental data, they applied a set of NMR constraints including: (a) 1H-1H and 1H-31P coupling constants, measured by COSY experiments, for the sugar pseudorotation angles and ε backbone dihedral angles, respectively, and (b) NOE-derived interproton distances. Constraints were supplied using upper and lower error bounds. Distances were not refined. They observed that one of their initial conformers (i.e. freely energy optimized) was not greatly modified by NMR-restrained modelling. This freely minimized conformer is well correlated with their NMR data. The structure reported by Mauffret et al. (termed CRE12 for convenience) agrees well with one of the starting geometries presented in this work. The main structural features (sugar puckers, helical parameters and backbone dihedral angles) are very similar. In particular, limiting the comparison to the central ACGT part, we can mention: (a) a high central guanine phase angle (≈180°); (b) identical CpG rise and twist values (≈3.5 Å and ≈43°); and (c) a small negative value for the ε−ζ parameter at the CpG junction. Unfortunately, the refinement of CG10 leads to a loss of this conformation in favour of that discussed above. In other words, the structure common to both studies (Mauffret et al. and this work) does not stand up to refinement, as after the second stage it is diverted to the structure presented in Table 4. Consequently, the structural parameters of CG10 differ markedly from those found for CRE12. Among other variations, CpG rise, twist and ε−ζ angle, inclinations and propeller twists are considerably modified. The positive roll located at CpG in CRE12 becomes negative in CG10. The same shift of the CpG roll has been reported previously by Lefebvre et al. [16], namely a negative roll for a refined structure, as against a positive roll for an imposed BI-type junction, the ε−ζ values being kept the same for these two sets of 3D simulations.

Finally, the high energy of the refined model requires comment. The quite acceptable R-factor value indicates that the 3D simulation is compatible with the experimental data. The following straightforward interpretation is proposed: although no junction is in a BII state, three junctions can easily undergo a BI–BII transition. Consequently, the solution structure could be malleable and could include a small percentage of strands with BII junctions.
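The BI/BII assignment discussed above rests on the sign of the ε−ζ difference. A minimal sketch follows, assuming the usual convention that a negative difference (near −90°) denotes BI and a positive one (near +90°) denotes BII. The function names are hypothetical; the ε and ζ values are those listed for the decamer in Table 4.

```python
def wrap(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    a = angle % 360.0
    return a - 360.0 if a > 180.0 else a

def eps_minus_zeta(eps, zeta):
    """Epsilon minus zeta difference (degrees), wrapped to (-180, 180]."""
    return wrap(eps - zeta)

def backbone_state(eps, zeta):
    """'BI' for a negative difference, 'BII' otherwise (assumed convention)."""
    return "BI" if eps_minus_zeta(eps, zeta) < 0 else "BII"

# (epsilon, zeta) in degrees for three decamer junctions, from Table 4
JUNCTIONS = {
    "T2pG3": (-166.7, -133.5),
    "C5pG6": (-171.0, -128.9),
    "C8pA9": (-167.4, -137.6),
}

if __name__ == "__main__":
    for name, (eps, zeta) in JUNCTIONS.items():
        print(name, round(eps_minus_zeta(eps, zeta), 1), backbone_state(eps, zeta))
```

All three junctions come out as BI, but with differences of only about −30° to −42°, well short of the typical BI value, which is consistent with the idea that they sit close to the BI–BII transition.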

2-Dodecamer. All the structural parameters are close to those found in a B-type oligomer except for the propeller twists, which are high (average = −15°), this latter feature being especially pronounced in the adenine tract (−22.5° to −28.8°). Structural properties associated with a homopolymeric run of A-T base pairs have been described for the crystallographically determined fragments d(CGCAAAAAAGCG)-d(GCGTTTTTTCGC) [20] and d(CCAACGTTGG)2 [17]. They are: (a) a high propeller twist at each A-T base pair, which improves the overall stability of the helix by creating additional interstrand non-Watson–Crick hydrogen bonds [20]; (b) smaller rises and increased twists compared with those found in a canonical B-like DNA; and (c) helicoidal distortions such as a B-DNA bending or a narrowing of the minor groove, which result from the local motions mentioned above. For the dodecamer studied, a global axis curvature of 14.8° is found. Rise and twist do not depart from standard values within the A-T tracts, whereas they have high values for the C6pG7 step (3.7 Å and 43.8°, respectively). Some sugars, namely C6, T8 and T9, leave the C2′-endo domain towards the C1′-exo puckering mode. A high negative roll (−7.6°) between the base pairs involving C6 and G7 is associated with a loss of stacking energy in this region (<29.19 kJ·mol⁻¹).
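The pucker labels attached to the phase angles can be reproduced with a short sketch. It assumes the standard pseudorotation wheel divided into ten 36° sectors, a convention that is not restated in this section but that agrees with every pucker label listed in Table 4; the names `SECTORS` and `pucker` are hypothetical.

```python
# Ten 36-degree sectors of the pseudorotation wheel, starting at P = 0
SECTORS = [
    "C3'-endo", "C4'-exo", "O4'-endo", "C1'-exo", "C2'-endo",
    "C3'-exo", "C4'-endo", "O4'-exo", "C1'-endo", "C2'-exo",
]

def pucker(phase):
    """Sugar pucker label for a pseudorotation phase angle P in degrees."""
    return SECTORS[int((phase % 360.0) // 36.0)]

if __name__ == "__main__":
    # Phase angles of the central dodecamer residues, from Table 4
    for residue, phase in (("A5", 166.5), ("C6", 129.4), ("G7", 178.9),
                           ("T8", 134.9), ("T9", 117.3)):
        print(residue, phase, pucker(phase))
```

On this scale C6, T8 and T9 fall in the 108°–144° sector (C1′-exo), whereas the neighbouring residues stay in the 144°–180° sector (C2′-endo).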

Comparison of the central ACGT step. In order to analyse the impact of the sequential alignment on the CpG step, we now focus our discussion on the ACGT tetramer. The minor groove is narrower at the CpG within the ACGT tetrad for both CG10 and CG12. This morphology has already been observed by Privé et al. [17] for the crystallographically determined d(CCAACGTTGG)2 structure. On the basis of this observation and considering that crystal packing forces could modulate the conformation, we propose that the decrease in the minor groove width is a result of a purACGTpyr sequence specificity. The same profile of ε−ζ values along the central backbone observed for both palindromes is in perfect agreement with previous results [15] where a low negative value for CpG is sandwiched between two high negative values of this variable. In line with this remarkable ε−ζ, there is a high negative value of the roll correlated with a destacking of the C-G base pairs. Experimental data reported by several authors for CpG roll in various DNA sequences are given in Table 5. Examination of these data allows the following observations: (a) a tract either of purines or of pyrimidines juxtaposed to the CpG motif favours a negative roll at this point, and the longer the tract the more negative the roll value; (b) the impact of a purine tract on the 5′-CpG side is more marked than that of a pyrimidine tract; (c) neighbouring successive pyrimidines, however, are more effective than a single purine in increasing a negative roll at CpG. Since the sugar pucker mode is highly indicative of DNA morphism, the associated parameters are of considerable interest. In the central ACGT motif, the phase angle profiles shown in Fig. 3 are strikingly dissimilar. Whereas one is flat (CG10), the other (CG12) is jagged with a high–low–high–low series of phase angles. As a consequence, the C6 and T8 sugars of the dodecamer reach the C1′-exo domain and the G7 sugar reaches the C3′-endo domain. In the same way, for CG10, the twist and glycosidic angles are quite regular but for CG12 they are scattered on either side of the average value (Fig. 3 and Table 4). These alternating profiles (phase, χ and twist) concerning the 12-mer have already been described for other sequences which contain an ACGT tetrad [15–17]. The atypical base parameters described above for the whole sequences are also observed for the central bases of interest. For the decamer, the large rises and inclinations already mentioned are noticeably more pronounced at the central subunit of the helix. These structural criteria remain canonical for CG12. On the contrary, the propeller twists of the C-G base pairs are unusually high,

Table 5. Selected roll values for CpG step within various DNA sequences from this work and literature data. Nucleotides containing purine bases are underlined.

DNA sequence     Roll    References
GTACGTAC         −0.9    [35]
CTTCGAAG         −2.0    [47]
ATGACGTCAT       −4.9    CG10 (this work)
GAAAACGTTTTC     −7.6    CG12 (this work)
CATCGATG         +3.6    [47]


Fig. 3. Phase, twist and glycosidic angles of the central ACGT tetrad are compared for CG10 (B) and CG12 (W).

Table 6. Comparison of methods for a 3D-homology search performed on the ACGT tetrads of the two refined conformers: CG10 and CG12. The number of atoms contained in one ACGT tetramer is 254 (here, N1 = N2). The percentage of 3D homology is determined by the SDM algorithm with:

$$\% = \frac{N_c}{(N_1 + N_2)/2} \times 100$$

where Nc is the number of common atom pairs; N1 and N2 are the numbers of atoms in the two fragments compared.

Computation method          Rms     Similarity    CSR
Nc                          32      41            74
Nc with identical labels    4       13            50
3D-homology percentage      12.6    16.1          29.1

Fig. 4. Superposition of the refined ACGTs (CG10 and CG12) obtained with: (A) rms, (B) similarity procedure and (C) CSR method.


because of the proximity of tracts rich in A-T. To a lesser extent, the situation is just the opposite for CG10 where the highest propeller twists are located on paired CG.

In brief: (a) the CG12 central tetrad exhibits the same conformational properties as those described for other ACGT in published studies; (b) as demonstrated by the striking differences in all the sugar pucker, helicoidal and backbone parameters, the two ACGT units are conformationally dissimilar. The calculated rms value for the four nucleotides (1.5 Å) confirms this structural heterogeneity.

3D-Homology search

The standard rms and the similarity algorithms afford an overall numerical criterion to readily picture how far from (or close to) each other two structures are. The similarity procedure has the clear advantage over the rms analysis that it allows comparison between molecules with different types and numbers of atoms (as for the oligomers in this study), with the consequence that structural similarity can be shown in chemically different fragments.

Computation of the percentage of homology was first restricted to the ACGT motif common to CG10 and CG12. In this case, use of the rms procedure is legitimate. Thus, three series of calculations are run on the ACGTs of CG10 and CG12 by using either standard rms followed by the SDM step, which determines Nc, the similarity procedure, or the CSR method (see Materials and methods for details). The results show poor 3D-homology (ranging from 30% to 13% using CSR and rms procedures, respectively) (Table 6). However, the similarity procedure and CSR clearly give the best 3D-homology search: they find more common atom pairs and properly align atoms with identical labels. When the CSR is performed, the substructure is improved as the 3D-homology percentage reaches 29.1 (versus 12.6 from the rms calculation alone) with 74 (versus 32) common atom pairs, of which 50 (versus 4) are identically labelled atom pairs (Fig. 4).
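The percentages quoted above follow directly from the number of common atom pairs and the fragment sizes given in Table 6 (254 atoms per ACGT tetramer). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def homology_percentage(n_common, n1, n2):
    """3D-homology percentage: common atom pairs Nc divided by the mean
    number of atoms of the two fragments, times 100."""
    mean_size = (n1 + n2) / 2.0
    if mean_size <= 0:
        raise ValueError("fragments must contain at least one atom")
    return 100.0 * n_common / mean_size

if __name__ == "__main__":
    # Nc values from Table 6; each ACGT tetramer contains 254 atoms
    for method, n_common in (("rms", 32), ("similarity", 41), ("CSR", 74)):
        print(method, round(homology_percentage(n_common, 254, 254), 1))
```

Normalizing by the mean fragment size keeps the measure symmetric in the two structures and lets it be applied to fragments with different atom counts, as in the comparison of the complete oligomers.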

In a second step, estimation of the structural homology was extended to the whole CG10 and CG12 molecules using only the CSR and the similarity procedures, the rms being inappropriate in this context. The percentage of homology drops significantly with respect to that found in the central common tetrad (13% with CSR, Nc = 92) in line with experimental observation. From the overall similarity calculation, based on the comparison of the double-stranded CG10 and CG12, data relative to the homology between each single nucleotide chain in CG10 and its homologous counterpart in CG12 have been extracted and plotted (Fig. 5). Each correlation is marked with a circle whose size depends on the number of superposed atoms. The graph underlines that similarity arises from overlapping of bases having either the same rank or being along both staggered homologous chains (light grey circles and dark grey circles, respectively). The presence of a slight curvature in the CG12 helical axis while CG10 is quite straight, combined with the large rises observed for the decamer, explains both types of correlation. When the systems are freely aligned, atom pairs are distributed quite regularly along the helices without the overlap between the ACGT motifs being particularly strong. Careful inspection of the results emphasizes that pairs favour a majority of purine-purine or pyrimidine-pyrimidine linkages (76% of Nc). The four nucleotides of ACGT in the decamer are paired with their homologues in the dodecamer to a lesser extent, as only 24 atom pairs are involved in the superposition of the ACGT motif (26% of Nc while ACGT accounts for 40% of the full number of CG10 nucleotides). A few 'off-diagonal' correlations concerning strand 1 of CG12 with strand 2 of CG10 must not be seen as an artefact of the method, as one of the DNA chains (CG12) is curved while the other is straight.

These results show clearly that no common substructure can be extracted from the CG10/CG12 alignment. The pronounced conformational dissimilarity of the two tetramers prevents a real superimposition of this motif.

Conclusion

Determination of the solution structure of CG10 and CG12 by means of molecular modelling under NMR constraints, followed by refinement of interproton distances using computation of the full relaxation matrix, leads to a single conformation for each oligomer. The good agreement between these optimized structures and the experimental data is reflected by the R-factor values of 0.29 and 0.36 for CG10 and CG12, respectively. Conformational analysis of the central ACGT tetrad in both sequences shows that the structure of CpG is not locked by its nearest environment (A and T) but depends on its distant surrounding in a sequence-specific manner. The tetrad in CG12

Fig. 5. Best result of the 3D-homology search is represented as a graph displaying the nucleotide correlation between the two oligonucleotides: CG12 (X axis) vs. CG10 (Y axis). The two strands belonging to the same DNA sequence are distinguished and nucleotides are numbered from 1 to 24 (CG12) or 1–20 (CG10). Each correlation is marked with a circle whose size represents the number of superposed atoms. Light grey circles indicate correlations between nucleotides of the same rank. Dark grey circles are positioned along a line (dotted line) so that internucleotide correlations correspond to a shift of CG10 along CG12 in such a way that the two ACGTs are face to face. White empty circles are displayed for all other atom pairs.


exhibits the standard B-type helix parameters previously observed in ACGT-containing sequences (with the exception of high propeller twists as a result of the presence of the A tract). In contrast, ACGT in CG10 adopts a structure which depends on its distal nucleotide environment, thus giving rise to a specific geometry extending beyond the central tetrad and which may be recognized as the octameric CRE site where specific binding of the cognate protein, CREB, occurs. The significant loss of stacking energy between base pairs T2-A9 and G3-C8 (and symmetrically, A9-T2 and C8-G3), which would make intercalation at these sites favourable, is in line with the observation that the two inverted ATGA sequences in CRE are the binding sites for CREB [60]. Furthermore, the energy of the refined conformer of CG10 (−1.835 MJ·mol⁻¹) lies far above that of the optimized CG12 conformer (−2.271 MJ·mol⁻¹) with respect to their starting structures (−1.994 and −2.35 MJ·mol⁻¹, respectively). This result is consistent with the observation that CRE easily adapts its geometry (by bending) in order to enhance its binding to CREB [61,62].

From this work, the following interpretation concerning the CpG step is proposed: malleability of this site is confirmed. Depending on the fluidity of at least three or four juxtaposed nucleotides on both sides, the CpG malleability will be conserved or lost. A pliable nucleotide surrounding, as in CRE, would favour a conformational change of the CpG when it faces the target protein. A rigid nucleotide environment, as (A)n or (T)n of CG12, would block the CpG in a conformation which is probably not the proper fingerprint of the target protein.

ACKNOWLEDGEMENTS

We thank T. Couesnon for technical assistance with the molecular modelling study. We are grateful to R. Lavery and K. Zakrzewska for the gift of the JUMNA software. We thank B. Hartmann and C. Tisne for helpful discussions about the 3D-structure refinements and R. Thouvenot for help and discussions about 31P-NMR experiments. We thank J. S. Lomas for critical reading.

This work is associated with project ITO2598 at the CNUSC computing centre. We thank P. Rouzaud, the Parallel Computing staff of the CNUSC and the members of the Scientific Committee num. 7 for the attribution of computing resources.

REFERENCES

1. Bird, A. (1986) The essential of DNA methylation. Cell 70, 5–8.

2. Sutter, D. & Doerfler, W. (1980) Methylation of integrated adenovirus type 12 DNA sequences in transformed cells is inversely correlated with viral gene expression. Proc. Natl Acad. Sci. USA 77, 253–256.

3. Li, E., Beard, C. & Jaenisch, R. (1993) Role for DNA methylation in genomic imprinting. Nature 366, 362–365.

4. Frommer, M., McDonald, L.E., Millar, D.S., Collis, C.M., Watt, F., Grigg, W., Molloy, P.E. & Paul, C.L. (1992) A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc. Natl Acad. Sci. USA 89, 1827–1831.

5. Jones, P.A. & Gonzalgo, M.L. (1997) Altered DNA methylation and genome instability: a new pathway to cancer? Proc. Natl Acad. Sci. USA 94, 2103–2105.

6. Rideout, W.M., Coetzee, G.A., Olumi, A.F. & Jones, P.A. (1990) 5-Methylcytosine as an endogenous mutagen in the human LDL receptor and p53 genes. Science 249, 1288–1290.

7. Coulondre, C., Miller, J.H., Farabaugh, P.J. & Gilbert, W. (1978) Molecular basis of base substitution hotspots in Escherichia coli. Nature 274, 775–780.

8. Montminy, M.R., Sevarino, K.A., Wagner, J.A., Mandel, G. & Goodman, R.H. (1988) Identification of a cyclic-AMP-response element within the rat somatostatin gene. Proc. Natl Acad. Sci. USA 83, 6682–6686.

9. Ziff, E.B. (1990) Transcription factors: new family gathers at the cAMP-response site. Trends Genet. 6, 69–72.

10. Krieg, A.M., Yi, A.-K., Matson, S., Waldschmidt, T.J., Bishop, G.A., Teasdale, R., Koretzky, G.A. & Klinman, D.M. (1995) CpG motifs in bacterial DNA trigger direct B-cell activation. Nature 374, 546–549.

11. Klinman, D.M., Yi, A.-K., Beaucage, S.L., Conover, J. & Krieg, A.M. (1996) CpG motifs present in bacteria DNA rapidly induce lymphocytes to secrete interleukin 6, interleukin 12, and interferon gamma. Proc. Natl Acad. Sci. USA 93, 2879–2883.

12. Sato, Y., Roman, M., Tighe, H., Lee, D., Corr, M., Nguyen, M.-D., Silverman, G.J., Lotz, M., Carson, D.A. & Raz, E. (1996) Immunostimulatory DNA sequences necessary for effective intradermal gene immunization. Science 273, 352–354.

13. Sparwasser, T., Miethke, T., Lipford, G., Borschert, K., Häcker, H., Heeg, K. & Wagner, H. (1997) Bacterial DNA causes septic shock. Nature 386, 336–337.

14. Cordier, C., Convert, O., Blais, J.C., Couesnon, T., Zakrzewska, K., Mauffret, O., Fermandjian, S. & Dodin, G. (1997) Covalent binding of a bridged pyridinium aldehyde with the self-complementary decamer [d(ATGACGTCAT)]2. Gel analysis, MALDI mass spectrometry and NMR studies. J. Chem. Soc. Perkin Trans. 2, 115–121.

15. Mauffret, O., Hartmann, B., Convert, O., Lavery, R. & Fermandjian, S. (1992) The fine structure of two DNA dodecamers containing the cAMP responsive element sequence and its inverse. Nuclear magnetic resonance and molecular simulation studies. J. Mol. Biol. 227, 852–875.

16. Lefebvre, A., Mauffret, O., Hartmann, B., Lescot, E. & Fermandjian, S. (1995) Structural behavior of the CpG step in two related oligonucleotides reflects its malleability in solution. Biochemistry 34, 12019–12029.

17. Privé, C.G., Yanagi, K. & Dickerson, R.E. (1991) Structure of the B-DNA decamer C-C-A-A-C-G-T-T-G-G and comparison with isomorphous decamers C-C-A-A-G-A-T-T-G-G and C-C-A-G-G-C-C-T-G-G. J. Mol. Biol. 217, 177–199.

18. Yanagi, K., Privé, C.G. & Dickerson, R.E. (1991) Analysis of local helix geometry in three B-DNA decamers and eight dodecamers. J. Mol. Biol. 217, 201–214.

19. Lefebvre, A., Mauffret, O., Lescot, E., Hartmann, B. & Fermandjian, S. (1996) Solution structure of the CpG containing d(CTTCGAAG)2 oligonucleotide: NMR data and energy calculations are compatible with a BI/BII equilibrium at CpG. Biochemistry 35, 12560–12569.

20. Nelson, H.C.M., Finch, J.T., Luisi, B.F. & Klug, A. (1987) The structure of an oligo(dA)·oligo(dT) tract and its biological implications. Nature 330, 221–226.

21. Piotto, M., Saudek, V. & Sklenar, V. (1992) Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J. Biomol. NMR 2, 661–665.

22. Marion, D. & Wüthrich, K. (1983) Application of phase sensitive two-dimensional correlated spectroscopy (COSY) for measurements of 1H-1H spin-spin coupling constants in proteins. Biochem. Biophys. Res. Commun. 113, 967–974.

23. Bax, D. & Davis, D.G. (1985) MLEV-17-based two-dimensional homonuclear magnetisation transfer spectroscopy. J. Magn. Reson. 65, 355–360.

24. Braunschweiler, L. & Ernst, R.R. (1983) Coherence transfer by isotropic mixing: application to proton correlation spectroscopy. J. Magn. Reson. 53, 521–528.

25. Piantini, U., Sorensen, O.W. & Ernst, R.R. (1982) Multiple quantum filters for elucidating NMR coupling networks. J. Am. Chem. Soc. 104, 6800–6801.

26. Rance, M., Sorensen, O.W., Bodenhausen, G., Wagner, G., Ernst, R.R. & Wüthrich, K. (1983) Improved spectral resolution in COSY 1H NMR spectra of proteins via double quantum filtering. Biochem. Biophys. Res. Commun. 117, 479–485.

27. Kessler, H., Griesinger, C., Zarbock, J. & Loosli, H.R. (1984) Assignment of carbonyl carbons and sequence analysis in peptides by heteronuclear shift correlation via small coupling constants with broadband decoupling in t1 (COLOC). J. Magn. Reson. 57, 331–336.

28. Nikonowicz, E., Roongla, V., Jones, C.R. & Gorenstein, D.G. (1989)

Two-dimensional 1H and 31P NMR spectra and restrained molecular

q FEBS 1999 Solution structure of two self-complementary oligomers (Eur. J. Biochem. 261) 733

dynamics structure of an extrahelical adenosine tridecamer oligo-

deoxyribonucleotide duplex. Biochemistry 28, 8714±8725.

29. Powers, R., Jones, C.R. & Gorenstein, D.G. (1990) Two-dimensional 1H and 31P NMR spectra and restrained molecular dynamics structure of an oligodeoxyribonucleotide duplex refined via a hybrid relaxation matrix procedure. J. Biomol. Struct. Dyn. 8, 253–294.

30. Nikonowicz, E. & Gorenstein, D.G. (1990) Two-dimensional 1H and 31P NMR spectra and restrained molecular dynamics structure of a mismatched GA decamer oligodeoxyribonucleotide duplex. Biochemistry 29, 8845–8858.

31. Roongta, V., Jones, C.R. & Gorenstein, D.G. (1990) Effect of distortions in the deoxyribose phosphate backbone conformation of duplex oligodeoxyribonucleotide dodecamers containing GT, GG, GA, AC, and GU base-pair mismatches on 31P NMR spectra. Biochemistry 29, 5245–5258.

32. Delsuc, M.A. (1988) A new maximum entropy processing algorithm, with applications to nuclear magnetic resonance experiments. In Maximum Entropy and Bayesian Methods (Skilling, J., ed), pp. 285–290. Kluwer Academic, Dordrecht.

33. Pons, J.L., Malliavin, T.E. & Delsuc, M.A. (1996) GIFA V4: a complete package for NMR dataset processing. J. Biomol. NMR 8, 445–452.

34. Baleja, J.D., Germann, M.W., van de Sande, J.H. & Sykes, B.D. (1990) Solution conformation of purine-pyrimidine DNA octamers using nuclear magnetic resonance, restrained molecular dynamics and NOE-based refinement. J. Mol. Biol. 215, 411–428.

35. Baleja, J.D., Moult, J. & Sykes, B.D. (1990) Distance measurement and structure refinement with NOE data. J. Magn. Reson. 87, 375–384.

36. Reid, B.R., Banks, K., Flynn, P. & Nerdal, W. (1989) NMR distance measurements in DNA duplexes: sugars and bases have the same correlation times. Biochemistry 28, 10001–10007.

37. Lavery, R., Sklenar, H., Zakrzewska, K. & Pullman, B. (1986) The flexibility of the nucleic acids: (II). The calculation of internal energy and applications to mononucleotide repeat DNA. J. Biomol. Struct. Dyn. 3, 989–1014.

38. Lavery, R. (1988) Junctions and bends in nucleic acids: a new theoretical modelling approach. In Structure and Expression 3 (Olson, W.K., Sarma, M.H., Sarma, R.H. & Sundaralingam, M., eds), pp. 191–211. Adenine Press, New York.

39. Zakrzewska, K. & Lavery, R. (1989) Theoretical studies of groove-binding drugs with DNA. In Computer-Aided Molecular Design (Richards, G., ed), pp. 129–145. IBC Technical Services Ltd., London.

40. Lavery, R., Zakrzewska, K. & Pullman, A. (1984) Optimized monopole expansions for the representation of the electrostatic properties of the nucleic acids. J. Comp. Chem. 5, 363–373.

41. Hingerty, B., Richie, R.H., Ferrel, T.L. & Turner, J.E. (1985) Dielectric effects in biopolymers: the theory of ionic saturation revisited. Biopolymers 24, 427–439.

42. Lavery, R. & Sklenar, H. (1988) The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63–91.

43. Lavery, R. & Hartmann, B. (1994) Modelling DNA conformational mechanics. Biophys. Chem. 50, 33–45.

44. Hartmann, B., Piazzola, D. & Lavery, R. (1993) BI–BII transitions in B-DNA. Nucleic Acids Res. 21, 561–568.

45. Bertrand, H.O., Ha-Duong, T., Fermandjian, S. & Hartmann, B. (1998) Flexibility of the B-DNA backbone: effects of local and neighbouring sequences on pyrimidine-purine steps. Nucleic Acids Res. 26, 1261–1267.

46. Post, C.B., Meadows, R.P. & Gorenstein, D.G. (1990) On the evaluation of interproton distances for three-dimensional structure determination by NMR using a relaxation rate matrix analysis. J. Am. Chem. Soc. 112, 6796–6803.

47. Kim, S.G. & Reid, B.R. (1992) Automated NMR structure refinement via NOE peak volumes. Application to a dodecamer DNA duplex. J. Magn. Reson. 100, 382–390.

48. Petitjean, M. (1996) Three-dimensional pattern recognition from molecular distance minimization. J. Chem. Inf. Comput. Sci. 36, 1038–1049.

49. Petitjean, M., Cordier, C. & Dodin, G. (1997) Structural similarity and optimal superposition. An application to nucleotides. Chimia 51, 386 (Abstracts 36th IUPAC Congress, Geneva, August 17–22, 1997).

50. Petitjean, M. (1998) Interactive maximal common 3D substructure searching with the combined SDM/RMS algorithm. Comput. Chem. 22, 463–465.

51. Clore, G.M. & Gronenborn, A.M. (1985) Probing the three-dimensional structures of DNA and RNA oligonucleotides in solution by nuclear Overhauser enhancement measurements. FEBS Lett. 179, 187–198.

52. Wüthrich, K. (1986) Resonance assignments and structure determination in nucleic acids. In NMR of Proteins and Nucleic Acids. Wiley Interscience, New York.

53. Van de Ven, F.J.M. & Hilbers, C.W. (1988) Resonance assignments of non-exchangeable protons in B type DNA oligomers, an overview. Nucleic Acids Res. 16, 5713–5720.

54. Evans, J.N.S. (1995) Biomolecular NMR Spectroscopy. Oxford University Press Inc, New York.

55. Gorenstein, D.G. (1994) Conformation and dynamics of DNA and protein-DNA complexes by 31P NMR. Chem. Rev. 94, 1315–1338.

56. Kim, S.G., Lin, L.J. & Reid, B.R. (1992) Determination of nucleic acid backbone conformation by 1H NMR. Biochemistry 31, 3564–3574.

57. Poncin, M., Hartmann, B. & Lavery, R. (1992) Conformational sub-states in B-DNA. J. Mol. Biol. 226, 775–794.

58. Sklenar, V. & Bax, A. (1987) Measurement of 1H-31P coupling constants in double-stranded DNA fragments. J. Am. Chem. Soc. 109, 7525–7526.

59. Saenger, W. (1984) Forces stabilizing associations between bases: hydrogen bonding and base stacking. In Principles of Nucleic Acid Structure (Cantor, C.R., ed), pp. 134–140. Springer-Verlag, New York.

60. Paolella, D.N., Palmer, C.R. & Schepartz, A. (1994) DNA targets for certain bZIP proteins distinguished by an intrinsic bend. Science 264, 1130–1133.

61. Leonard, D.A., Rajaram, N. & Kerppola, T.K. (1997) Structural basis of DNA bending and oriented heterodimer binding by the basic leucine zipper domains of Fos and Jun. Proc. Natl Acad. Sci. USA 94, 4913–4918.

62. Nagaich, A.K., Appella, E. & Harrington, R.E. (1997) DNA bending is essential for the site-specific recognition of DNA response elements by the DNA binding domain of the tumor suppressor protein p53. J. Biol. Chem. 272, 14842–14849.

SUPPLEMENTARY MATERIAL

The following Table and Figures are available from HTTP://www.blackwell-synergy.com/Journals/issuelist.asp?journal=ejb/suppcont

Fig. S1. 500 MHz NOESY spectrum (D2O) of CG10 showing the connectivities between aromatic and H1′ protons.

Fig. S2. 500 MHz NOESY spectrum (D2O) of CG12 showing the connectivities between aromatic and H1′ protons.

Fig. S3. 500 MHz HOHAHA spectrum (D2O) of CG10 showing the

scalar couplings between sugar protons.

Fig. S4. 500 MHz HOHAHA spectrum (D2O) of CG12 showing the

scalar couplings between sugar protons.

Table S1. Intranucleotide and internucleotide distances obtained from the refined structures for CG10 and CG12.