J. Mol. Biol. (1996) 264, 624–639

Structure and Stability of an Immunoglobulin Superfamily Domain from Twitchin, a Muscle Protein of the Nematode Caenorhabditis elegans

Sun Fong 1, Stefan J. Hamill 1, Mark Proctor 1, Stefan M. V. Freund 1, Guy M. Benian 2, Cyrus Chothia 3, Mark Bycroft 1 and Jane Clarke 1*

1 Centre for Protein Engineering, MRC Unit of Protein Folding and Design, MRC Centre, Hills Road, Cambridge, CB2 2QH, UK
2 Department of Pathology, Emory University School of Medicine, Atlanta, GA 30322, USA
3 MRC Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, UK

*Corresponding author

The NMR solution structure of an immunoglobulin superfamily module of twitchin (Ig 18') has been determined and the kinetic and equilibrium folding behaviour characterised. Thirty molecular coordinates were calculated using a hybrid distance geometry-simulated annealing protocol based on 1207 distance and 48 dihedral restraints. The atomic rms distributions about the mean coordinate for the ensemble of structures are 0.55(±0.09) Å for backbone atoms and 1.10(±0.08) Å for all heavy atoms. The protein has a topology very similar to that of telokin and the titin Ig domains and thus it falls into the I set of the immunoglobulin superfamily. The close agreement between the predicted and observed structures of Ig 18' demonstrates clearly that the I set profile can be applied in the structure prediction of immunoglobulin-like domains of diverse modular proteins. Folding studies reveal that the protein has relatively low thermodynamic stability, ΔG^{H2O}_{U–F} = 4.0 kcal mol⁻¹ at physiological pH. Unfolding studies suggest that the protein has considerable kinetic stability: the half-life of unfolding is greater than 40 minutes in the absence of denaturant.

© 1996 Academic Press Limited

Keywords: immunoglobulin; NMR structure; structure prediction; protein folding; I set

Introduction

Functionally diverse proteins may evolve by modification of non-structural determining regions of an existing scaffold. Proteins belonging to the immunoglobulin superfamily (IgSF) (Williams & Barclay, 1988; Bork et al., 1994) are one of the best known examples of this phenomenon. The Ig superfamily is a class of proteins or protein modules with the general fold, first observed in antibodies, that can be described as a β-sandwich. They are also found in cell adhesion molecules, cell surface receptors and muscle proteins.

Despite the remarkable variations in function and sequence that exist among the numerous members of the IgSF, only four folding “sets” (V, C1, C2 and I) have been identified (Williams & Barclay, 1988; Harpaz & Chothia, 1994). The folding topologies of these sets share a common structural core of four β-strands (B, C, E and F) but vary in the number and position of the edge strands relative to the common core. The existence of a wide range of sequences that all fold to the same basic structure raises a number of questions concerning the relationship between sequence and structure and the role of conserved residues in the stability and folding of the proteins.

In this study, we are concerned with one of the IgSF domains from twitchin (Benian et al., 1989, 1993), a 753 kDa multi-modular protein located in muscle A-bands of the nematode Caenorhabditis elegans. Twitchin consists of a single serine-threonine protein kinase domain plus 30 IgSF domains and 31 fibronectin type III (FnIII) domains. Twitchin is likely to function both in regulating muscle contraction and in the final stages of sarcomere assembly (Benian et al., 1989). Electrophysiological experiments with Aplysia twitchin suggest that the normal function of twitchin is to inhibit the rate of relaxation of muscle, and cAMP-dependent phosphorylation of twitchin relieves this inhibition (Probst et al., 1994). The physiological substrate for Aplysia twitchin is likely to be the regulatory myosin light chains (Heierhorst et al., 1995), although this is not yet clear for C. elegans twitchin. To date, only the structure and function of the recombinant kinase domain containing the autoregulatory sequence have been examined in detail (Hu et al., 1994; Lei et al., 1994). The domain is capable of phosphorylating model substrates in a self-regulated fashion (Lei et al., 1994) and in a calcium and S100A12 protein-dependent manner (Heierhorst et al., 1996).

Abbreviations used: IgSF, immunoglobulin superfamily; Ig 18', the 18th Ig-like motif of twitchin; FnIII, fibronectin type III domain; ΔG^{H2O}_{U–F}, the free energy of unfolding in the absence of denaturant; m, the dependence of the free energy of unfolding on [urea]; NOE, nuclear Overhauser enhancement; 3D, three-dimensional; NOESY-HMQC, NOE spectroscopy.

0022–2836/96/480624–16 $25.00/0 © 1996 Academic Press Limited

Structure and Stability of a Twitchin Ig Domain 625

Initially, it was believed that the sequence pattern of the IgSF domains in twitchin implied a structure like that of the C2 set (Williams & Barclay, 1988) in the Ig superfamily. Harpaz & Chothia (1994), however, argued that (1) immunoglobulin domains that occur in muscle proteins, cell adhesion molecules and surface receptors usually belong to what was then a new “I” (intermediate) set and (2) have structures very similar to that first observed in telokin (Holden et al., 1992) and which could largely be predicted on the basis of the key residues found in telokin. The prediction was shown to be accurate in recent published structures of titin Ig-like modules (Pfuhl & Pastore, 1995; Improta et al., 1996) and the cell adhesion molecule VCAM (Jones et al., 1995). The structures of the corresponding domains of twitchin were also predicted to be I set prior to the experimental determination.

Preliminary data indicated that Ig 18', the Ig-like motif of the ninth FnIII-FnIII-Ig repeat of twitchin, was a good candidate for NMR and folding studies. Thus, a high resolution structure of Ig 18' was determined and the folding and stability were characterised.

Results

Sequence-specific assignment

Sequential assignment of Ig 18' was achieved by the method of Wüthrich (1986). Spin systems were identified or classified into specific groups of amino acids and they were connected into the sequence according to short-range and long-range backbone NOE patterns via the use of the 3D NOESY-HMQC. A 3D CBCA(CO)NH experiment was employed to confirm the spin-systems as well as secondary structure assignment obtained from the homonuclear and ¹⁵N experiments. The peptide bond configuration of Pro30 was assigned cis because of the presence of a Hα(i) − Hα(i + 1) NOE for residues 29 and 30 and was confirmed by comparing the Cβ chemical shift with the random coil cis-trans shift values (Lubienski et al., 1994).

Secondary structure elements

Figure 1 summarises the short-range sequential NOEs observed in the spectra of Ig 18'. Characteristic extended strand NOE patterns (strong consecutive Hα(i) − HN(i + 1) and weak to medium Hβ(i) − HN(i + 1)) can be seen throughout the sequence. Residues 67 to 70, which are located at the E–F turn of the protein, give a distinctive 3₁₀ helix pattern, and this observation is supported by the relatively small ³JHNHα. A similar structure is seen in the corresponding region in telokin and VCAM domain 1, whereas this region is found to be relatively unstructured in titin M5.

Figure 1. A summary of short-range NOEs (|i − j| < 4). The strength of the NOEs (weak, medium and strong) is reflected by the thickness of the bars (thin, medium and thick, respectively). NOE data were extracted from the NOESY spectra with mixing time of 100 ms. Open circles indicate amide protons that were protected against ²H₂O exchange after 24 hours while those protected for at least three hours are labelled with filled circles. Residues that have ³JHNHα larger than 8.5 Hz are labelled with open squares. Residues that have ³JHNHα smaller than 4.0 Hz are labelled with filled squares.

Figure 2(a) suggests that the domain contains six major extended strands characterised by strong negative Δppm values, and possibly two minor strands, plus one helical turn at the E–F loop (with positive values of Δppm over consecutive residues), confirmed by the NOE pattern (see above). With the aid of long-range backbone NOEs (Figure 2(b)), it was apparent at this stage that the protein contains two main β-sheets, each consisting of four β-strands.
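The Δppm quantity plotted in Figure 2(a) can be illustrated with a minimal Python sketch. It assumes the definition given in the Figure 2 legend (the Cα secondary shift minus the Cβ secondary shift, followed by a three-point smoothing). The function names, the treatment of the two terminal residues and the numerical shift values are illustrative assumptions and are not taken from the experimental data.

```python
def secondary_shift_difference(obs_ca, obs_cb, rc_ca, rc_cb):
    """Return per-residue delta-ppm = (obs Ca - random-coil Ca) - (obs Cb - random-coil Cb).

    All four arguments are equal-length lists of 13C chemical shifts in ppm.
    """
    return [
        (ca - rca) - (cb - rcb)
        for ca, cb, rca, rcb in zip(obs_ca, obs_cb, rc_ca, rc_cb)
    ]


def smooth3(values):
    """Three-point moving average.

    Assumption: terminal residues are averaged over the neighbours that exist
    (two points instead of three); the paper does not state its end treatment.
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        window = values[max(0, i - 1):min(n, i + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical shifts (ppm) for three residues, for illustration only.
observed_ca = [55.0, 54.0, 60.0]
observed_cb = [40.0, 33.0, 30.0]
random_coil_ca = [56.0, 56.0, 58.0]
random_coil_cb = [38.0, 31.0, 32.0]

raw = secondary_shift_difference(observed_ca, observed_cb, random_coil_ca, random_coil_cb)
profile = smooth3(raw)
```

In such a profile, runs of negative values would correspond to the extended strands and runs of positive values to helical turns, as described for Figure 2(a).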

Tertiary structure

The three-dimensional structure of Ig 18' was calculated from a total of 1207 distance and 48 dihedral angle restraints (see Materials and Methods). A total of 150 molecular coordinates were calculated and 30 coordinates with the lowest NOE energies were selected. None of the selected coordinates have NOE violations larger than 0.3 Å nor dihedral angle violations over 5°. The average rms difference among the 30 coordinates is 0.55(±0.09) Å for the backbone heavy atoms and 1.10(±0.08) Å for all non-hydrogen atoms. The rms deviation distribution along the protein polypeptide backbone (Figure 3(b)) shows that the structure is well defined in the β-sheet regions, and reasonably defined at the terminal regions and most of the turns except the short stretch at the beginning of the C–D loop. The largest rms difference was observed for Asp40 (ca 2 Å for the backbone atoms and 4 Å for all heavy atoms); the region 40 to 42 has relatively large solvent accessibility (Figure 3(c)). The Ramachandran plot for the family of the 30 coordinates (not shown) reveals that most of the residues have energetically favourable backbone conformations and those non-glycine residues that scatter outside the allowed region are in general located either at tight turns or at regions with relatively high rms deviation. Residues that consistently possess positive φ angles among the 30 structures are Ser9, Asp31, Asp40, Ser41 and Glu80. The global fold is well defined, because of the presence of a large number (458) of long-range NOEs (|i − j| > 4).

Figure 2. (a) Histogram showing the difference between the observed secondary shift (obsSS) (the measured ¹³C chemical shift minus the random coil ¹³C chemical shift) of the Cα and Cβ of each residue (Δppm = obsSS Cα − obsSS Cβ). A three-point smoothing function was applied to the raw data. (b) Diagonal plot of the backbone NOE distance restraints used for the generation and refinement of Ig 18'. The squares represent residue pairs that show Hα(i) − HN(j) NOE. The filled circles represent HN(i) − HN(j) NOEs. The inverted triangles indicate Hα(i) − Hα(j) NOEs.

Figure 3. (a) Histogram showing the distribution of NOE restraints. The height equals the sum of all NOE derived restraints for each residue. Black, grey and white bars correspond to long-range, short-range and intra-residue restraints, respectively. (b) The average rms deviations of the coordinates of the 30 final structures from the averaged coordinates. Filled circles and continuous line represent the backbone heavy atoms (Cα, C', N and O) whereas crosses and broken line reflect all non-hydrogen atoms. (c) Surface accessibility for each residue calculated for the average minimized coordinates with a probe radius of 1.6 Å (X-PLOR). Values were obtained from the summation of the accessible areas of all atoms for each residue.

Equilibrium folding studies

To analyse the thermodynamic stability of Ig 18', unfolding was monitored by fluorescence and CD spectroscopy under different conditions (urea, pH and temperature). The fluorescence emission spectrum (Figure 4) at pH 7.0 shows an intensity maximum at 320 nm due presumably to the single tryptophan, Trp36, located at the centre of the hydrophobic core (see Discussion). In the presence of 6 M urea at the same pH, the emission maximum shifts to 350 nm and the intensity drops significantly to approximately 50% of the original value. The far-UV CD spectrum is characteristic of β-sheet. Upon denaturation by urea, a decrease in ellipticity at 210 to 230 nm is observed. The fluorescence and CD signals can be completely recovered on renaturation from urea and acid. The protein is prone to aggregate when unfolded at low urea concentrations and this hampers experiments requiring samples with higher protein concentration. Thus we failed to acquire useful data from calorimetry and near-UV CD spectroscopy, which require a higher concentration of protein. Temperature denaturation can be monitored by far-UV CD and the Tm is estimated to be 57°C at pH 7.0. Data from urea denaturation experiments employing CD and fluorescence probes fit to a two-state model (equation (2), Figure 5); the m value (the dependence of the free energy of unfolding on [urea]) and ΔG^{H2O}_{U–F} obtained from both methods are the same within error (Table 1), consistent with a two-state equilibrium unfolding.

Ig 18' is thermodynamically more stable at higher pH (Table 1); however, it is also more prone to aggregation, especially at high protein concentration (01 mM) upon incubation at room temperature for more than ten hours. At pH 4.5, the protein did not show visible aggregation upon incubation for several days, but it gradually unfolded as monitored by CD and NMR spectroscopy (data not shown). To seek a balance between stability with respect to unfolding at low pH and aggregation at high pH, the solution structure was determined at pH 4.9 at a lower temperature (283 K). It is possible that the increased stability at pH 7.0 results from the de-protonation of the buried His20.

Figure 4. (a) Fluorescence and (b) CD spectra (arbitrary units) of the native Ig 18' (continuous line) and Ig 18' unfolded in 6 M urea (broken line) at pH 7.0 (50 mM potassium phosphate), 20°C. In (a) excitation wavelength is 280 nm. Ellipticity was monitored using a 2 cm pathlength cell.

Figure 5. Denaturation of Ig 18' monitored by fluorescence (w) and far UV CD (r) at pH 7.0, 20°C (arbitrary units). Fluorescence excitation wavelength was 280 nm.

Table 1. Equilibrium denaturation of Ig 18'

pH    Method         [Urea]50% a (M)   m (kcal mol⁻¹ M⁻¹)   ΔG^{H2O}_{U–F} b (kcal mol⁻¹)
4.5   Fluorescence   2.27 (±0.02)      1.3 (±0.1)           2.95 (±0.23)
5.0   Fluorescence   2.42 (±0.12)      1.1 (±0.2)           3.14 (±0.24)
6.0   Fluorescence   2.74 (±0.05)      1.4 (±0.1)           3.56 (±0.27)
7.0   Fluorescence   3.06 (±0.04)      1.3 (±0.1)           3.97 (±0.31)
7.0   CD             2.83 (±0.13)      1.3 (±0.3)           3.68 (±0.87)

The standard errors of the data are given in parentheses.
a The concentration of urea at which 50% of the protein is unfolded.
b A mean m value of 1.3 (±0.1) kcal mol⁻¹ M⁻¹ is used for determining ΔG^{H2O}_{U–F} from fluorescence data. [Urea]50% values can be determined with a high degree of reproducibility but m values are hard to determine accurately, particularly where the [Urea]50% is low (Serrano et al., 1992).
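The free energies in Table 1 can be reproduced with a short Python sketch. It assumes the usual linear two-state relation, in which the free energy of unfolding in water is the product of the m value and the urea midpoint; equation (2) itself is given in Materials and Methods and is not reproduced here, so the functional form below is an assumption rather than the authors' exact expression. The function names are illustrative.

```python
import math

MEAN_M = 1.3  # kcal mol^-1 M^-1, mean m value quoted in footnote b of Table 1

# (pH, method, [urea]50% in M, reported free energy in kcal mol^-1), from Table 1
TABLE_1 = [
    (4.5, "Fluorescence", 2.27, 2.95),
    (5.0, "Fluorescence", 2.42, 3.14),
    (6.0, "Fluorescence", 2.74, 3.56),
    (7.0, "Fluorescence", 3.06, 3.97),
    (7.0, "CD", 2.83, 3.68),
]


def delta_g_water(m_value, urea_midpoint):
    """Free energy of unfolding at 0 M urea, assuming dG = m * [urea]50%."""
    return m_value * urea_midpoint


def fraction_unfolded(urea, m_value, urea_midpoint, temperature_k=293.15):
    """Two-state fraction unfolded at a given urea concentration (M).

    Assumes dG([urea]) = m * ([urea]50% - [urea]); 293.15 K corresponds to the
    20 degrees C of the denaturation experiments.
    """
    gas_constant = 1.987e-3  # kcal mol^-1 K^-1
    delta_g = m_value * (urea_midpoint - urea)
    k_unfold = math.exp(-delta_g / (gas_constant * temperature_k))
    return k_unfold / (1.0 + k_unfold)


for ph, method, midpoint, reported in TABLE_1:
    calculated = delta_g_water(MEAN_M, midpoint)
    print(f"pH {ph} ({method}): calculated {calculated:.2f}, reported {reported:.2f}")
```

With the mean m value, each calculated free energy agrees with the tabulated one to within 0.01 kcal mol⁻¹, which suggests that the pH dependence of the stability in Table 1 reflects the shift in the urea midpoint.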

Kinetic unfolding studies

The folding and unfolding of Ig 18' were monitored by fluorescence, employing urea and pH jump techniques. Refolding is complicated, involving at least three phases, and is relatively slow and dominated by the presence of cis Pro30 (results not shown; a detailed analysis of refolding will be reported elsewhere). Unfolding, however, is monophasic, and the rate constant of unfolding, k_u, increases exponentially with increasing urea concentration (Figure 6) according to equation (1):

ln k_u = ln k_u^0 + m_u [urea]    (1)

The rate constant obtained by extrapolation to 0 M urea is k_u^0 = 2.8 × 10⁻⁴ s⁻¹. Refolding was also followed by far-UV CD spectroscopy after manual mixing. The rate constants obtained by fluorescence and CD measurements are the same within experimental error.
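As an illustration of equation (1), the following Python sketch evaluates the unfolding rate constant at a chosen urea concentration and converts the extrapolated rate constant in water into a half-life, assuming first-order (monophasic) unfolding. The value of m_u is not quoted in this section, so the slope used in the example is a hypothetical placeholder; the function names are likewise illustrative.

```python
import math

K_U_WATER = 2.8e-4  # s^-1, unfolding rate constant extrapolated to 0 M urea


def unfolding_rate(urea, m_u, k_u_water=K_U_WATER):
    """Equation (1): ln k_u = ln k_u^0 + m_u * [urea]; returns k_u in s^-1."""
    return math.exp(math.log(k_u_water) + m_u * urea)


def half_life_minutes(rate_constant):
    """Half-life of a first-order process, t_1/2 = ln 2 / k, in minutes."""
    return math.log(2.0) / rate_constant / 60.0


# Half-life of unfolding in the absence of denaturant.
t_half_water = half_life_minutes(K_U_WATER)
print(f"t1/2 in water: {t_half_water:.1f} min")

# Hypothetical slope (M^-1), chosen only to show how the rate rises with urea.
example_m_u = 1.0
print(f"k_u at 4 M urea: {unfolding_rate(4.0, example_m_u):.3e} s^-1")
```

The half-life obtained for 0 M urea is a little over 41 minutes, in line with the statement that the half-life of unfolding exceeds 40 minutes in the absence of denaturant.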

Discussion

Structural overview

Ig 18' is composed of two β-sheets, packed together into a β-sandwich, and a 3₁₀ helix that packs against the hydrophobic core (Figures 7 and 8). The first sheet consists of strands A, B, E and D hydrogen-bonded in an anti-parallel arrangement whereas the second sheet, consisting of strands A', G, F and C, is arranged with A' and G in parallel and G, F, C anti-parallel. The topology of the strands is shown in Figure 8. Both β-sheets are right-hand twisted and are stacked on top of one another enclosing the hydrophobic core. Of the seven loops that connect together the regular β-strands, D–E and F–G can be identified as type I and A'–B as type II β-hairpin. The main hydrophobic core of the domain is formed around the aromatic residues including Trp36, Phe26, Phe88, Tyr73, Phe62 and His20 (Figure 7). It is apparent from the structures that the protein core is compact, which is reflected by the negative Lennard-Jones van der Waals' energy (Table 2). Nonetheless, not all of the core residues are well-shielded from solvent. Surface exposure analysis indicates that a few core residues distributed along the edges of the sheets have solvent accessible surface of 12 to 35 Å². There are several exposed hydrophobic groups on the protein surface. The largest surface hydrophobic patch, consisting of Leu50, Phe61, Pro63 and Thr19, is located on the B–E–D sheet.

The remaining loops are either less regular (C–D) or non-standard in nature (B–C, E–F, A–A'). The A–A' region exhibits two distinct conformations which satisfy the observed experimental restraints. There is intensity variation over the cross-peaks in the ¹H-¹⁵N HSQC spectrum corresponding to the residues forming the A–A' region. Leu6 and Arg10, located at the two ends of the A–A' loop, give rise to cross-peaks of fairly weak intensity (Arg10 has the weakest signal in the HSQC spectrum) while residues located in the middle of the loop, Thr7, Ala8 and Ser9, have relatively strong signals. Such intensity variation might indicate motion of this loop on an intermediate (ms) timescale. NMR experiments probing the dynamic behaviour of the protein are in progress. In the long C–D loop, the N-terminal stretch (residues 39 to 42) is disordered and no preference for any particular conformation is observed. However, the following three residues, Ala43, Ala44 and Leu45, are more ordered and this region resembles a small distorted β-strand which matches the C' strand of the I set.

Figure 6. Urea dependence of the natural logarithm of the unfolding rate constant (k_u, in s⁻¹) of Ig 18' at 20°C, pH 7.0. There is a linear dependence of ln k_u on [urea]. Extrapolation to 0 M urea gives a k_u^{H2O} = 2.8 × 10⁻⁴ s⁻¹.

Figure 7. (a) Stereoview showing the superposition of the Cα, C', and N backbone atoms for the 30 molecular coordinates of Ig 18'. (b) Stereoview showing the superposition of residues in the main hydrophobic core for the 30 molecular coordinates inserted on the backbone trace of the minimized average structure. Residues selected: Ile5, Ile12, Ile14, His20, Leu22, Val24, Phe26, Ala34, Trp36, Val38, Leu45, Leu49, Val51, Thr58, Ile60, Phe62, Ala65, Tyr73, Leu75, Val77, Phe88, Val90, Val92. Trp36 and Tyr73 are shown black. Core residues that have solvent accessible surface > 10 Å² are shown in lightest grey scale.

Predicted and observed structures for the twitchin Ig 18' domain

The “key” residues in a protein are those primarily responsible for its three-dimensional structure, through their packing, hydrogen bonding, or their ability to take up unusual conformations. In Table 3 we give the sequence of the first known member of the I set, telokin, and its key residues are shown in bold type. The role of these residues in the structure has been described previously (Harpaz & Chothia, 1994; Bateman et al., 1996). Before the structure determination described here, the sequence of Ig 18' was aligned by hand with that of telokin and examined to see if it contained, at sites homologous to the key residues in telokin, the same residues or suitable alternatives as described by Harpaz & Chothia (1994) and Chothia et al. (1989). We found that in Ig 18' some 29 of the 35 sites homologous to the key residues in telokin do clearly contain the same residue or a suitable alternative (Table 3). Together, they indicated that Ig 18' would have the same structure as telokin in the regions of the A, A', B, C, E, F and G strands and the A'–B, B–C, E–F and F–G loops. The absence of a hydrophobic residue at position 9 in Ig 18' indicated a different unknown structure for the A–A' loop. Though in Ig 18' regions equivalent to the C' and D strands were clearly present, exactly which of the residues corresponded to the telokin key residues of these strands was uncertain and so the relative position, and the extent of the structural similarity, of these regions was also uncertain (Table 3).

Subsequently, the observed structure of Ig 18' was compared with that of telokin and the predicted structure. Of the 74 residues predicted to have the same conformation as telokin, 71 do so; the three residues that do not are all immediately adjacent to loops that differ in conformation in telokin and Ig 18' (Table 3). In all, 78 residues in the two structures have the same conformation (the rms difference in the positions of their main-chain atoms is 1.4 Å). The seven residues not predicted as being like those in telokin are in the C' and D strands, where the absence of clear homologues for the key residues led to their being labelled uncertain (see above).

Previously, the predictions that the M5 domain of titin and domain 1 of VCAM belong to the I set and have structures close to that of telokin were shown to be true by the subsequent experimental determination of their structures (Jones et al., 1995; Pfuhl & Pastore, 1995). Those results, together with the close similarity of the observed and predicted structures of Ig 18', demonstrate that I set key residue patterns derived from the known structures can be used to identify, from sequences, other members of the I set and give predictions of their three-dimensional structure that are very largely accurate.

Figure 8. Schematic picture of the three-dimensional folding topology of Ig 18' generated from the minimized average structure with the program MolScript (Kraulis, 1991).

Comparison of the hydrophobic cores of I set proteins with known structures

It is well established that the hydrophobic core plays important roles in governing (1) the unique globular fold, (2) the stability (Kellis et al., 1988; Karpusas et al., 1989; Eriksson et al., 1992), and (3) the activity of proteins (Lim & Sauer, 1991; Milla & Sauer, 1995; Axe et al., 1996). Structural analysis of the hydrophobic core packing of 11 immunoglobulin domains (Lesk & Chothia, 1982) suggested that several defined mechanisms are involved for adaptation to core mutations while maintaining the same fold.

With the availability of several structures of I set proteins, it is now possible to compare and contrast the hydrophobic core of this subclass of protein in

Table 2. Structural statistics of the 30 Ig 18' structures

Parameter                                      ⟨SA⟩ a             ⟨SA⟩m b

Deviation from ideal geometry
  Bonds (Å)                                    0.0031 ± 0.0001    0.00307
  Angles (deg)                                 0.69 ± 0.01        0.68
  Impropers (deg)                              0.51 ± 0.02        0.51

XPLOR energies (kcal mol⁻¹)
  E_NOE c                                      54.7 ± 5.4         51.2
  E_tor d                                      0.23 ± 0.29        0.13
  E_vdW e                                      29.1 ± 4.0         28.3
  E_L−J f                                      −209.1 ± 30.3      −233.6

rms deviations from experimental restraints
  Distance (Å)                                 0.030 ± 0.001      0.029
  Dihedral angle (deg)                         0.24 ± 0.15        0.21

All deviations are quoted ± one standard deviation.
a ⟨SA⟩ is the 30 final lowest NOE energy structures.
b ⟨SA⟩m is the restrained minimized average structure derived from the 30 structures.
c The square-well NOE (E_NOE) was calculated with a force constant of 50 kcal mol⁻¹ Å⁻².
d The square-well torsional angle (E_tor) was calculated with a force constant of 200 kcal mol⁻¹ rad⁻².
e The quadratic van der Waals term was calculated with a force constant of 4 kcal mol⁻¹ and the van der Waals radii were set to 0.8 times the standard CHARMm values.
f The Lennard-Jones potential was not used during any stage of the structure calculations.
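The square-well NOE term of footnote c can be sketched in Python as follows. The sketch assumes the commonly used form in which a restraint contributes nothing while the distance lies between its lower and upper bounds and contributes the force constant times the squared violation otherwise; any additional scaling or averaging options applied in the actual X-PLOR calculations are not reproduced. The function names and the example distances and bounds are hypothetical.

```python
def noe_violations(distances, lower_bounds, upper_bounds):
    """Return the violation (in angstroms) of each distance restraint.

    A violation is zero when lower <= distance <= upper, otherwise it is the
    distance by which the bound is exceeded.
    """
    violations = []
    for distance, lower, upper in zip(distances, lower_bounds, upper_bounds):
        if distance > upper:
            violations.append(distance - upper)
        elif distance < lower:
            violations.append(lower - distance)
        else:
            violations.append(0.0)
    return violations


def square_well_noe_energy(distances, lower_bounds, upper_bounds, force_constant=50.0):
    """Square-well energy in kcal mol^-1: sum of force_constant * violation**2.

    The default force constant of 50 kcal mol^-1 A^-2 is the value quoted in
    footnote c of Table 2.
    """
    return sum(
        force_constant * violation ** 2
        for violation in noe_violations(distances, lower_bounds, upper_bounds)
    )


# Hypothetical restraints, for illustration only (distances in angstroms).
example_distances = [3.0, 5.5, 1.5]
example_lower = [1.8, 1.8, 1.8]
example_upper = [5.0, 5.0, 5.0]

energy = square_well_noe_energy(example_distances, example_lower, example_upper)
largest_violation = max(noe_violations(example_distances, example_lower, example_upper))
```

A selection criterion such as the one applied to the final ensemble (no NOE violation larger than 0.3 Å) could then be expressed as a test on the largest violation; the hypothetical example above would fail it.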


Table 3. The alignment of telokin with the predicted and observed structures of twitchin Ig 18'

For telokin * indicates key residues; those in bold type are conserved in Ig 18'.

For the predicted structure, PIg 18':
Upper case letters indicate regions predicted to have the same structure as telokin; lower case roman letters indicate structure predicted to be different to that in telokin; lower case italic letters indicate regions of uncertain alignment.

For the observed structure, OIg 18':
Upper case letters indicate the regions observed to have the same structure as telokin: the telokin residues 40 to 44, 51 to 78, 80 to 82, 89 to 92 and 96 to 133 superpose on 1 to 5, 12 to 39, 43 to 45, 50 to 53 and 56 to 93 of twitchin with an rms difference in position of 1.4 Å; lower case letters indicate regions observed to have conformations different to that in telokin.

Residue numbers are given for Ig 18'. Residue with given number is that above the last digit.

detail (Figure 9). The residues which compose the hydrophobic core of the available I set structures were overlaid according to the alignment shown in the Figure legend. The residues buried in the common core are all I profile key residues, and they have similar arrangements and packing volumes. The Cα atoms of these residues can be fitted with an rms deviation of approximately 1.5 Å, which is good agreement considering the differences in the quality and method of structure determinations. The residues at the centre of the core, which correspond to the “pin” region (Lesk & Chothia, 1982) of immunoglobulin domains, have the lowest rms deviation, and the absolutely conserved core Trp (Trp36 in Ig 18') and Tyr (Tyr73 in Ig 18') have the most conserved side-chain conformations (Figure 9).

In Ig 18', the lack of a hydrophobic residue at position 9 (occupied by Ser9 in Ig 18') of the I profile is compensated by having Ile12 protruding further into the core and by subtle local conformational changes of the side-chains of neighbouring residues. The main consequence of lacking the hydrophobic residue at this position is a different conformation of the A–A' region (see prediction).

In titin-M5, all of the core residues are present except at position 3 of the profile, which is usually a proline. In the sequence of M5, this position is alanine. The absence of this contact may be a result of choosing the modular boundary too far into the sequence of this folding unit. Extending the module two more native residues upstream from the N terminus might allow this part of the core to be filled by the side-chain of the missing alanine, and may affect the stability of the module. This has previously been demonstrated in titin-M11 and titin-Ab1 (Politou et al., 1994a).

Domain 1 of VCAM (VCAM-d1), which is the only extracellular I set structure available to date†, contains two disulphide bonds. The lack of a cis-proline at position 30 of the I profile causes the B–C turn to be more extended than in the other modules. Nonetheless, the side-chains of the two core residues (Phe1 and Pro31) adjacent to this loop do not deviate significantly from the positions occupied by those of the other modules. Of the two disulphide linkages, one is not conserved among the extracellular I set members. This disulphide bond holds together the B–C and F–G loops in VCAM-d1. In telokin and Ig 18', these two loops interact via a conserved asparagine in the F–G loop (see notes to Figure 9), which can form a side-chain–main-chain hydrogen bond with a carbonyl oxygen on the B–C loop. The unique feature of this module, besides the disulphide bonds, is the substitution of a hydrophobic residue by a glycine (Gly45) at position 50 of the I-profile.

In conclusion, the examination of the hydrophobic cores of several I set proteins reveals that the I set core is flexible and can tolerate considerable variation of hydrophobic residues, particularly on the edges of the core. The only absolute invariants of the hydrophobic cores are the tryptophan and tyrosine in the pin region, and their side-chain conformations are conserved. The result of the comparison of the available structures agrees well with that previously observed in immunoglobulin domains. The composition variation of the cores is generally accompanied by displacements of β-sheets and local conformational changes of the side-chain, such that closely packed cores are maintained. This structural description provides the basis for understanding the role of the key hydrophobic residues in the I-set proteins.

† After submission of the manuscript, the structure of the first domain of NCAM, a second extracellular I-set structure, has been published (Thomsen et al., 1996). This structure was not included in our analysis.

Figure 9. Stereoview of an overlay of the hydrophobic core residues of telokin, Ig 18', titin M5 and VCAM-d1. The alignment was a structural alignment based on the I set profile shown below. Colours: telokin, purple (backbone shown); Ig 18', cyan; titin M5, red; VCAM-d1, green.

The residues that have the same structure as telokin are shown in upper case letters; those that differ in conformation are shown in lower case italic letters. The strand regions indicated on alignment tables are those found in the telokin structure. I set key residues are shown in bold letters. Data for the structures of telokin, the titin M5 domain and domain 1 of VCAM are taken from Holden et al. (1992), Pfuhl & Pastore (1995) and Jones et al. (1995), respectively.

The structure of the other twitchin Ig domains

The results reported in the previous two sections imply that, using the known structures of Ig 18' and telokin, we can predict outline structures for the other 29 twitchin Ig domains. In Table 4 we give an alignment of the sequences of telokin and Ig 1'–30'. At the top of the Table the sequence of Ig 18' is shown in upper case Roman letters. It is followed by the sequence of telokin in upper case Roman letters for the regions that have the same conformation as Ig 18' and in upper case italic letters in the regions that differ: the A–A', C–C', C'–D and D–E loops. Key residues are shown in bold type. The sequences of Ig 1'–30' are shown in upper case letters for the regions where the key residues indicate they have the same conformation as Ig 18' and/or telokin (see footnotes to Table 4). Regions which are uncertain or expected to have different conformations are shown in lower case letters.

Inspection of Table 4 shows that the similarity in the key residues implies that Ig 1', 4', 10'–17' and 19'–28' have very similar conformations to Ig 18' and telokin over more than 80% of their structure. For most of these domains only the A–A', C–C' and D–E loops show clear differences. Ig 2', 3', 6'–9' and 29'–30' are somewhat more divergent and have additional differences in at least the B–C and F–G loops. It is intriguing that these more divergent Ig domains are located at the extreme ends of the twitchin molecule. Note that in most of the Ig domains the C–C' loop and the beginning of the D strand have conformations like telokin rather than Ig 18'.

Folding studies

Ig 18' has a low free energy of unfolding ($\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}}$ = 4.0 kcal mol−1), but significant thermal stability (Tm = 57°C). In a recent study, the stabilities of a number of I set modules of titin were compared (Politou et al., 1994b), and the stability of Ig 18' falls within the same range in terms of both the m-value and the $\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}}$ of those modules. Though the m-value is relatively low for a globular protein (1.3 kcal mol−1 M−1), it is consistent over the range of titin modules (Politou et al., 1994b, 1996), and for other β-sandwich proteins (S.J.H. & J.C., unpublished data). As the m-value reflects the change in the number of binding sites for denaturant as the protein unfolds, the consistency is expected in β-sandwich modules with similar basic fold.

The I set Ig modules in the giant muscle proteins such as twitchin and titin are likely to serve several different roles. In most of twitchin, and in the A-band portion of titin, these modules probably bind to myosin and other lattice proteins, as has been shown for combinations of FnIII and Ig domains from titin (Labeit et al., 1992), and also telokin (Shirinsky et al., 1993) and the C-terminal Ig domain of C-protein (Okagaki et al., 1993). The postulated function of the very large tandem arrays of Ig modules (up to 90; Labeit & Kolmerer, 1995) in the I-band portion of titin is different, probably providing length to the titin polypeptide and possibly elasticity, rather than interacting with other proteins. It has been suggested that these Ig domains might unfold and refold as the muscle fibres shorten and lengthen during muscle contraction-relaxation (Erickson, 1994). This would require that the proteins fold and unfold rapidly, within milliseconds. The relatively low $\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}}$ might suggest that this would not involve too high an energy cost. However, it is clear from the data presented here that the Ig module Ig 18' does not have unfolding kinetics consistent with such a model: the half life of unfolding in 0 M denaturant extrapolated from the unfolding data is >40 minutes. Preliminary studies indicate that the half-time for refolding of the module with all proline residues in the correct starting conformation is ≈0.5 second (unpublished data). It has been demonstrated by Politou et al. (1996) that the domains of titin unfold and refold independently and that modular stability is unaffected by adjacent modules. Slow unfolding kinetics will ensure that, despite the relatively low stability, these modules are kinetically stable.
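As a rough illustration of the time scales quoted in this section, a first-order half-life can be converted to a rate constant with k = ln 2 / t½. The sketch below applies this to the reported figures (unfolding half-life greater than 40 minutes, refolding half-time of about 0.5 second). The function name is hypothetical, and the unfolding value is only an upper bound on the rate constant because the half-life is itself a lower bound.

```python
import math

def rate_constant_from_half_life(t_half_s: float) -> float:
    """First-order rate constant (s^-1) from a half-life given in seconds."""
    return math.log(2) / t_half_s

# Values taken from the text: unfolding half-life > 40 min, refolding ~0.5 s.
k_unfold_max = rate_constant_from_half_life(40 * 60)  # upper bound on k_u
k_refold = rate_constant_from_half_life(0.5)

print(f"k_u < {k_unfold_max:.1e} s^-1")  # about 2.9e-04 s^-1
print(f"k_f ~ {k_refold:.2f} s^-1")      # about 1.39 s^-1
```

On this estimate, unfolding in the absence of denaturant is at least several thousand times slower than refolding, which is the sense in which the module can be called kinetically stable.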

Conclusions

Having obtained a good quality structure for Ig 18', we have demonstrated that the I profile can be applied to identify and predict largely accurate structures for IgSF domains belonging to the I set, found in many biologically important proteins. This will also facilitate the modelling of the structure of twitchin repeats, and contribute to the molecular reconstruction of the whole protein. Ig 18' belongs to the “elongated” version of the muscle IgSF domains (with longer B–C and F–G loops compared with titin I-band modules (Improta et al., 1996)), and Table 4 suggests that the majority of twitchin IgSF modules are similar to Ig 18' in this respect. It is possible that the structural conservation in the B–C and F–G loops, which correspond to the hypervariable loops (Chothia & Lesk, 1987; Chothia et al., 1989) for antigen binding in V domains of antibodies, may be associated with interdomain interactions with preceding FnIII domains in twitchin repeats.

Having shown that the key residues of the I set profile allow accurate structure prediction, further work is now needed to elucidate the role of these residues in determining the thermodynamics and kinetics of folding of these modules.

Materials and Methods

Gene subcloning

The plasmid containing the Ig 18' gene, pRSET-Ig 18', was constructed by ligation of a sticky-end insert, obtained from PCR amplification followed by NdeI and EcoRI digestion, to the purified, linearized parent vector (pRSETC, Invitrogen) doubly digested at the same restriction sites. The template for PCR amplification was a pGEX-2T (Pharmacia Biotech) plasmid clone, with the gene inserted at the BamHI and EcoRI sites. The ligated plasmid was transformed into Escherichia coli strain TG2 and the recovered plasmid was sequenced.

Protein expression and purification

Uniformly 15N labelled, 13C-15N doubly labelled and 10% 13C labelled Ig 18' were produced in the E. coli strain JM109 carrying the pRSET-Ig 18' plasmid. The cells were grown in M9 minimal medium with the appropriate labelling substrates following the standard procedures. 13C labelled glucose was used as the carbon source in the preparation of the doubly labelled and 10% 13C samples. Cells were grown at 37°C in the corresponding medium containing ampicillin (50 μg/ml) with shaking. IPTG and M13 helper phage containing the T7 polymerase gene controlled by the lac promoter (Invitrogen Corp. XPRESS System kit) were added to final concentrations of 0.2 mM and approximately 10^12 pfu/l, respectively, at A600 = 0.3 to 0.4, and the culture was shaken for another 14 hours, harvested by centrifugation at 4°C, and resuspended in PBS containing 0.2 mM PMSF on ice. The cells were disrupted by sonication and centrifuged at 18,000 rpm to pellet insoluble matter. SDS-PAGE analysis showed that only trace amounts of Ig 18' were in the supernatant and the majority of it was deposited in inclusion bodies. The inclusion pellet was resuspended in cold 50 mM Tris-HCl containing 300 mM sodium chloride (pH 7.2). The suspension was sonicated on ice and the solid matter was pelleted by centrifugation. The inclusion bodies were resuspended in 50 mM acetate buffer containing 4 M guanidine hydrochloride (pH 4.5) and stirred continuously at room temperature for one hour. Insoluble cell debris was removed by centrifugation and the soluble fraction was passed through a 0.2 μm filter and applied to a Pharmacia Hiload 26/60 Superdex 200 Prep grade gel filtration column pre-equilibrated with 50 mM acetate buffer, 4 M guanidine hydrochloride (pH 4.5). Ig 18' was eluted with the same buffer at a flow rate of 3 ml/min and small aliquots of protein-containing fractions were assayed by SDS-PAGE after 20% trichloroacetic acid precipitation. The eluted, essentially pure, protein solution was concentrated to 1 ml in an Amicon Centriprep 10 and was renatured by dilution. The denatured protein was added in 50 μl portions over five minutes to 30 ml of rapidly stirred 25 mM acetate buffer (pH 4.9) at 4°C. The diluted solution was dialysed extensively against water. The dialysed protein was flash frozen in liquid nitrogen and stored at −70°C. Typically 70 mg of pure protein could be obtained per litre of M9. Unlabelled Ig 18' was obtained in a similar manner except that the culture was grown on LB medium and was shaken for a six hour period after induction before harvesting.

Table 4. Twitchin immunoglobulin superfamily domains 1 to 30 aligned with domain 18 and telokin

1. The sequence of telokin is shown in upper case italic letters in the regions that differ from Ig 18'.

2. Key residues are shown in bold type.

3. Domains Ig 1' and Ig 2' have insertions in loop regions that are represented in the Table by the letters x, z, u and j. The sequences of these regions are:
1': z = lggsl
2': x = dggalivm; z = tpvakwmk; u = aifsdlg; j = rgpsssdagqyrcnirndqgetnanlalnf

4. The sequences of Ig 1'–17' and 19'–30' are shown in upper case letters for the regions where the key residues indicate they have the same conformation as Ig 18' and/or telokin. Regions which are uncertain or expected to have different conformations are shown in lower case letters.

5. Residues in strands on the edge of a β-sheet, or at the beginning and end of an interior strand, have side-chains that point towards the interior. Normally a hydrophobic residue is expected and found. However, hydrophilic residues with long side-chains are also possible; they pack the first, hydrophobic part of the side-chain in the interior of the structure and place their polar heads on the surface. In a few of the Ig domains, residues of this kind are found at sites equivalent to 5, 12, 14, 20 and 83 in Ig 18'.

6. The residues at the sites equivalent to 17 in Ig 18' are at the apex of a sharp turn and have φ, ψ torsion angles that produce some steric strain if the residue is not Gly, Asn or Asp. This favours the conservation of Gly at this site but, as can be seen in known immunoglobulin structures, the strain produced by other residues is not so large as to prevent their occurrence or to produce a change in conformation.

NMR sample preparation

The purified protein was concentrated to 1.4 to 2.0 mM in 450 μl with an Amicon Centriprep 3. 90% H2O/10% 2H2O samples were prepared by adding 50 μl of 2H2O followed by adjustment of the pH of the solution to 4.95(±0.05) at 10°C with small aliquots of 0.2 M solutions of NaOH and HCl. 2H2O samples were prepared by adding 5 ml of 2H2O to the concentrated protein followed by reconcentration, and the process was repeated three times. The final p2H of the solution was adjusted to 4.60(±0.05) at 10°C with small aliquots of 0.2 M solutions of NaO2H and 2HCl. The 15N sample for the H/2H exchange experiment was prepared by diluting the concentrated protein with 5 ml of 5 mM deuterated acetate buffer in 2H2O, p2H 4.60(±0.05) at 10°C, and reconcentrating to 500 μl; another 10 ml of the same buffer was then added and the solution reconcentrated as soon as possible.

NMR experiments

Data acquired on the Bruker DMX 600 were processed with standard Bruker UXNMR and, subsequently, the transformed spectra were converted into FELIX 2.30 (Biosym Technologies) matrix format for analysis. Other spectral data were processed and analysed with FELIX 2.30 only. Structures were displayed and analysed with MOLMOL, INSIGHTII, RASMOL and QUANTA.

Homonuclear experiments

1H-1H homonuclear experiments were carried out at 293 K on a Bruker DMX 600 spectrometer equipped with an 8 mm broad band probe, using a 1 mM protein sample in 2H2O. 2D DQF-COSY, TOCSY and NOESY spectra were recorded with 2048 complex data points in t2 and 512 real points in t1 with a spectral width of 9000 Hz in both dimensions. Phase-sensitive spectra were obtained by the time-proportional phase increment method (TPPI). The water signal was suppressed by on-resonance presaturation. NOESY and TOCSY mixing-times were 100 and 50 ms, respectively, and the MLEV-17 (Bax & Davis, 1985) mixing sequence was employed for obtaining Hartmann-Hahn transfer in the TOCSY. Mild Gaussian and sine-bell (π/3) window functions were used in F2 and F1, respectively, and spectra were processed to yield matrices with 4096 (F2) × 1024 (F1) real data points.
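The homonuclear acquisition parameters imply an acquisition time and a digital resolution that can be checked with simple arithmetic. The sketch below assumes a nominal 1H frequency of 600 MHz for the DMX 600 (the exact spectrometer frequency is not stated in the text); the function names are illustrative only.

```python
def acquisition_time_s(complex_points: int, spectral_width_hz: float) -> float:
    """Acquisition time when complex points are sampled at a dwell of 1/SW."""
    return complex_points / spectral_width_hz

def digital_resolution_hz(spectral_width_hz: float, real_points: int) -> float:
    """Hz per point in the processed real spectrum."""
    return spectral_width_hz / real_points

SW_HZ = 9000.0          # spectral width in both dimensions (from the text)
NOMINAL_1H_MHZ = 600.0  # assumption: nominal 1H frequency of a DMX 600

t2_acq = acquisition_time_s(2048, SW_HZ)     # direct dimension
f2_res = digital_resolution_hz(SW_HZ, 4096)  # processed F2 size
f1_res = digital_resolution_hz(SW_HZ, 1024)  # processed F1 size
sw_ppm = SW_HZ / NOMINAL_1H_MHZ              # Hz divided by MHz gives ppm

print(f"t2 acquisition time: {t2_acq * 1e3:.0f} ms")
print(f"F2: {f2_res:.2f} Hz/point, F1: {f1_res:.2f} Hz/point")
print(f"spectral width: {sw_ppm:.1f} ppm")
```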

Heteronuclear experiments

All heteronuclear experiments were carried out at 283 K on a Bruker AMX 500 spectrometer equipped with an inverse triple resonance single axis gradient probe and an external fourth channel, and the protein concentration used was 1.4 to 2 mM. In all the NMR experiments, the carrier frequencies were set at 4.72, 116.3 and 175.2 ppm for 1H, 15N and 13C carbonyl, respectively.

1H-15N experiments were acquired on a uniformly labelled 15N sample in 10% 2H2O and solvent suppression was achieved by the WATERGATE (Piotto et al., 1992) gradient sequence for the 3D TOCSY-HMQC (Marion et al., 1989) and 3D NOESY-HMQC (Marion et al., 1989). During acquisition, 15N was decoupled by the WALTZ-16 sequence and the TPPI method was used for quadrature detection in indirect dimensions. The 1H-15N HSQC (Bodenhausen & Ruben, 1980; Kay et al., 1992) was recorded with two transients per increment, 1024 complex data points in t2 and 256 real points in t1, employing a gradient selected sensitivity enhanced pulse program. The spectrum was collected with spectral widths of 6024.1 Hz (F2) × 2026.3 Hz (F1). The TOCSY-HMQC was recorded with 16 transients per increment, 1024 complex data points in 1H (F3), 57 real points in 15N (F2) and 256 real points in 1H (F1). Data were acquired with spectral widths of 8064.5 Hz (F3), 2000 Hz (F2) and 6500 Hz (F1). TOCSY mixing was achieved by the DIPSI-2 sequence with 55 ms mixing time. The NOESY-HMQC was collected with 16 transients per increment, 1024 complex points in 1H (F3), 62 real points in 15N (F2) and 196 real points in 1H (F1) with spectral widths of 8064.5 Hz (F3), 2000 Hz (F2) and 8064.5 Hz (F1). The mixing time of the experiment was 100 ms. The two 3D experiments were processed to yield matrices with 512 × 128 × 512 real points.

The triple resonance experiments HNCO (Grzesiek & Bax, 1992b) and CBCA(CO)NH (Grzesiek & Bax, 1992a) were recorded with a 13C-15N doubly labelled sample in 10% 2H2O. The gradient enhanced HNCO experiment was acquired with 16 transients per increment, 2048 × 64 × 50 complex data points and spectral widths of 6024.1, 1382.7 and 1572.3 Hz. Water suppression in the CBCA(CO)NH experiment was achieved by the WATERGATE sequence and data were acquired with 16 transients per increment, 1024 × 116 × 38 complex points and spectral widths of 6024.1, 8928.57 and 1572.3 Hz. Quadrature detection in the indirect dimensions in these two experiments was achieved by the States-TPPI method.

The constant time 1H-13C HSQC, 3D HCCH-TOCSY (Bax et al., 1990) and 3D NOESY-HSQC (Muhandiram et al., 1993) were recorded on a 13C-15N doubly labelled sample in 2H2O. The 13C carrier frequencies in these spectra were set at 40 ppm. The two 3D spectra were collected with 16 transients per increment and spectral widths of 7042.2, 7042.2 and 2513.8 Hz. The numbers of complex data points acquired for HCCH-TOCSY and NOESY-HSQC were 1024 × 196 × 54 and 2048 × 196 × 58, respectively. In the TOCSY experiment, the DIPSI-3 sequence was employed for mixing (23 ms) whereas the NOESY mixing time was 100 ms. The HSQC spectrum was acquired with 16 transients per increment, 2048 × 512 complex points and spectral widths of 7042.2 and 2513.8 Hz.

A 1H-13C correlation spectrum was acquired on a 10% 13C labelled sample in 2H2O for stereospecific assignment of valine and leucine methyl groups (Szyperski et al., 1992). It was recorded with 128 transients per increment, 2048 × 1024 complex points and spectral widths of 6329.1 and 4300.2 Hz. The 13C carrier frequency was set at 12.5 ppm.

Structure calculation

Interproton distance restraints

NOEs were derived from the 2D NOESY, 3D 1H-15N NOESY-HMQC and 3D 1H-13C HSQC-NOESY spectra described above. A total of 1117 unique and unambiguous interproton NOEs were extracted and converted into distance restraints according to their intensities. The distance bounds of the restraints were set to 1.8 to 2.7, 1.8 to 3.5 and 1.8 to 5.0 Å for strong, medium and weak NOE intensities, respectively. Upper limits for NOEs from non-stereospecifically assigned methylene groups and rotationally averaged aromatic protons were corrected for by $r^{-6}$ averaging. An additional 0.5 Å was added to the upper limit for each methyl group involved in a constraint.
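The intensity-to-distance conversion described above can be sketched as a small lookup. This is an illustration of the stated bounds and methyl correction only; the names are hypothetical, and the averaging correction for methylene and aromatic protons is not modelled.

```python
from typing import Tuple

# Distance bounds in angstroms for each NOE intensity class (from the text).
NOE_BOUNDS = {
    "strong": (1.8, 2.7),
    "medium": (1.8, 3.5),
    "weak": (1.8, 5.0),
}
METHYL_CORRECTION = 0.5  # added to the upper limit per methyl group involved

def noe_to_restraint(intensity: str, n_methyl_groups: int = 0) -> Tuple[float, float]:
    """Return (lower, upper) distance bounds for a single NOE."""
    lower, upper = NOE_BOUNDS[intensity]
    return lower, upper + METHYL_CORRECTION * n_methyl_groups

print(noe_to_restraint("strong"))                   # (1.8, 2.7)
print(noe_to_restraint("weak", n_methyl_groups=2))  # (1.8, 6.0)
```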

Torsion angle restraints

The backbone φ torsion angle restraints were derived from the 3JHNHα coupling constant data, which were determined by the method of Stonehouse & Keeler (1995). The restraints were set to −120(±40)° for 3JHNHα greater than 8.0 Hz, and −50(±30)° for those less than 5.0 Hz. No ψ angle restraints were used. χ1 angle restraints and stereospecific assignments for β-methylene groups were derived from the 3JHαHβ coupling patterns in the homonuclear DQF-COSY and the intra-residue HN-Hβ NOE strengths. The allowed deviation for the χ1 angle was set to ±50°. A total of 36 φ and 12 χ1 restraints were employed in the structure calculation.
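The coupling-constant thresholds above amount to a simple three-way rule, sketched here. The function name is hypothetical, and the structural labels in the comments are general NMR interpretation rather than statements from the text.

```python
from typing import Optional, Tuple

def backbone_torsion_restraint(j_hnha_hz: float) -> Optional[Tuple[float, float]]:
    """Map a 3J(HN,Ha) coupling in Hz to (target, tolerance) in degrees, or None."""
    if j_hnha_hz > 8.0:
        return (-120.0, 40.0)  # large coupling, typical of extended strands
    if j_hnha_hz < 5.0:
        return (-50.0, 30.0)   # small coupling, typical of helical turns
    return None                # intermediate couplings: no restraint applied

for j in (9.2, 6.5, 4.1):
    print(j, backbone_torsion_restraint(j))
```

Couplings between 5.0 and 8.0 Hz are left unrestrained because they do not discriminate between the two backbone regions.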

Hydrogen bond restraints

Hydrogen bond restraints were included for backbone amides that exchanged at intermediate to slow rates (protected for more than three hours) in the exchange experiment. Restraints were only included for amide protons involved in regular secondary structure. Restraints were employed after initial rounds of structure calculation, and only when the hydrogen bond acceptors were unambiguously located. Two distance restraints were employed for each hydrogen bond: HN···O (1.8 to 2.0 Å) and N···O (2.7 to 3.0 Å). A total of 90 distance restraints were derived from hydrogen bonds.
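The restraint counts given in this section can be reconciled with the totals quoted in the abstract (1207 distance and 48 dihedral restraints). The short tally below is bookkeeping only; the variable names are illustrative.

```python
RESTRAINTS_PER_HBOND = 2          # two distance restraints per hydrogen bond
hbond_distance_restraints = 90    # from the hydrogen bond section
noe_distance_restraints = 1117    # from the interproton distance section
backbone_torsion_restraints = 36  # from the torsion angle section
chi1_restraints = 12              # from the torsion angle section

n_hbonds = hbond_distance_restraints // RESTRAINTS_PER_HBOND
total_distance = noe_distance_restraints + hbond_distance_restraints
total_dihedral = backbone_torsion_restraints + chi1_restraints

print(n_hbonds, total_distance, total_dihedral)  # 45 1207 48
```

The 90 hydrogen bond restraints therefore correspond to 45 hydrogen bonds, and the sums match the 1207 distance and 48 dihedral restraints stated in the abstract.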

Structure calculation

Structure calculations were based on the restraints described above, excluding χ1 restraints and hydrogen bond restraints which could not be assigned to regular β-sheet structure on the basis of NOE patterns. All peptide bonds were restrained to be planar and trans, except those preceding proline residues. The bond between Ala29 and Pro30 was restrained to be planar and cis on the basis of the existence of a medium Hα(i)-Hα(i + 1) NOE. Other proline peptide bonds were treated as freely rotatable during the initial runs. Based on the preliminary structures, χ1 angle and hydrogen bond restraints were introduced into the restraint sets and unassigned proline peptide bonds were fixed to trans and planar. Structure calculations were carried out employing a hybrid distance geometry-simulated annealing protocol using the program XPLOR. Quadratic van der Waals' repulsion terms and square-well quadratic potential terms, for inter-proton distance and torsion angles, were included in the force field. The van der Waals' hard sphere radii were set to 0.8 times the CHARMm values. Attractive empirical energy terms were excluded from the protocol. Final structures were obtained after 1000 cycles of restrained Powell energy minimization.
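A square-well quadratic term of the kind mentioned above is zero while a distance lies within its bounds and rises quadratically outside them. The following is a minimal sketch of that functional form only: the force constant is an arbitrary placeholder, and the code is not the XPLOR implementation.

```python
def square_well_energy(d: float, lower: float, upper: float, k: float = 1.0) -> float:
    """Square-well quadratic penalty for a distance d with bounds [lower, upper]."""
    if d > upper:
        return k * (d - upper) ** 2  # violation of the upper bound
    if d < lower:
        return k * (lower - d) ** 2  # violation of the lower bound
    return 0.0                       # inside the well: no penalty

print(square_well_energy(2.5, 1.8, 2.7))  # inside a strong-NOE well
print(square_well_energy(3.7, 1.8, 2.7))  # about 1 angstrom beyond the upper bound
```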

Equilibrium denaturation

Fluorescence spectroscopy

Unfolding was monitored by fluorescence spectroscopy with excitation at 280 nm and emission at 320 nm. For each data point collected, 100 μl of a solution of ≈9 μM Ig 18' in 450 mM buffer (sodium acetate pH 4.5 or 5.0, or potassium phosphate pH 6.0 or 7.0) was added to an Eppendorf tube containing 800 μl of the appropriate urea solution (final [protein] = 1 μM, final [buffer] = 50 mM). Urea solutions were prepared gravimetrically. The solution was pre-incubated at 20°C for at least two hours. Fluorescence was measured in thermostatted cuvette holders at 20°C, with temperature monitored by a thermocouple in the cuvette held above the light beam.
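The final concentrations quoted for each fluorescence sample follow from a one-step dilution. The check below assumes the volumes are in microlitres and the protein stock is about 9 μM, which is the reading under which the stated final values are mutually consistent; the helper name is hypothetical.

```python
def dilute(stock_conc: float, stock_vol: float, added_vol: float) -> float:
    """Concentration after mixing stock_vol of stock with added_vol of diluent."""
    return stock_conc * stock_vol / (stock_vol + added_vol)

# 100 ul of stock (about 9 uM protein in 450 mM buffer) into 800 ul of urea.
final_protein_uM = dilute(9.0, 100.0, 800.0)
final_buffer_mM = dilute(450.0, 100.0, 800.0)

print(final_protein_uM, final_buffer_mM)  # 1.0 50.0
```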

Circular dichroism

Unfolding was monitored by following optical rotation at 220 nm at 20°C. Each solution contained ≈4 μM Ig 18' and the appropriate urea concentration in 50 mM phosphate buffer (pH 7.0).

Analysis of equilibrium denaturation data

The data were fitted to equation (2), which assumes that the fluorescence (or ellipticity) of the folded and unfolded states depends linearly on denaturant concentration (Clarke & Fersht, 1993):

$$F = \frac{(\alpha_F + \beta_F[D]) + (\alpha_U + \beta_U[D])\exp\{m([D] - [D]_{50\%})/RT\}}{1 + \exp\{m([D] - [D]_{50\%})/RT\}} \qquad (2)$$

where F is the fluorescence (or ellipticity) at the given [denaturant], $\alpha_F$ and $\alpha_U$ are the intercepts and $\beta_F$ and $\beta_U$ are the slopes of the baselines at low (F) and high (U) denaturant concentrations, respectively, [D] is the concentration of the denaturant, $[D]_{50\%}$ is the concentration of denaturant at which half of the protein is unfolded, m is the slope of the transition, R is the gas constant and T is temperature in K. The data were fitted to this equation by non-linear least squares analysis using the general curve fit option of the KaleidaGraph (Abelbeck Software) program, which gives the calculated standard errors for individual experimental measurements of m and $[D]_{50\%}$.
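Equation (2) can be written directly as a function, which may help when reproducing the fit outside KaleidaGraph. This is a minimal sketch under stated assumptions: the baseline parameters and midpoint below are hypothetical illustration values, the gas constant is expressed in kcal units to match the m-value, and the temperature corresponds to the 20°C used in the experiments.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def two_state_signal(d, a_f, b_f, a_u, b_u, m, d50, temp_k=293.15):
    """Equation (2): observed signal at denaturant concentration d (M)."""
    x = math.exp(m * (d - d50) / (R_KCAL * temp_k))
    folded = a_f + b_f * d    # folded-state baseline
    unfolded = a_u + b_u * d  # unfolded-state baseline
    return (folded + unfolded * x) / (1.0 + x)

# Hypothetical flat baselines and midpoint, chosen only for illustration.
params = dict(a_f=100.0, b_f=0.0, a_u=20.0, b_u=0.0, m=1.3, d50=3.0)
for d in (0.0, 3.0, 8.0):
    print(d, round(two_state_signal(d, **params), 2))
```

At the midpoint the exponential term equals one, so the signal is the mean of the two baselines. A function of this shape could be passed to any non-linear least squares routine to recover m and the midpoint from measured data.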

It has been shown experimentally that the free energy of unfolding of proteins in the presence of denaturant ($\Delta G^{D}_{\mathrm{U-F}}$) is linearly related to the concentration of denaturant (Pace, 1986) (equation (3)),

$$\Delta G^{D}_{\mathrm{U-F}} = \Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}} - m[D] \qquad (3)$$

thus the value of the apparent free energy of unfolding in the absence of denaturant, $\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}}$, can be determined using equation (4):

$$\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}} = m[D]_{50\%} \qquad (4)$$
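The step from equation (3) to equation (4) can be made explicit. At the transition midpoint the folded and unfolded states are equally populated, so the free energy of unfolding at that denaturant concentration is zero:

```latex
\begin{align*}
\Delta G^{D}_{\mathrm{U-F}}\Big|_{[D]=[D]_{50\%}} &= 0 \\
0 &= \Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}} - m[D]_{50\%} \\
\Delta G^{\mathrm{H_2O}}_{\mathrm{U-F}} &= m[D]_{50\%}
\end{align*}
```

As a purely illustrative check using the values quoted under Folding studies (4.0 kcal mol−1 and m = 1.3 kcal mol−1 M−1), this relation would place the midpoint near 4.0/1.3 ≈ 3.1 M urea. That number is computed here and is not reported in the text.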

Unfolding experiments

All the rapid mixing experiments were performed in an Applied Photophysics SF.18MV stopped flow apparatus, and followed by fluorescence, with excitation at 280 nm and emission monitored at wavelengths >320 nm using a cut-off filter. Unfolding was initiated by rapidly mixing one volume of protein (≈9 μM) with ten volumes of a concentrated urea solution at 20°C. Both solutions contained 50 mM phosphate buffer (pH 7.0) (30.5 mM Na2HPO4, 19.5 mM NaH2PO4). The data from at least four experiments were averaged and were fitted to the following equation by non-linear least squares analysis using the general curve fit option of the KaleidaGraph program.

$$F(t) = C_1 + F_0 e^{-k_u t} - C_2 t \qquad (5)$$

where F(t) is the observed fluorescence, $F_0$ is the fluorescence at time t = 0, $k_u$ is the rate constant of unfolding, $C_1$ is the offset for the final fluorescence, and $C_2 t$ is a term allowing for baseline instability. The residuals of the experimental data were plotted to check for systematic deviations.
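Equation (5) is a single exponential with a linear drift term, and it can be sketched as follows. The parameter values are hypothetical and chosen only to show the behaviour of the model; with the drift term set to zero, half of the exponential amplitude has decayed after one half-life, ln 2 / k_u.

```python
import math

def unfolding_trace(t, c1, f0, k_u, c2):
    """Equation (5): F(t) = C1 + F0*exp(-k_u*t) - C2*t."""
    return c1 + f0 * math.exp(-k_u * t) - c2 * t

# Hypothetical parameters for illustration only (no baseline drift).
c1, f0, k_u, c2 = 2.0, 5.0, 0.1, 0.0
half_life = math.log(2) / k_u

start = unfolding_trace(0.0, c1, f0, k_u, c2)            # equals C1 + F0
at_half_life = unfolding_trace(half_life, c1, f0, k_u, c2)  # close to C1 + F0/2
print(start, at_half_life)
```

Note that, as the model is written, the signal at t = 0 is the sum of the offset and the exponential amplitude.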

The unfolding was also monitored using circular dichroism (CD). The unfolding reactions were initiated by manual injection of 450 μl of a 30 μM solution of Ig 18' containing 50 mM phosphate buffer at pH 7.0 into 4500 μl of phosphate buffered urea solution (pH 7.0). The curves were fitted to a single exponential function (equation (5)).

Coordinates have been deposited in the Brookhaven Data Bank. PDB ID codes: 1WIT (restrained minimized average structure) and 1WIU (ensemble of 30 structures).

Acknowledgements

We thank Professor Alan Fersht for helpful discussion. S.F. is supported by the Croucher Foundation. M.B. is supported by a Zeneca/DTI/MRC LINK program.

References

Axe, D. D., Foster, N. W. & Fersht, A. R. (1996). Active barnase variants with completely random hydrophobic cores. Proc. Natl Acad. Sci. USA, 93, 2172–2175.

Bateman, A., Eddy, S. R. & Chothia, C. (1996). Members of the immunoglobulin superfamily in bacteria. Protein Sci. 5, 1936–1942.

Bax, A. & Davis, D. G. (1985). MLEV-17-based two-dimensional homonuclear magnetization transfer spectroscopy. J. Magn. Reson. 65, 355–360.

Bax, A., Clore, G. M. & Gronenborn, A. M. (1990). 1H-1H correlation via isotropic mixing of 13C magnetization, a new three-dimensional approach for assigning 1H and 13C spectra of 13C-enriched proteins. J. Magn. Reson. 88, 425–431.

Benian, G. M., Kiff, J. E., Neckelmann, N., Moerman, D. G. & Waterson, R. H. (1989). Sequence of an unusually large protein implicated in regulation of myosin activity in C. elegans. Nature, 342, 45–50.

Benian, G. M., L’Hernault, S. W. & Morris, M. E. (1993). Additional sequence complexity in the muscle gene unc-22 and its encoded protein twitchin of Caenorhabditis elegans. Genetics, 134, 1097–1104.

Bodenhausen, G. & Ruben, D. L. (1980). Natural abundance nitrogen-15 NMR by enhanced heteronuclear spectroscopy. Chem. Phys. Letters, 69, 185–188.

Bork, P., Holm, L. & Sander, C. (1994). The immunoglobulin fold. J. Mol. Biol. 242, 309–320.

Chothia, C. & Lesk, A. M. (1987). Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol. 196, 901–917.

Chothia, C., Lesk, A. M., Tramontano, A., Levitt, M., Smith-Gill, S. J., Air, G., Sheriff, S., Padlan, E. A., Davies, D., Tulip, W. R., Colman, P. M., Spinelli, S., Alzari, P. M. & Poljak, R. J. (1989). Conformations of immunoglobulin hypervariable regions. Nature, 342, 877–883.

Clarke, J. & Fersht, A. R. (1993). Engineering disulfide bonds as probes of the folding pathway of barnase: increasing the stability of proteins against the rate of denaturation. Biochemistry, 32, 4322–4329.

Erickson, H. P. (1994). Reversible unfolding of fibronectin type III and immunoglobulin domains provides the structural basis for stretch and elasticity of titin and fibronectin. Proc. Natl Acad. Sci. USA, 91, 10114–10118.

Eriksson, A. E., Baase, W. A., Zhang, X. J., Heinz, D. W., Blaber, M., Baldwin, E. P. & Matthews, B. W. (1992). Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect. Science, 255, 178–183.

Grzesiek, S. & Bax, A. (1992a). Correlating backbone amide and side chain resonances in larger proteins by multiple relayed triple resonance NMR. J. Am. Chem. Soc. 114, 6291–6293.

Grzesiek, S. & Bax, A. (1992b). Improved 3D triple-resonance NMR techniques applied to a 31 kDa protein. J. Magn. Reson. 96, 432–440.

Harpaz, Y. & Chothia, C. (1994). Many of the immunoglobulin superfamily domains in cell adhesion molecules and surface receptors belong to a new structural set which is close to that containing variable domains. J. Mol. Biol. 238, 528–539.

Heierhorst, J., Probst, W. C., Kohanski, R. A., Buku, A. & Weiss, K. R. (1995). Phosphorylation of myosin regulatory light chains by the molluscan twitchin kinase. Eur. J. Biochem. 233, 426–431.

Heierhorst, J., Kobe, B., Feil, S. C., Parker, M. W., Benian, G. M., Weiss, K. R. & Kemp, B. E. (1996). Ca2+/S100 regulation of giant protein kinases. Nature, 380, 636–639.

Holden, H. M., Ito, M., Hartshorne, D. J. & Rayment, I. (1992). X-ray structure determination of telokin, the C-terminal domain of myosin light chain kinase, at 2.8 Å resolution. J. Mol. Biol. 227, 840–851.

Hu, S.-H., Parker, M. W., Lei, J. Y., Wilce, M. C. J., Benian, G. M. & Kemp, B. E. (1994). Insights into autoregulation from the crystal structure of twitchin kinase. Nature, 369, 581–584.

Improta, S., Politou, A. & Pastore, A. (1996). Immunoglobulin-like modules from titin I-band: extensible components of muscle elasticity. Structure, 4, 323–337.

Jones, E. Y., Harlos, K., Bottomley, M. J., Robinson, R. C., Driscoll, P. C., Edwards, R. M., Clements, J. M., Dudgeon, T. J. & Stuart, D. I. (1995). Crystal-structure of an integrin-binding fragment of vascular cell-adhesion molecule-1 at 1.8 angstrom resolution. Nature, 373, 539–544.

Karpusas, M., Baase, W. A., Matsumura, M. & Matthews, B. W. (1989). Hydrophobic packing in T4 lysozyme probed by cavity-filling mutants. Proc. Natl Acad. Sci. USA, 86, 8237–8241.

Kay, L. E., Keifer, P. & Saarinen, T. (1992). Pure absorption gradient enhanced heteronuclear single quantum correlation spectroscopy with improved sensitivity. J. Am. Chem. Soc. 114, 10663–10665.

Kellis, J. T., Nyberg, K., Sali, D. & Fersht, A. R. (1988). Contribution of hydrophobic interactions to protein stability. Nature, 333, 784–786.

Kraulis, P. (1991). MolScript, a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.

Labeit, S. & Kolmerer, B. (1995). Titins: giant proteins in charge of muscle ultrastructure and elasticity. Science, 270, 293–296.

Labeit, S., Gautel, M., Lakey, A. & Trinick, J. (1992). Towards a molecular understanding of titin. EMBO J. 11, 1711–1716.

Lei, J., Tang, X., Chambers, T. C., Pohl, J. & Benian, G. M. (1994). Protein kinase domain of twitchin has protein kinase activity and an autoinhibitory region. J. Biol. Chem. 269, 21078–21085.

Lesk, A. M. & Chothia, C. (1982). Evolution of proteins formed by β-sheets II. The core of the immunoglobulin domains. J. Mol. Biol. 160, 325–342.

Lim, W. A. & Sauer, R. T. (1991). The role of internal packing interactions in determining the structure and stability of a protein. J. Mol. Biol. 219, 359–376.

Lubienski, M. J., Bycroft, M., Freund, S. M. V. & Fersht, A. R. (1994). Three-dimensional solution structure and 13C assignments of barstar using nuclear magnetic resonance spectroscopy. Biochemistry, 33, 8866–8877.

Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M. & Clore, G. M. (1989). Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H-15N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy. Application to interleukin 1β. Biochemistry, 28, 6150–6156.

Milla, M. E. & Sauer, R. T. (1995). Critical side-chain interactions at a subunit interface in the arc repressor dimer. Biochemistry, 34, 3344–3351.

Muhandiram, D. R., Farrow, N. A., Xu, G. Y., Smallcombe, S. H. & Kay, L. E. (1993). A gradient 13C NOESY-HSQC experiment for recording NOESY spectra of 13C-labeled proteins dissolved in H2O. J. Magn. Reson. ser. B, 102, 317–321.

Okagaki, T., Weber, F. E., Fischman, D. A., Vaughan, K. T., Mikawa, T. & Reinach, F. C. (1993). The major myosin-binding domain of skeletal muscle MyBP-C (C protein) in the COOH-terminal, immunoglobulin C2 motif. J. Cell. Biol. 123, 619–626.

Pace, C. N. (1986). Determination and analysis of urea and guanidinium denaturation curves. Methods Enzymol. 131, 266–279.

Pfuhl, M. & Pastore, A. (1995). Tertiary structure of an immunoglobulin-like domain from the giant muscle protein titin: a new member of the I set. Structure, 3, 391–401.

Piotto, M., Saudek, V. & Sklenar, V. (1992). Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J. Biomol. NMR, 2, 661–665.

Politou, A. S., Gautel, M., Joseph, C. & Pastore, A. (1994a). Immunoglobulin-type domains of titin are stabilized by amino-terminal extension. FEBS Letters, 352, 27–31.

Politou, A. S., Gautel, M., Pfuhl, M., Labeit, S. & Pastore, A. (1994b). Immunoglobulin-type domains of titin: same fold, different stability? Biochemistry, 33, 4730–4737.

Politou, A. S., Gautel, M., Improta, S., Vangelista, L. & Pastore, A. (1996). The elastic I-band region of titin is assembled in a “modular” fashion by weakly interacting Ig-like domains. J. Mol. Biol. 255, 604–616.

Probst, W. C., Cropper, E. C., Heierhorst, J., Hooper, S. L., Jaffe, H., Vilim, F., Beushausen, S., Kupfermann, I. & Weiss, K. R. (1994). cAMP-dependent phosphorylation of Aplysia twitchin may mediate modulation of muscle contractions by neuropeptide cotransmitters. Proc. Natl Acad. Sci. USA, 91, 8487–8491.

Serrano, L., Kellis, J. T., Jr, Cann, P., Matouschek, A. & Fersht, A. R. (1992). The folding of an enzyme II. Substructure of barnase and the contribution of different interactions to protein stability. J. Mol. Biol. 224, 783–804.

Shirinsky, V. P., Vorotnikov, A. V., Birukov, K. G., Nanaev, A. K., Collinge, M., Lukas, T. J., Sellers, J. R. & Watterson, D. M. (1993). A kinase-related protein stabilizes unphosphorylated smooth muscle myosin minifilaments in the presence of ATP. J. Biol. Chem. 268, 16578–16583.

Stonehouse, J. & Keeler, J. (1995). A convenient and accurate method for the measurement of the values of spin-spin coupling constants. J. Magn. Reson. ser. A, 112, 43–57.

Szyperski, T., Neri, D., Leiting, B., Otting, G. & Wüthrich, K. (1992). Support of 1H NMR assignments in proteins by biosynthetically directed fractional 13C-labelling. J. Biomol. NMR, 2, 323–334.

Thomsen, N. K., Soroka, V., Jensen, P. H., Berezin, V., Kiselyov, V. V., Bock, E. & Poulsen, F. M. (1996). The three-dimensional structure of the first domain of neural cell adhesion molecule. Nature Struct. Biol. 3, 581–585.

Williams, A. F. & Barclay, A. N. (1988). The immunoglobulin superfamily – domains for cell surface recognition. Annu. Rev. Immunol. 6, 381–405.

Wüthrich, K. (1986). NMR of Proteins and Nucleic Acids. John Wiley & Sons, New York.

Edited by R. Huber

(Received 24 June 1996; received in revised form 23 September 1996; accepted 27 September 1996)

Supplementary material, comprising a table of resonance assignments, is available from JMB Online.