Probing protein structure by limited proteolysis

23
Review Probing protein structure by limited proteolysis * Angelo Fontana ½ , Patrizia Polverino de Laureto, Barbara Spolaore, Erica Frare, Paola Picotti and Marcello Zambonin CRIBI Biotechnology Centre, University of Padua, Padua, Italy Received: 10 May, 2004; accepted: 12 May, 2004 Key words: limited proteolysis, protein domains, complementing fragments, protein flexibility, molten globule, mass spectrometry, apomyoglobin, lysozyme, a-lactalbumin, cytochrome c, human growth hormone Limited proteolysis experiments can be successfully used to probe conformational features of proteins. In a number of studies it has been demonstrated that the sites of limited proteolysis along the polypeptide chain of a protein are characterized by enhanced backbone flexibility, implying that proteolytic probes can pinpoint the sites of local unfolding in a protein chain. Limited proteolysis was used to analyze the partly folded (molten globule) states of several proteins, such as apomyoglobin, a-lactalbumin, calcium-binding lysozymes, cytochrome c and human growth hor- mone. These proteins were induced to acquire the molten globule state under spe- cific solvent conditions, such as low pH. In general, the protein conformational fea- tures deduced from limited proteolysis experiments nicely correlate with those deriv- ing from other biophysical and spectroscopic techniques. Limited proteolysis is also most useful for isolating protein fragments that can fold autonomously and thus be- have as protein domains. Moreover, the technique can be used to identify and pre- pare protein fragments that are able to associate into a native-like and often func- tional protein complex. Overall, our results underscore the utility of the limited pro- teolysis approach for unravelling molecular features of proteins and appear to Vol. 51 No. 2/2004 299–321 QUARTERLY * Presented as invited lecture at the 29 th Congress of the Federation of European Biochemical Societies, Warsaw, Poland, 26 June–1 July 2004. ½ Corresponding author: Prof. Angelo Fontana, CRIBI Biotechnlogy Centre, University of Padua, Viale G. Colombo 3, 35121 Padua, Italy; phone (39 049) 827 6156; fax (39 049) 827 6159; e-mail: [email protected] Abbreviations: AFU, autonomous folding unit; apoMb, apomyoglobin; CD, circular dichroism; CYT, cytochrome c; E:S, enzyme to substrate ratio; HD, hydrogen/deuterium; hGH, human growth hor- mone; LA, a-lactalbumin; LYS, lysozyme; MG, molten globule; NMR, nuclear magnetic resonance; 3D, three-dimensional; nicked protein, a protein with a single peptide bond cleaved.

Transcript of Probing protein structure by limited proteolysis

Review

Probing protein structure by limited proteolysis�

Angelo Fontana�, Patrizia Polverino de Laureto, Barbara Spolaore, Erica Frare,Paola Picotti and Marcello Zambonin

CRIBI Biotechnology Centre, University of Padua, Padua, Italy

Received: 10 May, 2004; accepted: 12 May, 2004

Key words: limited proteolysis, protein domains, complementing fragments, protein flexibility,molten globule, mass spectrometry, apomyoglobin, lysozyme, �-lactalbumin, cytochrome c,human growth hormone

Limited proteolysis experiments can be successfully used to probe conformationalfeatures of proteins. In a number of studies it has been demonstrated that the sitesof limited proteolysis along the polypeptide chain of a protein are characterized byenhanced backbone flexibility, implying that proteolytic probes can pinpoint thesites of local unfolding in a protein chain. Limited proteolysis was used to analyzethe partly folded (molten globule) states of several proteins, such as apomyoglobin,�-lactalbumin, calcium-binding lysozymes, cytochrome c and human growth hor-mone. These proteins were induced to acquire the molten globule state under spe-cific solvent conditions, such as low pH. In general, the protein conformational fea-tures deduced from limited proteolysis experiments nicely correlate with those deriv-ing from other biophysical and spectroscopic techniques. Limited proteolysis is alsomost useful for isolating protein fragments that can fold autonomously and thus be-have as protein domains. Moreover, the technique can be used to identify and pre-pare protein fragments that are able to associate into a native-like and often func-tional protein complex. Overall, our results underscore the utility of the limited pro-teolysis approach for unravelling molecular features of proteins and appear to

Vol. 51 No. 2/2004

299–321

QUARTERLY

�Presented as invited lecture at the 29th Congress of the Federation of European Biochemical Societies,Warsaw, Poland, 26 June–1 July 2004.

�Corresponding author: Prof. Angelo Fontana, CRIBI Biotechnlogy Centre, University of Padua, VialeG. Colombo 3, 35121 Padua, Italy; phone (39 049) 827 6156; fax (39 049) 827 6159; e-mail:[email protected]

Abbreviations: AFU, autonomous folding unit; apoMb, apomyoglobin; CD, circular dichroism; CYT,cytochrome c; E:S, enzyme to substrate ratio; HD, hydrogen/deuterium; hGH, human growth hor-mone; LA, �-lactalbumin; LYS, lysozyme; MG, molten globule; NMR, nuclear magnetic resonance; 3D,three-dimensional; nicked protein, a protein with a single peptide bond cleaved.

prompt its systematic use as a simple first step in the elucidation of structure-dynam-ics-function relationships of a novel and rare protein, especially if available in min-ute amounts.

A variety of spectroscopic techniques areavailable for monitoring conformational tran-sitions of proteins in solution, the most com-monly used being circular dichroism (CD).Far-UV CD measurements allow to evaluatethe overall features of the secondary struc-ture of proteins, as well as to quantify the rel-ative proportions of �-helix, �-sheet and ran-dom coil. Tertiary structure and the micro-environment of aromatic chromophores in aprotein can be monitored by near-UV CD, flu-orescence emission and differential absorp-tion in the UV region. Also infrared andRaman spectroscopy can be used to estimateconformational features and transitions ofproteins. Nowadays, the most useful and in-formative technique for the structural eluci-dation of (small) proteins in solution is NMRspectroscopy. However, usually NMR re-mains plagued by heavy instrumentation re-quirements, needs a millimolar concentrationof a non-aggregating protein solution and, inaddition, is of limited success in the detailedanalysis of partly folded and fluctuatingstates of proteins due to resonance broaden-ing and/or lack of sufficient chemical shiftdispersion. Even X-ray crystallography can beutilized for protein structure analysis only ifsuitable protein crystals are available and,moreover, it is of no use with dynamic proteinsystems. Therefore, no one technique is fullysuperior to all others and each one has advan-tages and drawbacks.In this review we will summarise the results

of our studies aimed to use limited proteoly-sis experiments for probing molecular fea-tures of proteins in their native or partlyfolded states. The key features of the site(s) ofprotein cleavage by proteolytic probes are dis-cussed. Even the aggregation process of pro-teins to protofibrils and amyloid deposits canbe monitored by this technique. Moreover,protein fragments that can fold autono-mously and/or can associate to form na-

tive-like nicked protein species have been pre-pared by limited proteolysis of several pro-teins. It is shown that the simple biochemicaltechnique of limited proteolysis can provideimportant information on the structure anddynamics of proteins, thus complementingthe results that can be obtained by usingother biophysical and spectroscopic tech-niques. We have not attempted here to coverthe vast amount of literature dealing withproteolysis of proteins, so that the readerlikely will find a personal selection of subjectsand omissions as well. Rather we will presenthere what we consider the most significant re-sults of our studies conducted over the pasttwo decades. The reader may find additionalinformation in our previous review articlesdealing with the limited proteolysis phenome-non (Fontana, 1989; Fontana et al., 1989;1993; 1997a; 1999).

THE RATIONALE OF THE

TECHNIQUE

Proteolysis of a protein substrate can occuronly if the polypeptide chain can bind andadapt to the specific stereochemistry of theprotease’s active site (Schechter & Berger,1967). However, since the active sites of pro-teases have not been designed by nature to fitthe specific sequence and fixed stereo-chemistry of a stretch of at least 8–10 amino-acid residues of a particular globular protein,an induced-fit mechanism of adaptation of theprotein substrate to the active site of the pro-tease is required for binding and formation ofthe transition state of the hydrolytic reaction(Herschlag, 1988). Therefore, the native rigidstructure of a globular protein cannot act assubstrate for a protease, as documented bythe fact that folded proteins under physiologi-cal conditions are rather resistant to proteoly-

300 A. Fontana and others 2004

sis. This is no longer the case when the fullyunfolded state (U) of a globular protein existsat equilibrium with the native state (N). How-ever, the N � U equilibrium is much shiftedtowards the native state under physiologicalconditions, according to the Boltzmann rela-tionship �G = –RTln[U]/[N], where �G is5–15 kcal/mol. Therefore, only a tiny fraction(10–6–10–9) of protein molecules is in the Ustate that is suitable for proteolysis. Conse-quently, native globular proteins are ratherresistant to proteolytic degradation, as a re-sult of the fact that the N � U equilibrium ac-tually dictates and regulates the rate of prote-olysis.Nevertheless, as shown in the Scheme of

Fig. 1, even native globular proteins can be at-tacked by a protease and, in a number ofcases, it has been shown that the peptidebond fission occurs only at one (or a few) pep-tide bond(s). This results from the fact that aglobular protein is not a static entity as can beinferred by a picture of its crystallographi-cally determined 3D-structure, but instead isa dynamic system capable of fluctuationsaround its average native state at the level ofboth side chains and polypeptide backbone.Indeed, crystallographers analyze this pro-tein mobility in terms of B-factor for both sidechains and C�-backbone (Frauenfelder et al.,1979; Sternberg et al., 1979; Ringe & Petsko,1985). The main chain B-factor is a measureof average displacements of the polypeptidechain from its native structure, so that it canexperience displacements leading to some lo-cal unfolding. It can be envisaged that thesehigher energy, locally unfolded states arethose required for a native protein to be at-tacked by a proteolytic enzyme. Evidence forthis mechanism of local unfolding requiredfor limited proteolysis has been provided bydemonstrating a close correspondence be-tween sites of limited proteolysis and sites ofhigher backbone displacements in the316-residue polypeptide chain of thermolysin(Fontana et al., 1986). It is plausible to sug-gest that limited proteolysis derives also from

the fact that a specific chain segment of thefolded protein substrate is sufficiently ex-posed to bind at the active site of the prote-ase. However, the notion of exposure/protru-sion/accessibility is a required property, butclearly not at all sufficient to explain the se-lective hydrolysis of just one peptide bond,since it is evident that even in a small globu-lar protein there are many exposed sites (theall protein surface) which could be targets ofproteolysis. Instead, enhanced chain flexibil-ity (segmental mobility) appears to be the keyfeature of the site(s) of limited proteolysis(Fontana et al., 1986; 1997a).

The results obtained with thermolysin(Fontana et al., 1986) are in line with thosederived from limited proteolysis experimentsconducted on a variety of other proteins ofknown three-dimensional (3D) structure(Fontana et al., 1993; 1997a). In many cases,limited proteolysis was observed to occur atsites of the polypeptide chain displaying highsegmental mobility or poorly resolved in theelectron density map, implying significantstatic/dynamic disorder. Therefore, it wasconcluded that limited proteolysis of a globu-

Vol. 51 Probing protein structure by limited proteolysis 301

Figure 1. Schematic view of the mechanism ofproteolysis of a globular protein.

A dual mechanism of protein degradation is shown, theone involving as substrate for proteolysis the (fully) un-folded protein and the other the native form of the pro-tein. In this last case proteolysis is limited and occursat flexible site(s), leading to a nicked protein speciesthat can unfold and then be degraded to small pep-tides.

lar protein occurs at flexible loops and, in par-ticular, that chain segments in a regular sec-ondary structure (such as helices) are notsites of limited proteolysis. Indeed, Hubbardet al. (1991; 1994) conducted modelling stud-ies of the conformational changes requiredfor proteolytic cleavages and concluded thatthe sites of limited proteolysis require a largeconformational change (local unfolding) of achain segment of up to 12 residues (Hubbard,1998). A possible explanation of the fact thathelices and, in general, elements of regularsecondary structure are not easily hydrolyzedby proteolytic enzymes can be given also onthe basis of energetic considerations. If prote-olysis is occurring at the centre of the helicalsegment, likely the helix is fully destroyed byend-effects and consequently all hydrogenbonds, which cooperatively stabilize it, arebroken. On the other hand, a peptide bond fis-sion at a disordered flexible site likely doesnot change much the energetics of that site,since the peptide hydrolysis can easily becompensated by some hydrogen bonds withwater (Krokoszynska & Otlewski, 1996).Therefore, it can be proposed that proteolysisof rigid elements of secondary structure isthermodynamically very disadvantageous.The limited proteolysis approach for prob-

ing protein conformation implies that theproteolytic event should be dictated by thestereochemistry and flexibility of the proteinsubstrate and not by the specificity of the at-tacking protease (Mihalyi, 1978; Price &Johnson, 1990; Fontana et al., 1986; 1993;1997a; 1997b; 1999; Hubbard et al., 1994;Hubbard, 1998). To this aim, the most suit-able proteases are those displaying broad sub-strate specificity, such as subtilisin, thermo-lysin, proteinase K and pepsin (Bond, 1990).These endopeptidases display a moderatepreference for hydrolysis at hydrophobic orneutral amino-acid residues, but often cleav-ages occur at other residues as well. The rec-ommended approach is to perform trial exper-iments of proteolysis of the protein of interestin order to find out the most useful protease,

the optimal protein substrate:protease (E:S)ratio and the effect of temperature and timeof incubation (Fontana et al., 1999). Possibleways to control proteolysis is by using a lowconcentration of protease, short reactiontimes and low temperature. It is not easy topredict in advance the most useful experimen-tal conditions for conducting a limited prote-olysis experiment, since these depend uponthe structure, dynamics, stability/rigidityproperties of the protein substrate and fromthe actual aim of the experiment, i.e., identifi-cation of the sites of protein flexibility, isola-tion of the rigid core of the protein or prepa-ration of a nicked protein (see below). In typi-cal experiments of limited proteolysis, it hasbeen found that an E:S ratio of 1:100 (byweight) is recommended, but occasionallyboth 1:20 or 1:5000 can be used. This resultsfrom the fact that there is a great variation inthe rate of the selective peptide fission in aglobular protein, requiring seconds or daysfor the limited proteolysis event (Mihalyi,1978). Moreover, if isolation of the nickedprotein resulting from the initial proteolysisis desired, both the time and temperature ofreaction should be properly controlled, sincethe nicked species may be present only tran-siently in the proteolysis mixture. Indeed, anicked protein is usually much more flexibleand unstable than the native one and easilyunfolds to a protein substrate that is finallydegraded to small peptides (see Fig. 1).

IDENTIFICATION OF FLEXIBLESITES OF A POLYPEPTIDE CHAIN

Limited proteolysis does not occur at thenumerous sites scattered across the proteinsurface, but is restricted to (very) few spe-cific locations. These sites have shown agood correlation with larger crystallographicB-factors, uncertain electron density andlarger dispersion values of backbone angles(Fontana et al., 1993; 1999). Hence, limitedproteolysis occurs preferentially at those

302 A. Fontana and others 2004

loops which display inherent conformationalflexibility, whereas the protein core remainsquite rigid and thus resistant to proteolysis(Fontana, 1989; Fontana et al., 1986; 1989;1993; 1997a; 1997b; Polverino de Laureto etal., 1995). Usually it is a region, rather thana specific site, the target of limited proteoly-sis, as given by the fact that, if several pro-teases are used, one observes that cleavagetakes place over a stretch of peptide bonds(Fontana et al., 1997a; 1997b). A survey ofthe cleavage sites of a variety of proteins ofknown 3D structure revealed that they neveroccur at the level of �-helices, but largely atloops (Fontana et al., 1986; 1993; 1999).Therefore, limited proteolysis experimentscan be used to identify the sites of enhancedflexibility or of local unfolding of a poly-peptide chain (Fontana et al., 1993; 1997a;Hubbard, 1998).Limited proteolysis experiments were used

to probe the structural and dynamic differ-ences between the holo and apo form of horsemyoglobin (Mb) (Fontana et al., 1997a). Ini-tial nicking of the polypeptide chain of apoMb(153 amino-acid residues, no disulfide bonds)by several proteases occurs at the level ofchain segment 89–96. In contrast, holoMb isresistant to proteolytic digestion when re-acted under identical experimental condi-tions. More recently, the conformational fea-tures of native and mutant forms of sperm-whale apoMb at neutral pH were probed bylimited proteolysis experiments utilizing upto eight proteases of different substratespecifities (Picotti et al., 2004). It was shownthat all proteases selectively cleave apoMb atthe level of chain segment 82–94 encompass-ing helix F in the X-ray structure of the holoform of the native protein; for example,thermolysin cleaves the Pro88–Leu89 pep-tide bond (Fig. 2). These results indicate thathelix F is highly flexible or largely disruptedin apoMb, in full agreement with NMR(Eliezer & Wright, 1996; Lecomte et al., 1996;1999; Eliezer et al., 1998) and molecular dy-namics simulations (Brooks, 1992; Tirado-

Rives & Jorgensen, 1993; Hirst & Brooks,1995; Onufriev et al., 2003).Since helix F contains the helix-breaking

Pro88 residue, it was conceivable to suggestthat helix F is kept in place in the native holoprotein by a variety of helix-heme stabilizinginteractions, including the coordination ofthe heme iron by proximal His93. In order tomodulate the stability of helix F, thePro88Ala and Pro88Gly mutants of sperm-whale apoMb were prepared by site-directedmutagenesis and their conformational prop-erties investigated by both far-UV CD spec-troscopy and limited proteolysis (Picotti et al.,2004). The helix content of the Pro88A mu-tant was somewhat enhanced with respect tothat of both native and Pro88Gly mutant, asexpected from the fact that the Ala residue isthe strongest helix-inducer among the 20amino-acid residues. The rate of limited pro-teolysis of the three apoMb variants bythermolysin and proteinase K was in the or-der native > Pro88Gly >> Pro88Ala, in agree-ment with the scale of helix propensity of Ala,Gly and Pro.The clear-cut results of the proteolysis ex-

periments conducted on apoMb (Fontana etal., 1997; Picotti et al., 2004) emphasize theutility of proteolytic probes of protein struc-ture and dynamics. By using the simple bio-chemical approach of limited proteolysis it ispossible to detect the unfolding of helix F, inagreement with the results obtained in ana-lyzing the molecular features of apoMb atneutral pH by using NMR and computationalapproaches. Clearly, it is the local mobil-ity/unfolding of the chain of the apoMb sub-strate that dictates the limited proteolysisphenomenon (Fontana et al., 1986). Indeed,when the mobile chain region encompassinghelix F in apoMb is induced to adopt a quiterigid and hydrogen-bonded structure, as thatresulting from a Pro�Ala replacement, thesite of limited proteolysis was rather well pro-tected against the proteolytic attack (Picotti etal., 2004).

Vol. 51 Probing protein structure by limited proteolysis 303

PROBING PARTLY FOLDED STATESOF PROTEINS

Interpreting function of proteins in terms oftheir three-dimensional (3D) structure isdominating protein science since many de-cades. Nevertheless, in recent years the pro-tein structure–function paradigm has beenchallenged by the observation that proteinscan exist in states different from the native(N) and fully unfolded (D) state (Dunker &Obradovic, 2001; Dunker et al., 2002). Indeed,there are a number of experimental observa-tions demonstrating that proteins can adopt

intermediate states that play a role in thefunctioning of proteins at the cellular level,such as ligand binding and protein trans-location (Bychkova & Ptitsyn, 1995; Ptitsyn,1995). Folding intermediates, usually namedmolten globule (MG) states, can be generatedat equilibrium by exposing the protein to mildacid solutions, in the presence of moderateconcentrations of protein denaturants, by re-moving protein-bound ligands (metal ions orprosthetic groups), as well as by chain trunca-tion or amino acid replacements by geneticmethods (Ptitsyn, 1995; Arai & Kuwajima,2000). In some cases, the molecular features

304 A. Fontana and others 2004

Figure 2. Limited proteolysis of apomyoglobin (apoMb) at neutral pH.

(Top) Schematic secondary structure of sperm-whale myoglobin. The height helices A to H are shown in coloredboxes. The amino-acid sequence of the chain segment encompassing helix F is magnified and the sites of initialproteolytic cleavage by proteinase K (K), thermolysin (Th), subtilisin (Su), chymotrypsin (Ch), trypsin (T), V8-pro-tease (V8), papain (P) and elastase (E) are indicated by arrows. (Bottom) Schematic 3D structure of sperm-whaleapoMb. The location of the helical segments (A to H) of the 153-residue chain of holo myoglobin are shown. Thechain segment encompassing helix F is shown to be disordered in apoMb (see text). The model was constructedfrom the X-ray structure (file 1YMB taken from the Brookhaven Protein Data Bank) using the program WebLab(Molecular Simulations Inc., San Diego, CA, U.S.A.).

of MGs have been found to resemble thosethat form transiently in kinetic experimentsof protein folding (Chamberlain & Marqusee,2000) and thus it was proposed that the MGstate is an intermediate in protein folding(Ptitsyn, 1987; 1995; Ptitsyn et al., 1990). TheMG was defined as a dynamic compact stateof a polypeptide chain, characterized by ahigh degree of native-like secondary struc-ture, but lacking the fixed tertiary contacts ofthe native state (Ohgushi & Wada, 1983;Ptitsyn, 1987). However, the results of a vari-ety of studies have clearly indicated that aplethora of MGs exists, ranging from thosemuch resembling the unfolded state of a pro-tein to those possessing substantial na-tive-like properties, including some specifictertiary interactions (Arai & Kuwajima,2000). Moreover, it has been reported that nu-merous proteins contain largely disorderedchain regions of 40 or more amino-acid resi-dues (Romero et al., 2001) and that some pro-teins appear to be intrinsically or “natively”unfolded or only partly folded under theirnormal conditions in the cell (Wright &Dyson, 1999; Uversky et al., 2000; Uversky,2002). Therefore, it has become increasinglyclear that proteins are dynamic systems thatcan adopt a variety of partly folded states orMGs and, in particular, that the function ofproteins cannot be interpreted solely on thebasis of their static 3D structures (Wright &Dyson, 1999; Dunker & Obradovic, 2002).It is clear, therefore, that the structural

analysis of protein intermediates or MGs isrelevant for a number of biophysical and bio-logical aspects of proteins and, in particular,for re-assessing the protein structure–func-tion paradigm (Wright & Dyson, 1999). How-ever, the analysis of the molecular features ofMGs is not at all an easy task, since thesestates usually are flexible and heterogeneous,not amenable to structural elucidation byX-ray crystallography and quite difficult toanalyse by NMR spectroscopy (Evans &Radford, 1994). Recent developments in hy-drogen/deuterium (HD) exchange, combined

with two-dimensional NMR spectroscopy,have provided useful experimental tools foranalyzing folding intermediates even at thelevel of atomic resolution (Eliezer et al.,1998). The advances in NMR measurementsinclude the use of multi-dimensional het-ero-nuclear NMR techniques utilizing 13C- or15N-labeled proteins. Also the solution X-rayscattering technique has been used to obtainaccurate data on the size, shape and, in somecases, even tertiary fold of compact, non-na-tive states of proteins (Arai & Kuwajima,2000). However, it is clear that the structuralcomplexity of MGs requires the use of newand complementary techniques and ap-proaches, since no one technique is fully supe-rior to all others (Evans & Radford, 1994;Dobson, 1994; Fink, 1995; Fontana et al.,1997a).The conformational state of �-lactalbumin

(LA) exposed to acid pH (A-state) is nowadaysregarded as a prototype MG (Ptitsyn, 1987;1995; Kuwajima, 1989; 1996; Permyakov &Berliner, 2000; Permyakov et al., 2003). TheA-state of LA has been investigated in greatdetail using a variety of experimental ap-proaches and techniques (Kuwajima, 1996;Arai & Kuwajima, 2000). These studies wereconducted on bovine, human, goat andguinea-pig LA, but the conformational fea-tures of these homologous proteins are verysimilar (Pike et al., 1996) and thus results ob-tained with LA from different sources andtheir interpretations likely can be used inter-changeably. NMR and HD-exchange measure-ments revealed that in acid the 123-residuechain of LA adopts a partly folded state char-acterized by a disordered �-domain, while the�-domain maintains substantial, albeit dy-namic, helical secondary structure (Wu et al.,1995). The �-domain is a discontinuous do-main given by the N-terminal segment 1–37and the C-terminal segment 85–123 and com-prises all helical segments of the protein,while the �-domain is given by the remainderof the protein chain encompassing the�-strands and a coil region (Pike et al., 1996).

Vol. 51 Probing protein structure by limited proteolysis 305

Alternatively, the term �-subdomain was usedto indicate the chain segment from residue 34to 57 encompassing the three �-strands of theprotein (Polverino de Laureto et al., 1995;2001; 2002a; 2002b).Polverino de Laureto et al. (1995) were first

in using the limited proteolysis approach forunravelling molecular features of MG statesof proteins. Proteolysis of LA in its A-state atlow pH by pepsin results in the initial cleav-age at peptide bonds 52–53 and 40–41 lo-cated in the �-subdomain of the protein, thusimplying that this region is flexible or un-folded. Subsequently, pepsin cleaves also atpeptide bond 103–104 and, therefore, frag-ments 53–103 and 1–40/104–123 accumu-late in the proteolysis mixture. Overall, it wasconcluded that the proteolytic probe detects,in the A-state of LA, the flexibility or unfold-ing of the �-subdomain, in full agreementwith the results of other physicochemicalstudies. Indeed, the proposal was advancedthat the MG of LA has a “bipartite structure”given by a structured �-domain and a disor-dered �-domain (Peng & Kim, 1994; Wu et al.,1995; Schulman et al., 1995; 1997).The conformational features of the cal-

cium-depleted form of LA are not yet definedin such a detail as those of the A-state. Actu-ally, there are discrepancies in the reportedexperimental results and conflicting propos-als regarding the conformational state ofapo-LA, ranging from a classical MG devoidof a cooperative thermal transition (Yutani etal., 1992) to a partly folded state with somenative-like properties and displaying insteadcooperativity (see Kuwajima, 1996, for a dis-cussion and references). The thermal unfold-ing transition of apo-LA to a MG state wasmonitored by differential calorimetry, sec-ond-derivative UV spectroscopy, fluorescenceemission, Raman spectroscopy and near-UVCD. Depending upon the technique and thesolvent conditions (pH, ionic strength), differ-ent figures for the melting temperature (Tm)of apo-LA have been reported, from about10°C to 42°C (Griko et al., 1994; Griko &

Remeta, 1999; Permyakov & Berliner, 2000).Often it is assumed that apo-LA, as obtainedfor example by dissolving the protein at 20°Cin Tris buffer, pH 8.0, containing a calciumchelating agent, adopts a MG state (Kataokaet al., 1997), but this may be not be true with-out specifying the ionic strength and the tem-perature of the protein solution. It is clear,therefore, that the conformational stateadopted by apo-LA is strongly influenced bythe specific solvent conditions (Griko &Remeta, 1994). Recently, both X-ray crystal-lography (Chrysina et al., 2000) and NMRspectroscopy (Wijesinha-Bettoni et al., 2001)have been used to analyze the structure ofapo-LA in the presence of salt (e.g., 0.5 MNaCl) at neutral pH. These studies revealedthat the conformational features of theapo-form of LA are much similar to those ofthe calcium-loaded protein and that the struc-tural differences between the apo and holoform are mainly confined at the level of thecalcium-binding sites.Limited proteolysis of bovine apo-LA at neu-

tral pH under moderate heating occurs at thesame �-subdomain region as in the A-state ofLA at low pH (Polverino de Laureto et al.,2002b). All sites of limited proteolytic cleav-age by the voracious and non-specificproteinase K (Lebherz et al., 1986) occur atthe level of the �-subdomain encompassingthe three �-strands S1, S2 and S3 of nativeLA (Fig. 3). Thus, proteolysis data indicatethat the heat-mediated MG of apo-LA at neu-tral pH shares similar overall conformationaland dynamical features to those of the A-stateof the protein and, in particular, that the�-subdomain is disordered. Visual inspectionof the 3D-model of LA (see Fig. 3) reveals thatthe �-subdomain is rather well separatedfrom the protein core and thus it can be con-sidered a structural domain or subdomain(Wetlaufer, 1973; 1981). Removal of the�-subdomain from the MG of apo-LA by prote-olysis leads to the gapped species des�-LA,given by fragments 1–34 and 54–123 or57–123 covalently linked by the disulfide

306 A. Fontana and others 2004

bridges of the protein. This gapped des�-LAspecies is sufficiently stable and rigid to resistfurther proteolysis, so that, during proteoly-sis, accumulates in the reaction mixture asthe major protein species. Conformationaland stability features of des�-LA have beenreported (Polverino de Laureto et al., 2001).It has been found that LA interacts with a

variety of hydrophobic compounds, includingfatty acids (Cawthern et al., 1997; Permyakovet al., 2003). A most intriguing observation

was reported by H�kansson et al. (1995; 1999;2000) and Svensson et al. (1999; 2000). Theyfound a form of LA isolated from the caseinfraction of milk that can induce apoptosis intumor cells, but not in healthy cells. The LAvariant inducing apoptosis, yet not thor-oughly characterized, appears to require thebinding of oleic acid (18:1) as cofactor and topossess spectroscopic properties related tothose of the MG state of LA in acid (Svenssonet al., 2000). The authors gave to this LA vari-

Vol. 51 Probing protein structure by limited proteolysis 307

Figure 3. Limited proteolysis of �-lactalbumin (LA) in its partly folded (molten globule) state.

(Top) Schematic representation of the crystal structure of bovine LA generated from the coordinates deposited inBrookhaven Protein Data Bank (1hfz) using the program Insight II version 97.0 (Molecular Simulations, SanDiego, CA, U.S.A.). The chain segment from residue 35 to 56 encompassing the three-stranded antiparallel�-pleated sheets is indicated in red and the chain segments 1–34 and 57–123 in blue. Calcium is represented by asolid sphere in green. The connectivities of the four disulfides (represented by sticks) along the 123-residue chainof LA are 6–120, 28–111, 61–77 and 73–91. (Bottom) Scheme of the secondary structure of the 123-residue chainof bovine LA (Pike et al., 1996). The four �-helices (H1–H4) along the protein chain are indicated by major boxesand below them the corresponding chain segments are given. The three �-strands (S1, 41–44; S2, 47–50; S3,55–56) are indicated by small boxes. The short 310 helices (h1b, 18–20; h2, 77–80; h3c, 115–118) are also shown bysmall boxes. At variance from LA derived from other sources, the chain segment encompassing helix H4 (residues105–110) in the crystal structure of bovine LA exhibits a variety of distinct conformers, including a distorted �-he-lical conformation (Pike et al., 1996). The amino-acid sequence of the chain region (residues 34–57) of apo-LA is ex-plicitly shown. The sites of peptide bond fission of apo-LA in its MG state at neutral pH in the presence of oleic acidby proteinase K are indicated by arrows (see text).

ant the fancy name HAMLET (human�-lactalbumin made lethal to tumor cells)(Svensson et al., 2000) and found it to possessseveral interesting biological properties(Köhler et al., 1999; 2001), including a bacte-ricidal activity (H�kansson et al., 2000). It isnot clear if HAMLET is monomeric or oligo-meric, since Svanborg and co-workers namedthe oleic acid complex of LA also MAL, i.e.,multimeric LA (Köhler et al., 1999;H�kansson et al., 1999; see: also Permyakovet al., 2003).The simple addition of oleic acid (7.5 equiva-

lents) to a solution of bovine apo-LA at neu-tral pH induces the formation of the MG stateof the protein, as demonstrated by far- andnear-UV CD (Polverino de Laureto et al.,2002b). The oleic acid-induced MG behaves to-wards the proteolytic probe proteinase K inthe same way as that of the protein upon mod-erate heating. Previously, the oleic acid/apo-LA complex or HAMLET has been pre-pared by a chromatographic procedure(Svensson et al., 2000). The freshly preparedoleic acid/apo-LA complex, obtained by sim-ple mixing the protein with the fatty acid, ismonomeric and shows spectroscopic proper-ties identical to those of the MG of LA(Polverino de Laureto et al., 2002b). The pro-tein species des�-LA, given by the N-terminalfragment 1–34 linked via disulfide bridges tothe C-terminal fragment 54–123 or 57–123,was isolated from the proteolysis mixture ofthe oleic acid/LA complex (see Fig. 3). There-fore, seemingly the same MG state of apo-LAcan be obtained at neutral pH at 37–45�C orby mixing apo-LA with oleic acid at room tem-perature. This MG of LA maintains a na-tive-like tertiary fold, characterized by astructured �-domain and a disordered�-subdomain 34–57 (Polverino de Laureto etal., 2002b).We have analyzed also the partly folded

states of protein members of the lysozyme/lactalbumin (LYS/LA) superfamily by CDmeasurements and limited proteolysis exper-

iments (Polverino de Laureto et al., 2002a).Hen, horse, dog and pigeon lysozyme (LYS)and bovine LA were used for this study.These are related proteins of 123–129amino-acid residues with similar 3D struc-tures, but notable differences among themreside in their calcium-binding propertiesand capability to adopt partly folded or MGstates in acid solution or upon depletion ofcalcium at neutral pH (apo-state). Far- andnear-UV CD measurements revealed that,while the structures of hen and dog LYS arerather stable in acid at pH 2.0 or at neutralpH in the absence of calcium, conforma-tional transitions to various extents occurwith all other LYS/LA proteins. The mostsignificant perturbation of tertiary structurein acid was observed with bovine LA andLYS from horse milk and pigeon egg-white.While hen LYS at pH 2.0 was fully resistantto proteolysis by pepsin, the other membersof the LYS/LA superfamily were cleaved atdifferent rates at few sites of the polypeptidechain, thus leading to rather large proteinfragments. The apo-form of bovine LA, horseLYS and pigeon LYS were attacked byproteinase K at pH 8.3, while dog and henLYS were resistant to proteolysis when re-acted under identical experimental condi-tions. Briefly, it has been found that the pro-teolysis data correlate well with the extent ofconformational transitions inferred from CDspectra and with existing structural infor-mations regarding the proteins investigatedand mainly derived from NMR and HD-ex-change measurements. The sites of initialproteolytic cleavages in the LYS variants oc-cur at the level of the �-subdomain (approxi-mately chain region 34–57), in analogy tothose observed with bovine LA (Polverino deLaureto et al., 1995; 2001; 2002b). Overall,proteolysis data indicated that the MG of theLYS/LA proteins is characterized by a struc-tured �-domain and a largely disrupted�-subdomain.

308 A. Fontana and others 2004

MONITORING PROTEINAGGREGATION

The problem of protein aggregation was con-sidered in the past just a nuisance in proteinresearch, but recently the analysis of the mo-lecular features of the protein aggregates con-stituting amyloid fibrils has attracted consid-erable attention and research effort by nu-merous investigators. Amyloid fibrils areself-assembled filaments formed by the spon-taneous aggregation of a wide variety of pep-tides and proteins and these aggregates areassociated with severe debilitating diseases,such as Alzheimer’s disease, type 2 diabetes,prion diseases, Parkinson’s disease, senilesystemic amyloidosis and Huntington’s dis-ease (Sacchettini & Kelly, 2002; Dobson,2003; Stefani & Dobson, 2003; Selkoe, 2003).The interest of protein scientists arises fromfundamental questions regarding the mecha-nisms by which amyloid fibrils form frommonomeric or oligomeric species and the na-ture of the interactions that make amyloid fi-brils a stable structural state for polypeptidechains. Peculiar characteristics of amyloid fi-brils include the presence of the cross-� struc-tural motif (Sunde & Blake, 1997) and the un-usual resistance to proteolytic degradation.Information about the molecular structuresand stabilizing interactions of amyloid fibrils,as well as of the mechanisms of fibril forma-tion, is likely to be useful for the developmentof therapeutic strategies and drugs for amy-loid diseases (Cohen & Kelly, 2003).The results of a variety of recent studies

have indicated that amyloid fibril formationfrom native proteins occurs via a confor-mational change leading to the formation ofpartly folded intermediates and subsequentassociation of these protein species to formpre-fibrillar species given by soluble oligo-mers that susequently associate into well-or-dered mature fibrils (Dobson, 2003). Forma-tion of protein intermediates appear to becritical for the onset of fibril formation, sincethese species are capable of strong inter-

molecular interactions due to the exposure ofhydrophobic patches otherwise buried in theoverall fold of the native protein (Fink, 1998).This view led to the proposal that likely allpolypeptide chains can form amyloid aggre-gates under proper experimental conditionsthat promote formation of a partly denaturedstate of the protein chain (Dobson, 1999;2003).However, it should be emphasized that a

large proportion of physiologically relevantamyloid deposits in the tissue are given byprotein fragments deriving from relativelylarger proteins (Sacchettini & Kelly, 2002;Dobson, 2003). For example, the Alzheimer’sfibrils derive from the fragment(s) producedby limited proteolysis of the amyloid precur-sor protein (APP). Other examples of aggre-gating fragments include those of serum amy-loid A, gelsolin, apolipoprotein A1, prolactin,amylin, calcitonin, �2-microglobulin, trans-thyretin, medin, fibrinogen and others. Pro-tein fragments derived by limited proteolysisof proteins are particularly unstable, becauseusually they can only adopt partly foldedstates and cannot establish the long-range in-teractions present in the intact native pro-tein. In particular, a peculiar property of frag-ments is that they can display hydrophobicpatches that can cause protein aggregation(Fink, 1998). Therefore, it seems that proteinfragmentation by limited proteolysis can be acausative mechanism of several amyloiddiseases.Considering our research interests in protein

folding intermediates, protein fragments andmechanisms of proteolysis, we made an effortto contribute to the problem of proteinfibrillogenesis. First of all, we attempted to fol-low the aggregation phenomenon from pro-tofibrils to mature fibrils of a SH3 domain byusing far-UV CD spectroscopy, electron mi-croscopy and limited proteolysis experiments(Polverino de Laureto et al., 2003). The SH3domains are small protein modules of 60–85amino-acid residues that are found in manyproteins involved in intracellular signal

Vol. 51 Probing protein structure by limited proteolysis 309

transduction. The SH3 domain of the p85�subunit of bovine phosphatidyl-inositol3�-kinase (PI3-SH3) under acidic conditionsreadily forms amyloid fibrils. The process ofPI3-SH3 aggregation at low pH was monitoredby using pepsin as proteolytic probe. Remark-ably, the protein aggregates that are formedinitially display enhanced susceptibility to pro-teolysis, suggesting that the protein becomesmore unfolded and/or flexible in the earlystages of aggregation. By contrast, the moredefined amyloid fibrils that are formed overlonger periods of time are completely resistantto proteolysis. This observation appears to in-dicate that a rather unfolded/flexible state ofthe protein is that required to transformprotofibrils into the final well-ordered fibrillarstructures. Moreover, considering that the ini-tial pre-fibrillar aggregates are substantiallymore cytotoxic than mature fibrils (Buccian-tini et al., 2002), it was proposed that the moreunfolded early aggregates could expose spe-cific regions of the protein to the external envi-ronment, causing inappropriate binding tomembranes and other cellular componentsand leading to impairment of cellular viabilityand ultimately cell death.Hen egg-white lysozyme (LYS) was found to

form amyloid fibrils when incubated in vitroat pH 2.0 for several days (Krebs et al., 2000).We have analyzed the amyloid fibril forma-tion by hen LYS, since this well-characterizedprotein appears to be an excellent experimen-tal model to study the determinants of pro-tein aggregation. Amyloid fibrils from LYS,obtained under rather harsh solution condi-tions by incubating the protein for severaldays at pH 2.0 and 65°C, are mainly com-posed of protein fragments, primarily encom-passed by chain region 49–101 of the 129-res-idue chain of the protein (Frare et al., 2004).In order to gain further insights into the ag-gregation properties of LYS, the propensitiesof different fragments of the protein to aggre-gate were examined. Several fragments havebeen prepared by limited proteolysis of henLYS by pepsin at pH 0.9 and in the presence

of 2 M guanidine hydrochloride. Under thesesolvent conditions the protein adopts a partlyfolded state (Sasahara et al., 2000). Frag-ments 57–107 and 1–38/108–129 were abun-dant species in the proteolytic mixture, thelast fragment being a two chain species con-stituted by the N-terminal fragment 1–38 andthe C-terminal fragment 108–129 covalentlylinked by the two disulfide bridgesCys6–Cys127 and Cys30–Cys115. The pro-pensity of these fragments to aggregate in so-lution was studied in detail by CD, thioflavineT binding and electron microscopy. It wasshown that fragment 57–107, but not frag-ment 1–38/108–129, is able to generatewell-structured amyloid fibrils when incu-bated at pH 2.0, 37°C for 2–6 days, i.e., underthe same solvent conditions that causefibrillar assembly of the full-length LYS.These findings indicate that the polypeptidechain encompassing the �-domain and C-helixof LYS is a highly amyloidogenic region of theprotein. Of interest, this chain region wasshown previously to locally unfold in theamyloidogenic variant D67H of humanlysozyme causing systemic amyloidosis andthus populating a partly folded protein spe-cies that initiates aggregation events (Canetet al., 2002).

IDENTIFICATION AND PREPARATIONOF PROTEIN DOMAINS

Relatively large globular proteins are assem-blies of compactly folded substructures, usu-ally called domains or modules. Protein do-mains are almost invariably seen with pro-teins made up of more than about 100 amino-acid residues and often their existence can berecognized simply by visual inspection of the3D model of a protein molecule (Wetlaufer,1973; 1981). Many large proteins, such asthose involved in cell adhesion, clotting,fibrinolysis and signalling, are composed of aseries of functionally, distinct, autonomouslyfolding protein domains. Often, identification

310 A. Fontana and others 2004

of domains or modules in a large protein canbe reached simply from sequence analysis ofthe databases and, in recent years, there hasbeen a rapid advance in identifying familiesof multi-module proteins, such as the immu-noglobulins, fibronectins, kringles, SK2 andSK3 domains, EGF-containing proteins andothers (Campbell & Downing, 1994).Wetlaufer (1973) proposed that protein do-

mains could represent intermediates in thefolding process of globular proteins. It is con-ceivable to suggest that, in a multidomainprotein, specific segments of the unfoldedpolypeptide chain first refold to individual do-mains and that they subsequently associateand interact with each other to give the finaltertiary structure of the protein, much thesame as do subunits in oligomeric proteins.The major implication of this hierarchicalmodel of protein folding by a mechanism ofmodular assembly is that isolated proteinfragments corresponding to domains in theintact protein are expected to fold into a na-tive-like structure, thus resembling in theirproperties a small globular protein. The con-formation of the individual domains may bedifferent from that attained in the final nativeprotein, since in the assembled domains somenovel mutually stabilizing interactions maybe operative. Therefore, a strategy to over-come the complexity of the protein foldingproblem is to cleave a multi-domain proteininto fragments that correspond to domains inthe native protein and to study theirconformational features in isolation. Thisprotein dissection strategy has been usedwith several protein systems and in a numberof studies it has been demonstrated that someprotein fragments can behave as autonomousfolding units (AFU) (Wetlaufer, 1981; Wu etal., 1994; Peng & Wu, 2000). However, theproblem remains how to choose and preparesuitable protein fragments to be studied fortheir folding properties.The current rapid growth in the number of

even large protein structures solved by X-raymethods, together with the fact that domains

are the constituent building blocks of theseproteins, has prompted both theoretical andexperimental research on protein domains(Peng & Wu, 2000). Several computer algo-rithms were developed for the identificationof protein domains utilizing the C�-coordi-nates of protein structures derived fromX-ray analyses (Rose, 1979; Janin & Wodak,1983, Zehfus, 1987; Siddiqui & Barton, 1995).These algorithms, based on principles such asinterface area minimization, plane cutting,clustering, distance mapping, specific volumeminimization and compactness, allow a de-scription of globular proteins in terms of a hi-erarchic architecture given by elements ofsecondary structure (helices, strands), sub-domains (supersecondary structures or fold-ing units), domains and whole protein mole-cule. Therefore, the concept of protein do-mains appears to be a convenient way both tosimplify the description and classification ofprotein structures and to study protein fold-ing. In current literature there is no strict,universally accepted definition of a proteindomain, but a consensus view of a domain in-volves a compact, local and independent unitrelatively well separated from the rest of theprotein molecule.Limited proteolysis appears to be the best

experimental technique for splicing out afragment that can fold autonomously. Thesuccess of the technique resides in the factthat limited (specific) proteolysis of a globularprotein occurs at the “hinge” regions or con-necting segments between domains. Thesehinge regions between domains are usuallymore flexible than the rest of the polypeptidechain forming the globular units of the do-mains. Thus, a peptide bond fission at theflexible or unfolded chain region likely doesnot hamper overall structure and stability ofthe individual protein domains. Indeed,Neurath (1980) emphasized that proteolysisis expected to occur at the flexible hinges be-tween protein domains. Over the years, lim-ited proteolysis was found a most suitableprocedure to produce from large proteins in-

Vol. 51 Probing protein structure by limited proteolysis 311

dividual, autonomously folding domains forfurther structural and functional character-ization (Wetlaufer, 1981; Neurath, 1980;Peng & Wu, 2000). The most interesting ap-plications of the limited proteolysis approachfor producing protein domains were withmultifunctional proteins, allowing the isola-tion of fragments capable of independentfolding and displaying some of the activitiesof the parent protein, thus helping to clarifyboth structure and mechanism of biologicalactivity of complex proteins. For example,proteolytic cleavage of DNA polymerase leadsto a large fragment which catalyzes 3�–5�

exonuclease action on both single strandedand unpaired regions of double-strandedDNA, and to a small fragment which retainsonly 5�–3� exonuclease activity (Klenow &Hennigsen, 1970). Other examples are theseparation and isolation of kringles fromprothrombin, plasminogen, urokinase andplasminogen activator (Patthy et al., 1984)and the isolation of the �-carboxyl-glutamicacid (Gla) domains from coagulation factors(Esmon et al., 1983). Limited proteolysis hasbeen used also to define the boundaries of adomain by removing its flexible and unstruc-tured parts (Dalzoppo et al., 1985; Darby etal., 1996). Finally, the chain flexibility notionexplains the fact that, in certain proteins, therather loose parts can be removed by limitedproteolysis, allowing the isolation of the“core” of the protein as the most proteoly-sis-resistant moiety and thus as a stable pro-tein entity (Vindigni et al., 1994).In our laboratory we have conducted sys-

tematic conformational studies on proteinfragments derived from thermolysin, a two-domain metallo-protease (Vita & Fontana,1982; Fontana et al., 1983; Vita et al., 1984).It was found that several C-terminal frag-ments of thermolysin are capable of foldinginto a native-like structure independentlyfrom the rest of the polypeptide chain, thuspossessing protein domain properties. Thesefragments of thermolysin were initially pre-pared by cyanogen bromide cleavage of the

protein at the level of methionine residues inposition 120 and 205 of the 316-residue chainof the protein (Vita et al., 1982; 1984). Ofcourse the dimensions of the fragments, forexample those of the most studied fragment206–316, were dictated by the location of theMet residue in the protein chain. With theaim to define the minimum size of athermolysin C-terminal fragment capable toacquire a stable native-like conformation,Dalzoppo et al. (1985) digested the thermo-lysin fragment 206–316 by means of severalproteases. From the kinetics of proteolysis di-gestion and analysis of the isolatedsubfragments, it was found that the rathershort fragment 255–316 was quite resistantto further proteolysis, implying a tightlyfolded conformation. Indeed, both CD spec-troscopy (Dalzoppo et al., 1985) and, in partic-ular, NMR measurements (Rico et al., 1994)provided a clear-cut evidence of a stable, na-tive-like structure of this small 62-residuefragment.

PREPARATION OF COMPLEMENTING

PROTEIN FRAGMENTS

The utility of peptide fragments for analyz-ing features of protein structure and foldinghas been recognized long time ago, as givenby the pioneering studies on fragments ofribonuclease A and staphylococcal nuclease(Anfinsen & Scheraga, 1975). A specific aimof these early studies was to develop suitableexperimental conditions to produce ratherlong protein fragments by limited proteolysisof the protein and to reconstitute a folded,functionally active “nicked” protein, i.e., thenoncovalent complex of (usually two) proteinfragments (Taniuchi et al., 1986). Over theyears a number of nicked proteins, or comple-menting fragment systems, has been de-scribed and the results of these studies pro-vided information about principles underly-ing protein structure, folding and dynamics(Fischer & Taniuchi, 1992; dePrat-Gay, 1996).

312 A. Fontana and others 2004

In general, it was proposed that fragment as-sociation proceeds through the formation inthe individual fragments of quite unstable“native formats” or folding subdomains that,upon association, acquire a more rigid andstable 3D structure (Anfinsen & Scheraga,1975; Tsai et al., 1998).It has been found that functionally active,

nicked proteins can be formed by a combina-tion of fragments in such a way that the dis-continuity of the polypeptide chain occurs atexposed and flexible sites (usually loops) ofthe folded protein, outside regions of regularsecondary structure (Taniuchi et al., 1986).For this reason, limited proteolysis of globu-lar proteins proved to be the most suitabletechnique to produce nicked proteins result-ing from usually two protein fragments capa-ble of autonomous folding and self associa-tion. This success derives from the fact thatlimited proteolysis occurs at flexible sitesalong the protein chain and that stable, auton-omous folded fragments are more resistant tofurther proteolytic degradation than rela-tively unstructured fragments (Fontana et al.,1997a; 1999). Recently, we have applied thelimited proteolysis approach to produce setsof complementing fragments from humangrowth hormone (hGH) (Spolaore et al.,2004), horse apoMb (Musi et al., 2004) andhorse cytochrome c (CYT) (Spolaore et al.,2001). With these proteins experimental con-ditions have been devised to cleave them at asingle peptide bond, thus producing for eachprotein a set of two fragments covering theentire protein chain. Besides the identifica-tion of the flexible sites along the proteinchain in the intact proteins, our studies led tothe discovery of three novel systems of com-plementing fragments. Each of these two-fragment systems (1–88/89–153 of apoMb,1–44/45–191 of hGH and 1–56/57–104 ofCYT) produce a stable protein complex(nicked protein) possessing a native-like 3Dstructure. Conformational studies conductedby CD spectroscopy on the isolated fragmentsrevealed that they adopt partially folded

states and that only upon complementationthey acquire the specific long-range interac-tions that are essential for attaining the struc-ture of the intact protein. Overall, our resultsindicate that protein fragment complemen-tation is a valuable tool to study the funda-mental steps of the folding process, thus sim-plifying the difficult problem of protein fold-ing (Tsai et al., 1998). Here, we summarizethe results of our studies conducted on nickedhGH, apoMb and CYT.Limited proteolysis of hGH by pepsin at low

pH 4.0 occurs at the level of the Phe44–Leu45peptide bond, leading to the production offragments 1–44 and 45–191 (Spolaore et al.,2004) (see Fig. 4). Thus, proteolysis data indi-cate that in acid solution hGH adopts a partlyfolded state characterized by a local unfoldingof the first mini-helix (residues 38–47) en-compassing the Phe44–Leu45 peptide bond.Fragment 1–44 was shown to retain little sec-ondary and tertiary structure at neutral pH,while fragment 45–191 independently foldsinto a highly helical secondary structure. Thetwo peptidic fragments are able to associateinto a stable and native-like hGH complex1–44/45–191. Of interest, hGH has both in-sulin-like and diabetogenic effects and twofragments of hGH occur in vivo and exertthese two opposite activities, namely frag-ment 1–43 showing an insulin-potentiatingeffect and fragment 44–191 a diabetogenicactivity. It was suggested that the confor-mational changes of hGH induced by anacidic pH promote the generation of the twophysiologically relevant fragments by pro-teolytic processing of the hormone. Likely,limited proteolysis of hGH at low pH is physi-ologically relevant, since the hormone is ex-posed to an acidic environment in the cell.The study of Spolaore et al. (2004) reports forthe first time the analysis of the confor-mational features of the two individual func-tional domains of hGH and of their complex.Proteolysis of the 153-residue chain of horse

apoMb by thermolysin results in the selectivecleavage of the peptide bond Pro88–Leu89

Vol. 51 Probing protein structure by limited proteolysis 313

(see Fig. 2). The N-terminal (residues 1–88)and C-terminal (residues 89–153) fragmentsof apoMb were isolated to homogeneity andtheir conformational and association proper-ties investigated in detail. Far-UV CD mea-surements revealed that both fragments inisolation acquire a high content of helical sec-ondary structure, while near-UV CD indicatedthe absence of tertiary structure. A 1:1 mix-ture of the fragments leads to a tight non-covalent protein complex (1–88/89–153,nicked apoMb), characterized by secondaryand tertiary structures similar to those of in-tact apoMb. The apoMb complex binds hemein a native-like manner, as given by CD mea-

surements in the Soret region. Moreover, inanalogy to intact apoMb, the nicked proteinbinds the hydrophobic dye 1-anilino-naphthalene-8-sulfonate. It was concludedthat the two proteolytic fragments 1–88 and89–153 of apoMb adopt partly folded statescharacterized by sufficiently native-likeconformational features that promote theirspecific association and mutual stabilizationinto a nicked protein species much resem-bling in its structural features intact apoMb.It was suggested that the formation of anoncovalent complex upon fragment com-plementation can mimic the protein foldingprocess of the entire protein chain, with the

314 A. Fontana and others 2004

Figure 4. Complementing fragments of human growth hormone (hGH).

(Top) Schematic 3D structure of hGH. The four major helices and the three minor helices are shown as ribbons, theremaining residues are represented as a string and the two disulfide bridges (Cys53–Cys165 and Cys182–Cys189)are indicated by grey sticks. The site of the Phe44–Leu45 peptide bond cleavage by pepsin at pH 4.0 is indicated byan arrow. Segments of the 3D structure of hGH corresponding to fragment 1–44 and fragment 45–191 are shownin red and yellow, respectively. The model was constructed from the X-ray structure of hGH (PDB file 3HHR) usingthe program WebLab Viewer Pro 4.0 (Molecular Simulations Inc., San Diego, CA, U.S.A.). (Bottom) Scheme of thesecondary structure of hGH. The main boxes indicate the helical segments of the four-helix bundle in hGH, smallerboxes indicate the three short helical segments, whereas disulfide bonds are represented by a solid line. Fragment1–44 (red) and fragment 45–191 (yellow) are colored as in the 3D model of hGH (Top).

difference that the folding of the complemen-tary fragments 1–88 and 89–153 is an inter-molecular process (Tsai et al., 1998). Consid-ering that apoMb has been extensively usedas a paradigm in protein folding studies sincefew decades, the novel fragment complement-ing system of apoMb appears to be very usefulfor investigating the initial as well as lateevents in protein folding.Limited proteolysis experiments have been

used by Spolaore et al. (2001) to monitor thefolding of a polypeptide chain from a ratherunstructured state to a folded, helical state.The N- and C-terminal fragment 1–56 and57–104, respectively, of horse CYT wereused. It was shown that the folding of thepolypeptide chain, as given by far-UV CDmeasurements, resulting from the associa-tion of the two fragments into a folded na-tive-like complex, can be monitored by usingproteinase K as proteolytic probe. It has beendemonstrated that the simple biochemicaltechnique of limited proteolysis can provideuseful protein structural data, complement-ing those obtained by the commonly used CDtechnique. A specific interest of the comple-menting fragment system 1–56/57–104 ofCYT resides in the fact that the individualfragments correspond exactly to the two exonproducts of the CYT gene.

CONCLUDING REMARKS

The sites of limited proteolysis (nicksites) inglobular proteins of known 3D structure arecharacterized by enhanced chain flexibility orsegmental mobility (Fontana et al., 1986;1993; 1999). Often, the sites of specific fissionare located at regions even devoid of struc-ture, i.e., at disordered protein regions forwhich no recognizable signal appears in theelectron density maps. This is in keeping withthe general notion that the protein substrateshould suffer considerable structural changein order to properly bind at the precisestereochemistry of the protease’s active site

in order to form the idealized transition stateof the hydrolytic reaction (induced fit mecha-nism; Herschlag, 1988). Proteolytic enzymes,therefore, can be used to pinpoint, in globularproteins, sites (loops or turns) or regionscharacterized by local unfolding.A specific advantage of using proteolytic

probes for probing protein structure and dy-namics is that they can provide data on the so-lution structure of a protein, even if thesedata do not reach the high resolution levelgiven by other physicochemical techniques(NMR, X-ray). The limited proteolysis tech-nique is simple to use and modest in demandsfor protein sample requirements, instrumen-tation and experimental efforts. In the past,the technique was somewhat difficult to ap-ply, since the analytical methods required toisolate and characterize protein fragmentswere labor intensive and not sensitiveenough. The present availability of automatic,efficient and highly sensitive techniques ofprotein sequencing and, in particular, the re-cent dramatic advances in mass spectrometryin analyzing peptides and proteins, likely willprompt a much systematic use of the limitedproteolysis approach as a simple first step inthe elucidation of molecular features of anovel and rare protein, especially if availablein minute amounts.We wish to emphasize again that quite often

in past and current literature limited proteol-ysis events are wrongly interpreted in termsof “exposure” of the site(s) of cleavage(Novotny & Bruccoleri, 1987). Of course, thenotion of accessibility is a required propertyof the sites of cleavage in order that the bimo-lecular reaction between the protease and theprotein substrate can take place, but not at allsufficient to explain the selective proteolysisof one single peptide bond among hundred(s)bonds, as often observed in limited proteoly-sis experiments. There are plenty of exposedpeptide bonds in a globular protein, but theone that is cleaved should be embedded in ahighly flexible or unstructured chain region(Fontana et al., 1986; 1993; 1997a; 1997b;

Vol. 51 Probing protein structure by limited proteolysis 315

1999). Therefore, aiming to probe the “sur-face topography” of a protein by the limitedproteolysis approach is simply unfounded. In-stead, the approach is eminently suitable topinpoint in a globular protein the sites ofchain flexibility or local unfolding. The corre-lation between the sites of limited proteolysisand the mobile sites detected by X-ray orNMR methods (Fontana et al., 1986; 1997a;1997b), as well as by molecular dynamics sim-ulations (Stella et al., 1999; Falconi et al.,2002), has been amply documented. In this re-spect, we may mention that, in a recent study,disorderd chain regions in protein structureswere identified by both proteolysis experi-ments and predictions of their location alongthe polypeptide chain by the neural networkprogram PONDR (Prediction of Natural Dis-ordered Regions; Iakoucheva et al., 2001).Predictions nicely correlated with the resultsof limited proteolysis, thereby indicating thatchain disorder or flexibility is the key parame-ter dictating limited proteolysis events inproteins.

R E F E R E N C E S

Anfinsen CB, Scheraga HA. (1975) Experimen-tal and theoretical aspects of protein folding.Adv Protein Chem.; 29: 205–300.

Arai M, Kuwajima K. (2000) Role of the moltenglobule state in protein folding. Adv ProteinChem.; 53: 209–82.

Bond JS. (1990) Commercially available pro-teases. In Proteolytic Enzymes: A PracticalApproach. Beynon RJ, Bond JS, eds, pp232–40. IRL Press, Oxford.

Brooks CL. (1992) Characterization of “native”apomyoglobin by molecular dynamics simula-tion. J Mol Biol.; 227: 375–80.

Bucciantini M, Giannoni F, Chiti F, Baroni F,Formigli L, Zurdo J, et al. (2002) Inherenttoxicity of aggregates implies a commonmechanism for protein misfolding diseases.Nature.; 416: 507–11.

Bychkova VE, Ptitsyn OB. (1995) Folding inter-mediates are involved in genetic diseases?FEBS Lett.; 359: 6–8.

Campbell D, Downing AK. (1994) Building pro-tein structure and function from modularunits. Trends Biotechnol.; 12: 168–74.

Canet D, Last AM, Tito P, Sunde M, Spencer A,Arche DB, Redfield C, Robinson CV, DobsonCM. (2002) Local cooperativity in the unfold-ing of an amyloidogenic variant of humanlysozyme. Nature Struct Biol.; 9: 308–14.

Cawthern KM, Narayan M, Chaudhuri D,Permyakov EA, Berliner LJ. (1997) Interac-tions of �-lactalbumin with fatty acids andspin label analogs. J Biol Chem.; 272:30812–6.

Chamberlain AK, Marqusee S. (2000) Compari-son of equilibrium and kinetic approachesfor determining protein folding mechanisms.Adv Protein Chem.; 53: 283–327.

Chrysina ED, Brew K, Acharya KR. (2000)Crystal structures of apo- and holo-bovine �-lactalbumin at 2.2-� resolution reveal an ef-fect of calcium on inter-lobe interactions. JBiol Chem.; 275: 37021–9.

Cohen FE, Kelly JW. (2003) Therapeutic ap-proaches to protein-misfolding diseases. Na-ture.; 426: 905–9.

Dalzoppo D, Vita C, Fontana A. (1985) Foldingof thermolysin fragments: Identification ofthe minimum size of a carboxyl-terminalfragment that can fold into a stable nativelike structure. J Mol Biol.; 182: 331–40.

Darby NJ, Kemmink J, Creighton TE. (1996)Identifying and characterizing a structuraldomain of protein disulfide isomerase. Bio-chemistry.; 35: 10517–28.

de Prat-Gay G. (1996) Association of comple-mentary fragments and the elucidation ofprotein folding pathways. Protein Eng.; 9:843–7.

Dobson CM. (1994) Solid evidence for moltenglobules. Curr Biol.; 4: 636–40.

Dobson CM. (1999) Protein misfolding, evolu-tion and disease. Trends Biochem Sci.; 24:329–32.

316 A. Fontana and others 2004

Dobson CM. (2003) Protein folding and disease:A view from the first Horizon Symposium.Nature Rev Drug Discov.; 2: 154–60.

Dunker AK, Obradovic Z. (2001) The proteintrinity: Linking function and disorder. Na-ture Biotechnol.; 19: 805–6.

Dunker AK, Brown CJ, Lawson JD, IakouchevaCM, Obradovic Z. (2002) Intrinsic disorderand protein function. Biochemistry.; 41:6573–80.

Eliezer D, Wright PE. (1996) Is apomyoglobin amolten globule? Structural characterizationby NMR. J Mol Biol.; 263: 531–8.

Eliezer D, Yao J, Dyson HJ, Wright PE. (1998)Structural and dynamic characterization ofpartially folded states of apomyoglobin andimplications for protein folding. NatureStruct Biol.; 5: 148–55.

Esmon NL, DeBault LE, Esmon CT. (1983)Proteolytic formation and properties ofgamma-carboxyglutamic acid-domainless pro-tein C. J Biol Chem.; 258: 5548–53.

Evans PA, Radford SE. (1994) Probing thestructure of folding intermediates. Curr OpinStruct Biol.; 4: 100–6.

Falconi M, Parrilli L, Battistoni A, Desideri A.(2002) Flexibility in monomeric Cu,Znsuperoxide dismutase detected by limitedproteolysis and molecular dynamics simula-tion. Proteins Struct Funct Genet.; 47:513–20.

Fink AL. (1995) Compact intermediate states inprotein folding. Annu Rev Biomol Struct.;24: 495–522.

Fink AL. (1998) Protein aggregation: Folding in-termediates, inclusion bodies and amyloid.Folding Des.; 3: 9–23.

Fisher A, Taniuchi H. (1992) A study of coredomains, and the core domain-domain inter-action of cytochrome c fragment complex.Arch Biochem Biophys.; 296: 1–16.

Fontana A. (1989) Limited proteolysis of globu-lar proteins occur at exposed and flexibleloops. In Highlights of Modern Biochemistry.Kotyk A, Skoda J, Pacek V, Kostka W, eds,pp 1711–26. VSP Int Publ, Amsterdam.

Fontana A, Vita C, Chaiken IM. (1983) Domaincharacteristics of the carboxyl-terminal frag-ment 206–316 of thermolysin. Biopolymers.;22: 69–78.

Fontana A, Fassina G, Vita C, Dalzoppo D,Zamai M, Zambonin M. (1986) Correlationbetween sites of limited proteolysis and seg-mental mobility in thermolysin. Biochemis-try.; 25: 1847–51.

Fontana A, Vita C, Dalzoppo D, Zambonin M.(1989) Limited proteolysis as a tool to detectstructure and dynamic features of globularproteins: Studies on thermolysin. In Methodsin Protein Sequence Analysis. Wittman-Liebold B, ed, pp 315–24. Springer-Verlag,Berlin.

Fontana A, Polverino de Laureto P, De FilippisV. (1993) Molecular aspects of proteolysis ofglobular proteins. In Protein Stability andStabilization. van den Tweel W, Harder A,Buitelear M, eds, pp 101–10. Elsevier Sci-ence Publ, Amsterdam.

Fontana A, Polverino de Laureto P, De FilippisV, Scaramella E, Zambonin M. (1997a)Probing the partly folded states of proteinsby limited proteolysis. Folding Des.; 2:R17–26.

Fontana A, Zambonin M, Polverino de LauretoP, De Filippis V, Clementi A, Scaramella E.(1997b) Probing the conformational state ofapomyoglobin by limited proteolysis. J MolBiol.; 266: 223–30.

Fontana A, Polverino de Laureto P, De FilippisV, Scaramella E, Zambonin M. (1999) Lim-ited proteolysis in the study of protein con-formation. In Proteolytic Enzymes: Tools andTargets. Sterchi EE, Stöcker W, eds, pp257–84. Springer Verlag, Heidelberg.

Frare E, Polverino de Laureto P, Zurdo J,Dobson CM, Fontana A. (2004) A highlyamyloidogenic region of hen lysozyme. J MolBiol.; in press.

Frauenfelder H, Petsko GA, Tsernoglou D.(1979) Temperature-dependent X-raydiffraction as a probe of protein structuraldynamics. Nature.; 280: 558–63.

Vol. 51 Probing protein structure by limited proteolysis 317

Griko YV, Remeta DP. (1999) Energetics of sol-vent and ligand-induced conformationalchanges in �-lactalbumin. Protein Sci.; 8:554–61.

Griko YV, Freire E, Privalov PL. (1994)Energetics of �-lactalbumin states: A calori-metric and statistical thermodynamics study.Biochemistry.; 33: 1889–901.

H�kansson A, Zhivotovsky B, Orrenius S,Sabharwal H, Svanborg C. (1995) Apoptosisinduced by a human milk protein. Proc NatlAcad Sci USA.; 92: 8064–8.

H�kansson A, Andreasson J, Zhivotovsky B,Karpman D, Orrenius S, Svanborg C. (1999)Multimeric �-lactalbumin from human milkinduces apoptosis through a direct effect oncell nuclei. Exp Cell Res.; 246: 451–60.

H�kansson A, Svensson M, Mossberg AK,Sabharwal H, Linse S, Lazon I, Lönnerdal B,Svanborg C. (2000) A folding variant of�-lactalbumin with bactericidal activityagainst Streptococcus pneumoniae. MolMicrobiol.; 35: 589–600.

Herschlag D. (1988) The role of induced fit andconformational changes of enzymes in speci-ficity and catalysis. Bioorganic Chem.; 16:62–96.

Hirst JD, Brooks CL. (1995) Molecular dynam-ics simulations of isolated helices ofmyoglobin. Biochemistry.; 34: 7614–21.

Hubbard SJ, Campbell SF, Thornton JM. (1991)Molecular recognition: conformational analy-sis of limited proteolytic sites and serineproteinase protein inhibitors. J Mol Biol.;220: 507–30.

Hubbard SJ. (1998) The structural aspects oflimited proteolysis of native proteins.Biochim Biophys Acta.; 1382: 191–206.

Hubbard SJ, Eisenmenger F, Thornton JM.(1994) Modeling studies of the change inconformation required for cleavage of limitedproteolytic sites. Protein Sci.; 3: 757–68.

Iakoucheva LM., Kimzey AL, Masslon CD,Bruce JE, Garner EC, Brown CJ, DunkerAK, Smith RD, Ackerman EJ. (2001) Identi-fication of intrinsic order and disorder in the

DNA repair protein XPA. Protein Sci.; 10:560–71.

Janin J, Wodak SJ. (1983) Structural domainsin proteins and their role in the dynamics ofprotein function. Prog Biophys Mol Biol.; 42:21–78.

Kataoka M, Kuwajima K, Tokunaga F, Goto Y.(1997) Structural characterization of the mol-ten globule of �-lactalbumin by solutionX-ray scattering. Protein Sci.; 6: 422–30.

Klenow H, Henningsen I. (1970) Selective elimi-nation of the exonuclease activity of the de-oxyribonucleic acid polymerase from Esche-richia coli B by limited proteolysis. Proc NatlAcad Sci USA.; 65: 168–75.

Köhler C, H�kansson A, Svanborg C, OrreniusS, Zhivotovsky B. (1999) Protease activationin apoptosis induced by MAL. Exp Cell Res.;249: 260–8.

Köhler C, Gogradze V, H�kansson A, SvanborgC, Orrenius S, Zhivotovsky B. (2001) A fold-ing variant of human �-lactalbumin inducesmitochondrial permeability transition in iso-lated mitochondria. Eur J Biochem.; 268:186–91.

Krebs MR, Wilkins DK, Chung EW, PitkeathlyMC, Chamberlain AK, Zurdo J, RobinsonCV, Dobson CM. (2000) Formation andseeding of amyloid fibrils from wild-type henlysozyme and a peptide fragment from thebeta-domain. J Mol Biol.; 300: 541–9.

Krokoszynska I, Otlewski J. (1996) Thermody-namic stability effects of single peptide bondhydrolysis in protein inhibitors of serine pro-teases. J Mol Biol.; 256: 793–802.

Kuwajima K. (1989) The molten globule stateas a clue for understanding the folding andcooperativity of globular protein structure.Proteins Struct Funct Genet.; 6: 87–103.

Kuwajima K. (1996) The molten globule state of�-lactalbumin. FASEB J.; 10: 102–9.

Lebherz HG, Burke T, Shackelford JE, StricklerJE, Wilson KJ. (1986) Specific proteolyticmodification of creatine kinase isoenzymes.Implication of C-terminal involvement in en-zymic activity but not in subunit-subunit rec-ognition. Biochem J.; 233: 51–6.

318 A. Fontana and others 2004

Lecomte JTJ, Kao YH, Cocco MJ. (1996) Thenative state of apomyoglobin described byproton NMR spectroscopy: The A-B-G-H in-terface of wild-type sperm whale apomyo-globin. Proteins Struct Funct Genet.; 25:267–85.

Lecomte JTJ, Sukits SF, Bhattacharya S,Falzone CJ. (1999) Conformational proper-ties of native sperm whale apomyoglobin insolution. Protein Sci.; 8: 1484–91.

Mihalyi E. (1978) Application of Proteolytic En-zymes to Protein Structure Studies. CRCPress, Boca Raton, FL.

Musi V, Spolaore B, Picotti P, Zambonin M, DeFilippis V, Fontana A. (2004) Nickedapomyoglobin: A noncovalent complex of twopolypeptide fragments comprising the entireprotein chain. Biochemistry.; 43: 6230–40.

Neurath H. (1980) Limited proteolysis, proteinfolding and physiological regulation. In Pro-tein Folding. Jaenicke R, ed, pp 501–4.Elsevier/North Holland Biomedical Press,Amsterdam-New York.

Novotny J, Bruccoleri RE. (1987) Correlationamong sites of limited proteolysis, enzymeaccessibility and segmental mobility. FEBSLett.; 211: 185–9.

Ohgushi M, Wada A. (1983) Molten globulestate: A compact form of globular proteinswith mobile side-chain. FEBS Lett.; 164:21–4.

Onufriev A, Case DA, Bashford D. (2003) Struc-tural details, pathways and energetics of un-folding apomyoglobin. J Mol Biol.; 325:555–67.

Patthy L, Trexler M, Vali Z, Banyai L, VaradiA. (1984) Kringles, modules specialized forprotein binding: Homology of the gela-tin-binding region of fibronectin with thekringle structures of proteases. FEBS Lett.;171: 131–6.

Peng ZY, Kim PS. (1994) A protein dissectionstudy of a molten globule. Biochemistry.; 33:2136–41.

Peng ZY, Wu LC. (2000) Autonomous proteinfolding units. Adv Protein Chem.; 53: 1–47.

Permyakov EA, Berliner LJ. (2000)�-Lactalbumin: structure and function. FEBSLett.; 473: 269–74.

Permyakov EA, Permyakov SE, Berliner LJ.(2003) �-Lactalbumin: Ca2+-binding proteinwith multiple functions. In Protein Struc-tures: Kaleidoscope of Structural Propertiesand Functions. Uversky VN, ed, pp 475–97.Research Signpost, Kerala, India.

Picotti P, Marabotti A, Negro A, Musi V,Spolaore B, Zambonin M, Fontana A. (2004)Modulation of the structural integrity of he-lix F in apomyoglobin by single amino acidreplacements. Protein Sci.; 13: 1572–85.

Pike ACW, Brew K, Acharya KR. (1996) Crystalstructures of guinea-pig, goat and bovine�-lactalbumin highlight the enhancedconformational flexibility of regions that aresignificant for its action in lactose synthase.Structure.; 4: 691–703.

Polverino de Laureto P, De Filippis V, Di BelloM, Zambonin M, Fontana A. (1995) Probingthe molten globule state of �-lactalbumin bylimited proteolysis. Biochemistry.; 34:12596–604.

Polverino de Laureto P, Vinante D, ScaramellaE, Frare E, Fontana A. (2001) Stepwiseproteolytic removal of the beta subdomain in�-lactalbumin: The protein remains foldedand can form the molten globule in acid so-lution. Eur J Biochem.; 268: 4324–33.

Polverino de Laureto P, Frare E, Gottardo R,van Dael H, Fontana A. (2002a) Partlyfolded states of members of thelysozyme/lactalbumin superfamily: A com-parative study by circular dichroism spec-troscopy and limited proteolysis. Protein Sci.;11: 2932–46.

Polverino de Laureto P, Frare E, Gottardo R,Fontana A. (2002b) Molten globule of bovine�-lactalbumin at neutral pH induced by heat,trifluoroethanol and oleic acid: a compara-tive analysis by circular dichroism and lim-ited proteolysis. Proteins Struct Funct Genet.;49: 385–97.

Polverino de Laureto P, Taddei N, Frare E,Capanni C, Costantini S, Zurdo J, Chiti F,Dobson CM, Fontana A. (2003) Protein ag-

Vol. 51 Probing protein structure by limited proteolysis 319

gregation and amyloid fibril formation by anSH3 domain probed by limited proteolysis. JMol Biol.; 334: 129–41.

Price NC, Johnson CM. (1990) Proteinases asprobes of conformation of soluble proteins.In Proteolytic Enzymes: A Practical Approach.Beynon RJ, Bond, JS, eds, pp 163–80. IRLPress, Oxford.

Ptitsyn OB. (1987) Protein folding: hypothesesand experiments. J Protein Chem.; 6:273–93.

Ptitsyn OB. (1995) Molten globule and proteinfolding. Adv Protein Chem.; 47: 83–229.

Ptitsyn OB, Pain RH, Semisotnov GV, ZerovnikE, Razgulgaev OI. (1990) Evidence for a mol-ten globule state as a general intermediatein protein folding. FEBS Lett.; 262: 20–4.

Rico M, Jimenez MA, Gonzalez C, De Filippis V,Fontana A. (1994) NMR solution structure ofthe C-terminal fragment 255–316 ofthermolysin: a dimer formed by subunitshaving the native structure. Biochemistry.;33: 14834–47.

Ringe D, Petsko GA. (1985) Mapping protein dy-namics by X-ray diffraction. Prog BiophysMol Biol.; 45: 197–235.

Romero P, Obradovic Z, Li X, Garner EC,Brown CJ, Dunker AK. (2001) Sequencecomplexity of disordered proteins. Proteins.;42: 38–48.

Rose GD. (1979) Hierarchic organization of do-mains in globular proteins. J Mol Biol.; 134:447–70.

Sacchettini JC, Kelly JW. (2002) Therapeuticstrategies for human amyloid diseases. NatRev Drug Discov.; 4: 267–75.

Sasahara K, Demura M, Nitta K. (2000) Par-tially unfolded equilibrium state of henlysozyme studied by circular dichroism spec-troscopy. Biochemistry.; 39: 6475–82.

Schechter I, Berger A. (1967) On the size of theactive site in proteases. I. Papain. BiochemBiophys Res Commun.; 27: 157–62.

Schulman BA, Redfield C, Peng ZY, DobsonCM, Kim PS. (1995) Different subdomainsare most protected from hydrogen exchange

in molten globule and native states of hu-man �-lactalbumin. J Mol Biol.; 253: 651–7.

Schulman BA, Kim PS, Dobson CM, Redfield C.(1997) A residue-specific NMR view of thenon-cooperative unfolding of a molten glob-ule. Nat Struct Biol.; 4: 630–4.

Selkoe DJ. (2003) Folding proteins in fatalways. Nature.; 426: 900–4.

Siddiqui AS, Barton GJ. (1995) Continuous anddiscontinuous domains: An algorithm for theautomatic generation of reliable protein do-main definitions. Protein Sci.; 4: 872–84.

Spolaore B, Bermejo R, Zambonin M, FontanaA. (2001) Protein interactions leading toconformational changes monitored by limitedproteolysis: Apo form and fragments ofhorse cytochrome c. Biochemistry.; 40:9460–8.

Spolaore B, Polverino de Laureto P, ZamboninM, Fontana A. (2004) Limited proteolysis ofhuman growth hormone at low pH: Isola-tion, characterization and complementationof the two biologically relevant fragments1–44 and 45–191. Biochemistry.; 43:6576–86.

Stefani M, Dobson CM. (2003) Protein aggrega-tion and aggregate toxicity: New insightsinto protein folding, misfolding diseases andbiological evolution. J Mol Med.; 81: 678–99.

Stella L, Di Iorio EE, Nicotra M, Ricci G.(1999) Molecular dynamics simulations ofhuman glutathione transferase P1-1:Conformational fluctuations of the apo struc-ture. Proteins Struct Funct Genet.; 37: 10–9.

Sternberg MJE, Grace DEP, Phillips DC. (1979)Dynamic information from protein crystal-lography: An analysis of temperature factorsfrom refinement of the hen egg-whitelysozyme. J Mol Biol.; 130: 231–53.

Sunde M, Blake C. (1997) The structure of amy-loid fibrils by electron microscopy and X-raydiffration. Adv Protein Chem.; 50: 123–59.

Svensson M, Sabharwal H, H�kansson A,Mossberg AK, Lipniunas P, Leffler H,Svanborg C, Linse S. (1999) Molecular char-acterization of �-lactalbumin folding variants

320 A. Fontana and others 2004

that induce apoptosis in tumor cells. J BiolChem.; 274: 6388–96.

Svensson M, H�kansson A, Mossberg AK, LinseS, Svanborg C. (2000) Conversion of�-lactalbumin to a protein inducingapoptosis. Proc Natl Acad Sci USA.; 97:4221–6.

Taniuchi H, Parr GR, Juillerat MA. (1986)Complementation in folding and fragment ex-change. Methods Enzymol.; 131: 185–217.

Tirado-Rives J, Jorgensen WL. (1993) Moleculardynamics simulations of the unfolding ofapomyoglobin in water. Biochemistry.; 32:4175–84.

Tsai CJ, Xu D, Nussinov A. (1998) Protein fold-ing via binding and viceversa. Folding Des.; 3:R71–80.

Uversky VN. (2002) Natively unfolded proteins:A point where biology waits for physics. Pro-tein Sci.; 11: 739–56.

Uversky VN, Gillespie JR, Fink AL. (2000) Whyare “natively unfolded” proteins unstruc-tured under physiological conditions? Pro-teins.; 41: 415–27.

Vindigni A, De Filippis V, Zanotti G, Visco C,Orsini G, Fontana A. (1994) Probing thestructure of hirudin from Hirudinariamanillensis by limited proteolysis: Isolation,characterization and thrombin-inhibitoryproperties of N-terminal fragments. Eur JBiochem.; 226: 323–33.

Vita C, Fontana A. (1982) Domain characteris-tics of the carboxyl-terminal fragment206–316 of thermolysin: unfolding thermody-namics. Biochemistry.; 21: 5196–202.

Vita C, Dalzoppo D, Fontana A. (1984) Inde-pendent folding of the carboxyl-terminal frag-ment 228–316 of thermolysin. Biochemistry.;23: 5512–9.

Wetlaufer DB. (1973) Nucleation, rapid foldingand globular intrachain regions in proteins.Proc Natl Acad Sci USA.; 70: 697–701.

Wetlaufer DB. (1981) Folding of protein frag-ments. Adv Protein Chem.; 34: 61–92.

Wijesinha-Bettoni R, Dobson CM, Redfield C.(2001) Comparison of the structural and dy-namical properties of holo and apo bovine�-lactalbumin by NMR spectroscopy. J MolBiol.; 307: 885–98.

Wright PE, Dyson HJ. (1999) Intrinsically un-structured proteins: Reassessing the proteinstructure-function paradigm. J Mol Biol.;203: 321–31.

Wu LC, Grandori R, Carey J. (1994) Autono-mous subdomains in protein folding. ProteinSci.; 3: 369–71.

Wu LC, Peng ZY, Kim PS. (1995) Bipartitestructure of the �-lactalbumin molten glob-ule. Nat Struct Biol.; 2: 281–6.

Yutani K, Ogasahara K, Kuwajima K. (1992)Absence of the thermal transition inapo-�-lactalbumin in the molten globulestate: A study by differential scanningmicrocalorimetry. J Mol Biol.; 228: 347–50.

Zehfus MH. (1987) Continuous compact proteindomains. Proteins Struct Funct Genet.; 2:90–110.

Vol. 51 Probing protein structure by limited proteolysis 321