A designed protein as experimental model of primordial folding

6
A designed protein as experimental model of primordial folding Mourad Sadqi a,b , Eva de Alba b , Rau ´ l Pe ´ rez-Jime ´ nez a,c,1 , Jose M. Sanchez-Ruiz c , and Victor Mun ˜ oz a,b,2 a Centro de Investigaciones Biolo ´ gicas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, Madrid 28040, Spain; b Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742; and c Departamento de Química-Física, Facultad de Ciencias, Universidad de Granada, Granada 18071, Spain Edited by Peter G. Wolynes, University of California at San Diego, La Jolla, CA, and approved January 9, 2009 (received for review November 27, 2008) How do proteins accomplish folding during early evolution? The- oretically the mechanism involves the selective stabilization of the native structure against all other competing compact conforma- tions in a process that involves cumulative changes in the amino acid sequence along geological timescales. Thus, an evolved pro- tein folds into a single structure at physiological temperature, but the conformational competition remains latent. For natural pro- teins such competition should emerge only near cryogenic tem- peratures, which places it beyond experimental testing. Here, we introduce a designed monomeric miniprotein (FSD-1ss) that within biological temperatures (330 –280 K) switches between simple fast folding and highly complex conformational dynamics in a struc- turally degenerate compact ensemble. Our findings demonstrate the physical basis for protein folding evolution in a designed protein, which exhibits poorly evolved or primordial folding. Furthermore, these results open the door to the experimental exploration of primitive folding and the switching between alter- native protein structures that takes place in evolutionary branch- ing points and prion diseases, as well as the benchmarking of de novo design methods. energy landscape glassy dynamics molecular evolution protein design protein folding P olypeptide chains with random sequences reflecting natural amino acid compositions are expected to form compact states in aqueous solution to minimize hydrophobic solvent exposure. Because it is impossible to simultaneously satisfy all of the potential interactions in the polypeptide chain, these com- pact states are structurally degenerate and, in analogy to random heteropolymers (1), should exhibit glassy conformational dy- namics. The trademarks of such dynamics are highly stretched exponential decays and a superArrhenius temperature- dependent relaxation rate that experiences an abrupt slow down below a characteristic temperature (here termed T g ) (2). Natu- rally evolved protein domains, however, fold into unique 3D structures efficiently and with simple exponential kinetics (3). How natural selection drives the transformation from random polypeptides to well-evolved folding proteins is one of the major unresolved fundamental questions in modern biochemistry. A widely held view has been to assume a thermodynamic criterion, namely that natural folding only requires the native structure to be the lowest energy state in the vast protein conformational space. The thermodynamic criterion has led to relative success in ab initio prediction of native structures (4), as well as in computer-guided de novo protein design (5). However, it has been recently highlighted that artificial proteins catalogued as successful designs may exhibit complex folding kinetics (6). In an alternative mechanism inspired by statistical mechanics the role of early evolution is to produce natively biased (7) or funneled (2) folding energy landscapes by selecting amino acid sequences that maximize the stability gap between the native and all other compact structures (8). Thus, an evolved natural protein domain folds with simple exponential kinetics to a unique structure at biological temperatures (i.e., slightly below its thermal denaturation midpoint, T m ), whereas the heteropoly- mer-like regime emerges only at very low temperature (T g T m ) (2). This physical mechanism for the evolution of protein folding has been thoroughly studied in computer simulations of simpli- fied protein models (9). The big challenge is to test the mech- anism experimentally. Such a test would entail resolving the switch from simple folding kinetics into a unique 3D structure to glassy dynamics in a structurally degenerate ensemble upon decreasing temperature. Theoretical estimates set T m /T g ratios of 1.6 (10) or higher (11) for natural proteins. Given typical protein thermal stabilities (12), these estimates place T g below the temperatures at which protein–water– glycerol mixtures form glasses (13) and thus beyond experimental reach. Accordingly, efforts to detect glassy dynamics slightly below the water freezing point in the 2-state folding protein L have failed (14). On the other hand, there have been reports of stretched kinetic decays in laser temperature-jump (15) and single-molecule force-clamp unfolding (16) experiments of other proteins. A de Novo Designed Protein as a Model of Primordial Folding. An alternative to natural proteins is to investigate the folding behavior of poorly evolved proteins, which by definition should have much smaller T m /T g (17). As a model of poorly evolved folding, we use here a small de novo designed protein. Partic- ularly, we focus on a simple scaffold corresponding to the natural zinc-finger domain Zif268 (Fig. 1 Bottom Left), which was then used by Dahiyat and Mayo (18) as a starting point for the first computer-generated de novo designed protein (FSD-1) (Fig. 1 Bottom Center). FSD-1 was designed with the energy of the target structure as the only fitness criterion (18); it only shares 6 residues with Zif268 and has an unusually large content in aromatic residues. These features hint at a T m /T g below that of natural proteins. As another advantage, from its small size (28 residues), FSD-1 is expected to exhibit simple ultrafast folding kinetics (19). The problem is that the accessible temperature range for FSD-1 is very narrow due to its marginal thermal stability (T m 295 K) (18). For our experiments we introduced 2 nonnatural aromatic amino acids [naphthyl-alanine (Nal) and dansyl-lysine (DnK)] in the original sequence of FSD-1. The f luorescent aromatic amino acids were incorporated as a FRET pair to monitor protein Author contributions: V.M. designed research; M.S., E.d.A., and R.P.-J. performed research; J.M.S.-R. and V.M. analyzed data; and V.M. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. Freely available online through the PNAS open access option. Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 2K6R). 1 Present address: Department of Biological Sciences, Columbia University, 1003 Fairchild Center, 1212 Amsterdam Avenue, New York, NY 10027. 2 To whom correspondence should be addressed. E-mail: [email protected]. This article contains supporting information online at www.pnas.org/cgi/content/full/ 0812108106/DCSupplemental. www.pnas.orgcgidoi10.1073pnas.0812108106 PNAS March 17, 2009 vol. 106 no. 11 4127– 4132 PHYSICS

Transcript of A designed protein as experimental model of primordial folding

A designed protein as experimental modelof primordial foldingMourad Sadqia,b, Eva de Albab, Raul Perez-Jimeneza,c,1, Jose M. Sanchez-Ruizc, and Victor Munoza,b,2

aCentro de Investigaciones Biologicas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, Madrid 28040, Spain; bDepartment of Chemistryand Biochemistry, University of Maryland, College Park, MD 20742; and cDepartamento de Química-Física, Facultad de Ciencias, Universidad de Granada,Granada 18071, Spain

Edited by Peter G. Wolynes, University of California at San Diego, La Jolla, CA, and approved January 9, 2009 (received for review November 27, 2008)

How do proteins accomplish folding during early evolution? The-oretically the mechanism involves the selective stabilization of thenative structure against all other competing compact conforma-tions in a process that involves cumulative changes in the aminoacid sequence along geological timescales. Thus, an evolved pro-tein folds into a single structure at physiological temperature, butthe conformational competition remains latent. For natural pro-teins such competition should emerge only near cryogenic tem-peratures, which places it beyond experimental testing. Here, weintroduce a designed monomeric miniprotein (FSD-1ss) that withinbiological temperatures (330–280 K) switches between simple fastfolding and highly complex conformational dynamics in a struc-turally degenerate compact ensemble. Our findings demonstratethe physical basis for protein folding evolution in a designedprotein, which exhibits poorly evolved or primordial folding.Furthermore, these results open the door to the experimentalexploration of primitive folding and the switching between alter-native protein structures that takes place in evolutionary branch-ing points and prion diseases, as well as the benchmarking of denovo design methods.

energy landscape � glassy dynamics � molecular evolution �protein design � protein folding

Polypeptide chains with random sequences reflecting naturalamino acid compositions are expected to form compact

states in aqueous solution to minimize hydrophobic solventexposure. Because it is impossible to simultaneously satisfy all ofthe potential interactions in the polypeptide chain, these com-pact states are structurally degenerate and, in analogy to randomheteropolymers (1), should exhibit glassy conformational dy-namics. The trademarks of such dynamics are highly stretchedexponential decays and a superArrhenius temperature-dependent relaxation rate that experiences an abrupt slow downbelow a characteristic temperature (here termed Tg) (2). Natu-rally evolved protein domains, however, fold into unique 3Dstructures efficiently and with simple exponential kinetics (3).How natural selection drives the transformation from randompolypeptides to well-evolved folding proteins is one of the majorunresolved fundamental questions in modern biochemistry.

A widely held view has been to assume a thermodynamiccriterion, namely that natural folding only requires the nativestructure to be the lowest energy state in the vast proteinconformational space. The thermodynamic criterion has led torelative success in ab initio prediction of native structures (4), aswell as in computer-guided de novo protein design (5). However,it has been recently highlighted that artificial proteins cataloguedas successful designs may exhibit complex folding kinetics (6). Inan alternative mechanism inspired by statistical mechanics therole of early evolution is to produce natively biased (7) orfunneled (2) folding energy landscapes by selecting amino acidsequences that maximize the stability gap between the native andall other compact structures (8). Thus, an evolved natural proteindomain folds with simple exponential kinetics to a uniquestructure at biological temperatures (i.e., slightly below its

thermal denaturation midpoint, Tm), whereas the heteropoly-mer-like regime emerges only at very low temperature (Tg ��Tm) (2).

This physical mechanism for the evolution of protein foldinghas been thoroughly studied in computer simulations of simpli-fied protein models (9). The big challenge is to test the mech-anism experimentally. Such a test would entail resolving theswitch from simple folding kinetics into a unique 3D structureto glassy dynamics in a structurally degenerate ensemble upondecreasing temperature. Theoretical estimates set Tm/Tg ratios of�1.6 (10) or higher (11) for natural proteins. Given typicalprotein thermal stabilities (12), these estimates place Tg belowthe temperatures at which protein–water–glycerol mixtures formglasses (13) and thus beyond experimental reach. Accordingly,efforts to detect glassy dynamics slightly below the water freezingpoint in the 2-state folding protein L have failed (14). On theother hand, there have been reports of stretched kinetic decaysin laser temperature-jump (15) and single-molecule force-clampunfolding (16) experiments of other proteins.

A de Novo Designed Protein as a Model of Primordial Folding. Analternative to natural proteins is to investigate the foldingbehavior of poorly evolved proteins, which by definition shouldhave much smaller Tm/Tg (17). As a model of poorly evolvedfolding, we use here a small de novo designed protein. Partic-ularly, we focus on a simple � � � scaffold corresponding to thenatural zinc-finger domain Zif268 (Fig. 1 Bottom Left), whichwas then used by Dahiyat and Mayo (18) as a starting point forthe first computer-generated de novo designed protein (FSD-1)(Fig. 1 Bottom Center). FSD-1 was designed with the energy ofthe target structure as the only fitness criterion (18); it onlyshares 6 residues with Zif268 and has an unusually large contentin aromatic residues. These features hint at a Tm/Tg below thatof natural proteins. As another advantage, from its small size (28residues), FSD-1 is expected to exhibit simple ultrafast foldingkinetics (19). The problem is that the accessible temperaturerange for FSD-1 is very narrow due to its marginal thermalstability (Tm � 295 K) (18).

For our experiments we introduced 2 nonnatural aromaticamino acids [naphthyl-alanine (Nal) and dansyl-lysine (DnK)] inthe original sequence of FSD-1. The fluorescent aromatic aminoacids were incorporated as a FRET pair to monitor protein

Author contributions: V.M. designed research; M.S., E.d.A., and R.P.-J. performed research;J.M.S.-R. and V.M. analyzed data; and V.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

Data deposition: The atomic coordinates and structure factors have been deposited in theProtein Data Bank, www.pdb.org (PDB ID code 2K6R).

1Present address: Department of Biological Sciences, Columbia University, 1003 FairchildCenter, 1212 Amsterdam Avenue, New York, NY 10027.

2To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0812108106/DCSupplemental.

www.pnas.org�cgi�doi�10.1073�pnas.0812108106 PNAS � March 17, 2009 � vol. 106 � no. 11 � 4127–4132

PHYS

ICS

compactness and were intrusively placed inside the proteinstructure (see Fig. 1 Bottom Right) rather than in floppy tails.The new version (FSD-1ss) is synthetic in 2 ways: Its amino acidsequence was designed de novo, and it includes nonnaturalamino acids within its folded structure. The Nal–DnK pairplaced in the FSD-1 structure should result in �90% FRETefficiency (i.e., with an Ro value of �2.5 nm, a k2 value of 2/3, andan �1.7-nm distance between probes) (see Fig. 1), providing ahighly sensitive probe of protein conformational changes.

Natural-Like Folding Near the Denaturation Midpoint. At low tem-peratures the FSD-1ss far-UV CD spectrum indicates a mix of�-helix, �-sheet, and coil (Fig. 2A), and FRET efficiency is�90% (Fig. 2C). The protein unfolds upon increasing temper-ature (Fig. 2 A) and/or addition of urea (Fig. 2B), followingtypical sigmoidal curves with the broadness expected for naturalproteins of similar size. The sigmoidicity is better appreciated inthe curve derivative, which shows a clear peak at the midpoint(e.g., Fig. 2D). Both CD and FRET thermal denaturationexperiments show a pronounced pretransition slope [Fig. 2C andsupporting information (SI) Fig. S1]. FSD-1ss is very stable,requiring �6.5 M urea (Fig. 2B) or �355 K to be half-unfoldedwhen monitored both by CD (Fig. S1) and FRET (Fig. 2D).Addition of urea decreases the Tm monotonically (Fig. 2D).Therefore, FSD-1ss exhibits equilibrium unfolding typical ofsmall natural proteins and is much more stable than the parentFSD-1 (i.e., Tm � 60 K higher).

We also determined the NMR structure of FSD-1ss at thehighest temperature still below the unfolding transition (i.e., 323K) (Fig. 2D). The NMR structure (PDB ID code 2K6R) showsthe expected overall topology with an �-helix packed against 2short strands connected by a loop (Fig. 2E). The structure is verywell defined for a protein of such small size (Table S1). Thesuperposition of the ensemble of 20 structures results in rmsd of0.04 nm for the backbone heavy atoms of residues 3–26. More-over, the side chains of the hydrophobic core residues overlayalmost on top of each other (Fig. 2E). PROCHECK providesanother indicator of the high structural quality with �90% of theresidues in the most favorable regions of the Ramachandran plotand 0% in disallowed regions.

To determine the (un)folding kinetics of FSD-1ss near its Tm,we used a FRET nanosecond laser-induced temperature-jumpinstrument (20). Fig. 2F shows the folding relaxation of FSD-1ss

after a T jump of �10 K to 323 K in the presence of 2 M urea.Under these conditions the protein is slightly below the Tm (seeFig. 2D), and thus the resulting relaxation is still dominated bythe folding rate. The relaxation is very fast and can be well fit toa double exponential decay with relaxation times of 150 ns and4.5 �s (Fig. 2F). Biphasic folding-relaxation decays with onephase in the hundreds of nanoseconds and another in the fewmicroseconds are common in ultrafast folding proteins, such asthe villin headpiece subdomain (21) or the albumin bindingdomain prb (22). In fact, 4.5 �s at 322 K place FSD-1ss at the topof the ultrafast folding chart (23). Thus, FSD-1ss exhibitsnatural-like simple ultrafast folding kinetics near its Tm.

Structural Rearrangements at Low Temperatures. Below 320 Krather unusual behavior is observed. Far-UV CD indicatesprogressive increases in backbone secondary structure from 320to 273 K (Fig. 2 A). FRET efficiency also increases, suggestingmore collapse (Fig. 2C). Moreover, the second component ofCD singular value decomposition reveals changes in the tertiaryenvironment of the Phe residues restricted to the 273- to 320-Krange (Fig. 2G). The characteristic Phe negative band at �220-to 230-nm shifts to the blue as temperature decreases, indicatingless solvent exposure. Such reorganization of the FSD-1ss pro-tein core is reduced by addition of urea (Fig. 2G). Furthermore,the 1D NMR amide proton spectrum exhibits line broadeningbelow 323 K. At 278 K the spectrum is so broadened that onlya few signals can be detected (Fig. 2H). NMR line-broadeningindicates structural reorganizations (i.e., large alterations in theelectronic environment of amide protons) taking place in the 50-to 500-�s timescale. This behavior is entirely reversible andindependent of protein concentration from 50 �M to 2 mM,indicating that FSD-1ss does not aggregate under our experi-mental conditions.

Differential scanning calorimetry (DSC) produces a broadthermogram without defined peaks (Fig. 2I). The thermogramshape is similar to that of apomyoglobin (24) but is alsoconsistent with FSD-1ss’s small size [little unfolding enthalpy(12)] and ultrafast folding [downhill regime (23)]. Furthermore,although broad, the FSD-1ss thermogram agrees very well withthe empirical baselines for folded and unfolded states calculatedfrom its sequence (Fig. 2I). This agreement allows the estimateof an average change in heat capacity on unfolding (�Cp) of �1.4kJ/mol. The thermogram also shows a heat-capacity shoulder inthe same temperature interval as the changes in Phe tertiaryenvironment (Fig. 2 I and G), consistent with the presence ofconformational changes at low temperature.

The degree of NMR line-broadening shown in Fig. 2H isreminiscent of the onset of cold denaturation (25). In this case,however, cold denaturation is ruled out by other observations: (i)CD and FRET indicate that the protein becomes more struc-tured and compact as temperature lowers; (ii) the low-temperature heat capacity is consistent with the empiricalfolded-state baseline (Fig. 2I) (SI Text); (iii) addition of ureaeliminates the low-temperature changes rather than enhancingthem (see Fig. 2G); and (iv) given �Cp � 1.4 kJ/mol and Tm �355 K, and assuming a standard enthalpy per residue of 2.7kJ/mol at 333 K (12), cold denaturation should only occur below220 K. Therefore, we can conclude that, below 320 K, FSD-1ssundergoes conformational transitions in a structurally degener-ate compact ensemble. The structural changes involve bothbackbone and core rearrangements that produce large differ-ences in amide proton chemical shifts, presumably because ofstrong ring-current effects from the many aromatic residues.

Glassy Conformational Dynamics. We performed additional laser Tjump experiments at lower temperatures. The first interestingobservation is that the relaxation decay, after a T jump of �10K to 310 K, is not single or double exponential but requires fitting

Fig. 1. Amino acid sequences and structures of FSD-1ss and its predecessors.Sequences are lined up with the secondary structure assignment. The num-bering scheme follows the FSD-1ss sequence, thus being shifted 1 position inthe other 2 proteins. This numbering scheme is maintained throughout. The3 structures are shown at the bottom (Zif268, FSD-1, and FSD-1ss from left toright) by using a ribbon backbone representation. Blue corresponds to criticalcore residues, green corresponds to Nal, and red corresponds to DnK inFSD-1ss. The magenta sphere shows the position of the zinc atom in Zif268.

4128 � www.pnas.org�cgi�doi�10.1073�pnas.0812108106 Sadqi et al.

to a stretched exponential function [i.e., exp((t/�)�)] with � � 8�s and � � 0.55 (Fig. 3A). The degree of stretching in therelaxation increases as temperature decreases. This trend can beappreciated in Fig. 3A, which shows decays after �10-K jumpsto final temperatures between 310 and 279 K together with bestfits to single exponential decays for reference (thin black curves).Fits to a stretched exponential function are good at all temper-atures (Fig. 3A, red curves) with a fitted � parameter thatdecreases to 0.33 at 279 K. At this low temperature the relaxationis highly stretched, extending for at least 5–6 time decades (Fig.3A). A classical multiexponential approach requires �7 expo-nentials to produce fits of similar quality. This relaxation is in factclose to the superstretched structural relaxation of native myo-globin after CO photodissociation (26).

A strongly nonlinear temperature dependence is also apparentin the 4 decays shown in Fig. 3A with relaxation times of 8, 15,50, and 330 �s for final temperatures from 310 K to 279 K in�10-K steps. The highly curved temperature dependence isevident in an Arrhenius plot (Fig. 3C, blue curve). CurvedArrhenius plots need to be interpreted with caution because thefolding rate of 2-state folding proteins also shows curvature

arising from the difference in heat capacity between native andfolding transition states (27, 28). However, the curvature inFSD-1ss is much too pronounced to be accounted for by heatcapacity changes. Fitting the 0 M data (Fig. 3C, blue curve) toa thermodynamic transition-state rate expression requires afolding activation heat capacity of 2.2 � 0.22 kJ/(mol�K) (SIText). This folding activation heat capacity is significantly largerthan the total equilibrium change for FSD-1ss (Fig. 2I) and atleast 3-fold larger than expected for 2-state folding [i.e., �40%of the equilibrium heat capacity change (28) or 0.56 kJ/(mol�K)for FSD-1ss]. The 2.2 kJ/(mol�K) value is indeed comparablewith the folding activation heat capacity of the 3.7-times-largerprotein FKBP-12 (29).

The addition of the chemical denaturant urea monotonicallydecreases the degree of relaxation stretching (see the decreasingdeviations from single exponential fits in Fig. 3B), speeds up therelaxation rate in folding conditions (instead of the slowing downof 2-state folding), and drastically decreases the curvature of theArrhenius plot (Fig. 3C, green and red curves). For example, thecurvature at 2 M urea corresponds to only 0.4 � 0.05 kJ/(mol�K)in folding activation heat capacity, more in line with the value

Fig. 2. Folding and structural properties of FSD-1ss. (A) Equilibrium thermal unfolding by far-UV CD. (B) Equilibrium urea unfolding by CD at 222 nm. (C)Equilibrium thermal unfolding by FRET: 0 M (blue), 1 M (green), and 2 M (red) urea. (D) Derivative of the FRET equilibrium thermal unfolding curves shown inC. Color coding is as shown in C. (E) Superposition (backbone of residues 3–26) of the 20 lowest-energy NMR structures at 323 K. Core side chains are shown ingreen. (F) Relaxation decay after a �10 K ns T jump to 322 K final temperature in the presence of 2 M urea. The red curve is the best double exponential fit. (G)Singular value decomposition (SVD) of far-UV CD spectra as a function of temperature and urea. (Inset) The second component (Comp.). The main graph showsthe second component’s amplitude (Amp.) for 0, 1, and 2 M urea. Color coding is as shown in C. (H) One-dimensional NMR spectrum of the proton amide region:323 K (red), 298 K (green), and 278 K (blue). (I) DSC thermogram in absolute heat capacity units showing the empirical estimates of the heat capacity baselinesfor the folded and unfolded states (SI Text). Error bars indicate the experimental errors of 4 runs.

Sadqi et al. PNAS � March 17, 2009 � vol. 106 � no. 11 � 4129

PHYS

ICS

expected for a 29-residue 2-state folder (28). The effects of ureain FSD-1ss are therefore incompatible with classical activatedfolding kinetics. In contrast, all of these observations arestraightforwardly explained as a decrease in glassiness producedby the urea-induced smoothing of the rough FSD-1ss foldinglandscape.

Estimating the Roughness of the FSD-1ss Folding Landscape. Toestimate the roughness of the FSD-1ss folding landscape, we usethe empirically determined stretching parameter � (Fig. 3D) andequation � � [1 � (�F/RT)2]�1/2 (30), where �F2 is the free-energy variance between adjacent conformations in the energylandscape. This calculation renders a roughness of (�4.5 kJ/mol)2 for FSD-1ss in the absence of urea. Such an empiricalestimate of landscape roughness provides a very useful bench-mark for computer simulations, which could then be used tofurther investigate the relationship between glassy conforma-tional dynamics and landscape roughness in atomistic simula-tions of FSD-1ss. It is also interesting to note that the experi-mental data follow well the theoretical relation at temperaturesabove 285 K but deviate quite apparently below (Fig. 3D). Thedeviation below 285 K can be interpreted as signaling thedynamic glass transition, resulting in an estimate of Tm/Tg of �1.3for the synthetic superstable FSD-1ss miniprotein. A ratio of 1.3is much lower than the estimates for natural proteins (11) andthus reminiscent of poorly evolved proteins.

Structural Basis for Degenerate Folding. What are the structuraldeterminants of FSD-1ss folding behavior at low temperature?Both FSD-1 (at 274 K) and FSD-1ss (at 323 K) have well-resolved NMR structures with little conformational heteroge-neity in backbone and core side chains (e.g., Fig. 2E). Overall,the 2 structures are similar, but detailed inspection uncoverscritical differences (Fig. 4A). The �-helix and several core sidechains (F13, L19, and I23) (see Fig. 4A) superimpose on top ofeach other, whereas there are large changes in the N-terminalbackbone region together with rotamer changes and spatialdisplacements of the other core side chains (Fig. 4A).

The most dramatic difference is an �180° flip of the side chainin position 7, Nal in FSD-1ss versus Lys in FSD-1 (see Fig. 1). Inthe FSD-1 structure, K7 is on the solvent-exposed face of the�-hairpin-like structure (Fig. 4B Right, orange residue). Thehighly hydrophobic character of Nal7, however, forces a back-bone rearrangement to minimize its solvent exposure (Fig. 4BLeft, orange residue), resulting in a nonoptimized structure. Thisfinding is highlighted by comparison with the original FSD-1structure. The protein core of FSD-1 is well-packed with allaromatics residues contained within the helix–hairpin interface,resulting in a prolate ellipsoidal structure (Fig. 4B Right). Incontrast, the FSD-1ss structure is more open and spherical (Fig. 4BLeft). The Nal7 flip opens the FSD-1ss core and pushes out theother aromatic residues (Fig. 4B; red, blue, and purple residues) tomake room for the additional hydrophobe. Accessible surface-areacalculations indicate that the real NMR structure of FSD-1ss buriesmore hydrophobic surface than its sequence modeled in the FSD-1structure (Fig. 1 Right), explaining why it is indeed preferred.However, the overall gain comes at the expense of partially exposingF22 and F26 to the solvent. In other words, finding the best packingarrangement involves sacrificing the local environment of severalhydrophobic residues.

The changes in core packing are accompanied by similarlydrastic backbone rearrangements in the N-terminal region (Fig.4C). Most notorious are the changes in the turn region connect-ing the 2 strands. For FSD-1, the first strand extends to K7, withthe turn being placed at I8-K9. In FSD-1ss, the first strand isshorter and a turn-like conformation appears at A6-Nal7 toaccommodate the Nal7 flip (see Fig. 4C Bottom). The shift inturn-register allows the formation of many hydrophobic contactsbetween Nal7 and I8 (observed as NOEs) but also displaces the2 strands farther apart, impeding the formation of �-sheet-likehydrogen bonds (Fig. 4C Top Left). Therefore, there is backbonefrustration in the N-terminal region of FSD-1ss, which cannotsimultaneously satisfy the �-sheet hydrogen bonds and localhydrophobic contacts between Nal7 and I8.

Lessons for Protein Evolution and de Novo DesignHere we report the temperature-controlled switching of de-signed protein FSD-1ss from fast-folding into a single structure

Fig. 3. Glassy conformational dynamics in FSD-1ss. (A) Relaxation decays after �10-K jumps to final temperatures from 310 K (bottom trace) to 279 K (top trace)without urea. The red curves are best-stretched exponential fits and the thin black lines are best exponential fits for reference. (B) Decays as in A except�10-K jumps are to a final temperature of 289 K in 0, 1, and 2 M urea (from top to bottom). (C) Arrhenius plot of the inverse relaxation time obtainedfrom stretched exponential fits: 0 M (blue), 1 M (green), and 2 M (red) urea. All data points correspond to folding conditions (�Tm). The curves are fitsto a quadratic (superArrhenius) equation. Similar quality fits are obtained with a transition-state equation including an activation heat capacity term (SIText). (D) Stretching parameter � as a function of temperature in the absence of urea. Error bars indicate fitting errors. The red curve is a fit to the Xieand Wolynes equation (30).

4130 � www.pnas.org�cgi�doi�10.1073�pnas.0812108106 Sadqi et al.

to glassy dynamics in a nonaggregating structurally degeneratecompact ensemble. These results provide direct experimentaldemonstration of the inherent competition between the nativeand alternative structures during protein folding and evolution.Structural analysis indicates that the structural degeneration ofFSD-1ss stems from a conflict between the backbone propensityto form hydrogen bonds and the efforts of different hydrophobicresidues to maximize their insufficient packing. At low temperaturethese forces balance out, leading to a structurally degenerateensemble, but as temperature rises the increasingly strong hydro-phobic effect wins over selecting a single structure before unfoldingtakes place. Increasingly stretched exponential kinetics at lowtemperature have also been reported for short chains (8–12 resi-dues) of phenylacetylene that form stacked helical structures inpolar solvents (31), revealing striking similarities between theconformational behavior of poorly evolved proteins and muchsimpler synthetic homopolymers.

The folding properties of FSD-1ss have implications forprotein design. De novo design strategies have frequently fo-cused on matching amino acid sequences to a target backbonestructure. It is interesting that substituting a single solvent-exposed Lys with Nal results in a more stable protein that,however, folds into a different structure. The result highlights thedelicate interplay between stability and specificity when design-ing proteins and reinforces the importance of implementingnegative selection during the design strategy. In this light, wecould expect current successful de novo designs to exhibit more‘‘evolved’’ folding than FSD-1ss (which contains intrusive non-natural aromatic amino acids) but still less than natural proteins.This observation is consistent with the complex kinetics ob-

served in �3d at low temperature (32) and in Top-7 (6), whichare 2 proteins designed with entirely natural amino acidcomposition.

Finally, interpreting our results in evolutionary terms, we canconclude that FSD-1ss is analogous to a primordial or poorlyevolved protein. Although FSD-1ss folds into a single structureand is very stable, the glassy regime is already present atphysiological temperatures, making it inapt to perform classicalbiological functions. The narrow gap between folding and struc-tural degeneration should also produce hypersensitivity to mu-tation (9). Turning FSD-1ss into a better-evolved or more‘‘natural’’ protein involves decreasing the temperature at whichthe structurally degenerate glassy regime emerges (larger Tm/Tg).Our experiments suggest various strategies. For example, onecould target the more stable structure (Fig. 4B Left) by reducingthe bulkiness of some hydrophobic residues (e.g., F22 and F26)and retuning the sequence of residues 6–10 to decrease its localpropensity to form the competing �-hairpin (Fig. 4C). Alterna-tively, one could target the original FSD-1 structure (Fig. 4BRight), removing Nal7 and further stabilizing the hairpin struc-ture on its solvent-exposed face. The strong effect caused by Nalon FSD-1ss also offers some clues regarding the old question ofwhat is the right hydrophobic balance to achieve protein folding.Nal is a 2-member-ring amino acid that is slightly bulkier andmore hydrophobic than its aromatic natural counterparts (Phe,Tyr, and Trp), which could suggest that the 20-amino-acidnatural alphabet has been carefully selected during early evolu-tion to maintain the aromatic–hydrophobic protein contentbelow a certain threshold, above which folding becomes a veryhard design selection problem.

Fig. 4. Structural basis for degenerate folding. (A) Superposition of FSD-1ss lowest-energy NMR structure at 323 K (red structures) and the averaged minimizedFSD-1 structure at 275 K (blue structures) (18). The 2 views highlight different subsets of the main hydrophobic core residues. Numbering and sequences are asshown in Fig. 1B. (B) Comparison between FSD-1ss (Left) and FSD-1 (Right) structures. The backbone is shown as a ribbon, and the van der Waals surfaces areshown in shades of brown. The core side chains are colored according to the spectrum of light to indicate sequence order (from red to purple). (C) Comparisonof backbone conformation in the N-terminal segment of FSD-1ss (Upper Left) and FSD-1 (Upper Right), by using a ball and sticks representation. The arrowshighlight the structural differences. (Lower) Ramachandran plot showing the dihedral angle values of residues 6–10 in FSD-1ss (red) and FSD-1 (blue).

Sadqi et al. PNAS � March 17, 2009 � vol. 106 � no. 11 � 4131

PHYS

ICS

Therefore, FSD-1ss appears as an excellent model case tofurther explore the early stages of protein evolution via exper-iment. Moreover, FSD-1ss and FSD-1 fold into quite differentstructures (Fig. 4), with only 2 amino acid changes and, thus,offer a unique opportunity to investigate the physical transitionfrom one native structure to another through sequence orenvironment manipulation. This investigation could be seenboth as an experimental recreation of evolutionary branchingpoints and as a simple model to mimic conformational switchessuch as those characteristic of prion diseases (33).

Experimental ProceduresProtein concentration was measured at 280 nm by using as a molar-extinction coefficient the sum of Nal (5,526 M�1�cm�1), DnK (1,571

M�1�cm�1), and tyrosine (1,280 M�1�cm�1). CD experiments were per-formed at a protein concentration of 50 �M in 20 mM citrate buffer, pH 5.0,at different urea concentrations. Equilibrium FRET experiments were per-formed at a protein concentration of 25 �M in 20 mM acetate buffer, pH5.0. FRET efficiencies were determined from the donor fluorescence inten-sity by using a Nal sample as reference. DSC experiments were carried outin 20 mM acetate buffer, pH 5.0, as described previously (34). NMR sampleswere prepared in 90:10 (vol:vol) H2O/2H2O at pH 5.0. Nanosecond laser-induced temperature-jump experiments were performed as described inref. 20. A more detailed methods description, the structure calculationprocedures, and structure statistics are all included in SI Text, Figs. S1 andS2, and Table S1.

ACKNOWLEDGMENTS. This work has been supported by National ScienceFoundation Grant MCB-0317294 and Marie Curie Excellence Award MEXT-CT-2006-042334.

1. Bryngelson JD, Wolynes PG (1989) Intermediates and barrier crossing in a randomenergy-model (with applications to protein folding). J Phys Chem 93:6902–6915.

2. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG (1995) Funnels, pathways, and theenergy landscape of protein-folding—A synthesis. Proteins Struct Funct Genet 21:167–195.

3. Jackson SE (1998) How do small single-domain proteins fold? Folding Des 3:R81–R91.4. Bonneau R, Baker D (2001) Ab initio protein structure prediction: Progress and pros-

pects. Annu Rev Biophys Biomol Struct 30:173–189.5. Butterfoss GL, Kuhlman B (2006) Computer-based design of novel protein structures.

Annu Rev Biophys Biomol Struct 35:49–65.6. Watters AL, et al. (2007) The highly cooperative folding of small naturally occurring

proteins is likely the result of natural selection. Cell 128:613–624.7. Go N (1984) The consistency principle in protein structure and pathways of folding. Adv

Biophys 18:149–164.8. Sali A, Shakhnovich EI, Karplus M (1994) How does a protein fold? Nature 369:248–251.9. Shakhnovich EI (2006) Protein folding thermodynamics and dynamics: Where physics,

chemistry, and biology meet. Chem Rev 106:1559–1588.10. Wolynes PG, Onuchic JN, Thirumalai D (1995) Navigating the folding routes. Science

267:1619–1620.11. Chan HS, Shimizu S, Kaya H (2004) Cooperativity principles in protein folding. Methods

Enzymol 380:350–379.12. Robertson AD, Murphy KP (1997) Protein structure and the energetics of protein

stability. Chem Rev 97:1251–1267.13. Angell CA (1995) Formation of glasses from liquids and biopolymers. Science 267:1924–

1935.14. Gillespie B, Plaxco KW (2003) Nonglassy kinetics in the folding of a simple single-

domain protein. Proc Natl Acad Sci USA 97:12014–12019.15. Sabelko J, Ervin J, Gruebele M (1999) Observation of strange kinetics in protein folding.

Proc Natl Acad Sci USA 96:6031–6036.16. Brujic J, Hermans RI, Walther KA, Fernandez JM (2006) Single-molecule force spectros-

copy reveals signatures of glassy dynamics in the energy landscape of ubiquitin. NatPhys 2:282–286.

17. Wolynes PG (2008) Protein Folding, Misfolding and Aggregation, ed Munoz V (R SocChem, Cambridge, UK), pp 49–69.

18. Dahiyat BI, Mayo SL (1995) De novo protein design: Fully automated sequence selec-tion. Science 278:82–87.

19. Naganathan AN, Munoz V (2005) Scaling of folding times with protein size. J Am ChemSoc 127:480–481.

20. Sadqi M, Lapidus LJ, Munoz V (2003) How fast is protein hydrophobic collapse? ProcNatl Acad Sci USA 100:12117–12122.

21. Kubelka J, Eaton WA, Hofrichter J (2003) Experimental tests of villin subdomain foldingsimulations. J Mol Biol 329:625–630.

22. Vu DM, Myers JK, Oas TG, Dyer RB (2004) Probing the folding and unfolding dynamicsof secondary and tertiary structures in a three-helix bundle protein. Biochemistry43:3582–3589.

23. Naganathan AN, Doshi U, Fung A, Sadqi M, Munoz V (2006) Dynamics, energetics, andstructure in protein folding. Biochemistry 45:8466–8475.

24. Griko YV, Privalov PL (1994) Thermodynamic puzzle of apomyoglobin unfolding. J MolBiol 235:1318–1325.

25. Babu CR, van Hilser VJ, Wand AJ (2004) Direct access to the cooperative substructureof proteins and the protein ensemble via cold denaturation. Nat Struct Biol 11:352–357.

26. Lim MH, Jackson TA, Anfinrud PA (1993) Nonexponential protein relaxation—dynamics of conformational change in myoglobin. Proc Natl Acad Sci USA 90:5801–5804.

27. Oliveberg M, Tan YJ, Fersht AR (1995) Negative activation enthalpies in the kinetics ofprotein-folding. Proc Natl Acad Sci USA 92:8926–8929.

28. Akmal A, Munoz V (2004) The nature of the free energy barriers to two-state folding.Proteins Struct Funct Bioinf 57:142–152.

29. Main ERG, Fulton KF, Jackson SE (1999) Folding pathway of FKBP 12 and characteri-sation of the transition state. J Mol Biol 291:429–444.

30. Xie X, Wolynes PG (2001) Microscopic theory of heterogeneity and nonexponentialrelaxations in supercooled liquids. Phys Rev Let 86:5526–5529.

31. Yang WY, Prince RB, Sabelko J, Moore JS, Gruebele M (2000) Transition from expo-nential to nonexponential kinetics during formation of a nonbiological helix. J AmChem Soc 122:3248–3249.

32. Zhu Y, et al. (2003) Ultrafast folding of alpha(3): A de novo designed three-helix bundleprotein. Proc Natl Acad Sci USA 100:15486–15491.

33. Kelly JW (1998) The environmental dependency of protein folding best explains prionand amyloid diseases. Proc Natl Acad Sci USA 95:930–932.

34. Guzman-Casado M, Parody-Morreale A, Robic S, Marqusee S, Sanchez-Ruiz JM(2003) Energetic evidence for formation of a pH-dependent hydrophobic cluster inthe denatured state of Thermus thermophilus ribonuclease H. J Mol Biol 329:731–743.

4132 � www.pnas.org�cgi�doi�10.1073�pnas.0812108106 Sadqi et al.