Structural plasticity of thermophilic serine hydroxymethyltransferases


Structural Plasticity of Thermophilic Serine Hydroxymethyltransferases

Alessandro Paiardini,1 Giulio Gianese,1,3 Francesco Bossa,1,3 and Stefano Pascarella1,2,3*

1 Dipartimento di Scienze Biochimiche “A. Rossi Fanelli” and Centro di Biologia Molecolare del Consiglio Nazionale delle Ricerche, Università “La Sapienza”, Rome, Italy
2 Centro Interdipartimentale di Ricerca per l’Analisi dei Modelli e dell’Informazione nei Sistemi Biomedici (CISB), Università “La Sapienza”, Rome, Italy
3 Centro di Eccellenza di Biologia e Medicina Molecolare (BEMM), Università “La Sapienza”, Rome, Italy

ABSTRACT Serine hydroxymethyltransferase (SHMT) catalyzes the reversible cleavage of serine to form glycine and monocarbonic groups, essential in several biosynthetic pathways. The availability of crystallographic structures of SHMT from mesophilic organisms and information produced by the genomic projects prompted the analysis of the adaptation of SHMT to “extreme” environments, such as high temperatures, by exploitation of structural data from thermophilic organisms. The sequences of 10 thermophilic/hyperthermophilic SHMTs were multiply aligned to 53 mesophilic homologs and analyzed by a comparative approach, examining the amino acid compositions and preferred residue exchanges between mesophiles and extremophiles. The structural basis of the observed exchanges was further investigated through the application of homology modeling to the 10 extremophilic SHMTs. The results of this study indicate that, in SHMT, thermal stability can be achieved mainly through three strategies: (i) increased number of charged residues at the protein surface; (ii) increased hydrophobicity of the protein core; and (iii) substitution of thermolabile residues exposed to the solvent. Additional features of the archaeal SHMTs, for which no structural data are available yet, were also investigated to explain their quaternary assemblage and the interaction with modified folates. Proteins 2003;50:122–134. © 2002 Wiley-Liss, Inc.

Key words: serine hydroxymethyltransferase; thermophilic enzymes; modified folates; Archaea; residue exchanges; structural adaptation

INTRODUCTION

During the past decade, genomic projects provided us with the structural information of an exponentially increasing number of proteins.1 This huge amount of data opened new intriguing possibilities to understand, at a molecular level, the plasticity of life in adaptation to the so-called “extreme” environments, characterized by one or a combination of “extreme” physicochemical conditions, such as high or low temperatures, high ionic strength, high hydrostatic pressure, and so forth. Indeed, systematic comparisons among protein sequences or entire genomes from extremophilic and mesophilic organisms suggested possible adaptive strategies.2 Such comparative analyses confirmed there is no general solution to the achievement of structural and functional stability in extreme environments,3 although a few common strategies can be detected. For example, about 15 different physicochemical factors are thought to be related to thermostability.4 Among these, an increased occurrence of charged residues and salt bridges at the expense of uncharged polar residues, a more hydrophobic core, surface loop deletions, and substitutions of thermolabile residues seem to be the most consistent.5 Likewise, a few common themes were detected in enzymes from psychrophilic organisms, mainly consisting of a decreased occurrence of charged and H-bond forming residues at exposed sites and a lower apolar content of the protein interior.6,7 Similarly, shared structural characteristics were observed in proteins from extreme halophiles: an increased number of highly hydrated amino acids, such as glutamic acid and arginine, enables the halophilic enzyme to successfully compete with salts for water molecules to maintain a hydration shield8; intersubunit salt bridges that appear to be locked in by hydrated ions have been recently invoked as a further stabilizing factor.9 So far, not enough data have been collected from acidophilic, alkalophilic, and piezophilic organisms. However, it has been suggested that pressure and pH, compared to temperature and ionic strength, exert a minor selective pressure on protein function and stability: acidophiles and alkalophiles show neutral intracellular pH, and generally single-chain proteins do not undergo denaturation at pressures below 400 MPa (organisms living in the deep sea experience pressures up to 120 MPa).10

Abbreviations: SHMT, serine hydroxymethyltransferase; aaSHMT, Aquifex aeolicus SHMT; apSHMT, Aeropyrum pernix SHMT; afSHMT, Archaeoglobus fulgidus SHMT; eSHMT, Escherichia coli SHMT; hcSHMT, human cytosolic SHMT; mtSHMT, Methanobacterium thermoautotrophicum SHMT; mmSHMT, Methanothermobacter marburgensis SHMT; mjSHMT, Methanococcus jannaschii SHMT; paSHMT, Pyrococcus abyssi SHMT; phSHMT, Pyrococcus horikoshii SHMT; rcSHMT, rabbit cytosolic SHMT; ssSHMT, Sulfolobus solfataricus SHMT; tmSHMT, Thermotoga maritima SHMT; PLP, pyridoxal-5′-phosphate; PLG, PLP-glycine complex; THF, tetrahydrofolate; phF, Pyrococcus horikoshii variant folate.

Grant sponsor: Italian Ministero dell’Università e della Ricerca (MIUR).

*Correspondence to: Stefano Pascarella, Dipartimento di Scienze Biochimiche, Università La Sapienza, P.le A. Moro 5, 00185 Rome, Italy. E-mail: [email protected]

Received 30 April 2002; Accepted 22 July 2002

Published online 00 Month 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.10268

PROTEINS: Structure, Function, and Genetics 50:122–134 (2003)

© 2002 WILEY-LISS, INC.

The understanding of the molecular mechanism of protein adaptation to extreme environments is of potential interest for many biotechnological and industrial applications, because extremophilic enzymes offer increased rates of reactions, higher substrate solubility, and/or longer enzyme half-lives at the conditions of industrial processes.11 We focused our attention on serine hydroxymethyltransferase (SHMT; EC 2.1.2.1), a pyridoxal-5′-phosphate (PLP)-dependent enzyme that catalyzes the reversible conversion of serine and tetrahydrofolate (THF, H4PteGlu_n) into glycine and 5,10-methylene-THF (5,10-CH2-H4PteGlu_n). Archaeal SHMTs are known to bind “modified folates,”12 whose common characteristics are (i) the presence of a pterin ring, similar to that of THF, but mono- or bis-methylated at carbons 7 and 9; (ii) “tails” composed of several chemical groups different from the glutamic acid present in THF, such as ribitol, ribose, and hydroxyglutarate.13 SHMT is actively studied in our research group14 and represents a good model for studying enzyme adaptations to extreme environments because a large number of orthologous sequences are available today from the three domains of life (mostly derived from genomic projects) and very few structural changes occurred throughout evolution. In particular, several sequences from thermophilic eubacterial and archaebacterial species are available.

The preferred residue exchanges and amino acid composition differences between mesophilic and thermophilic enzymes were measured by analysis of a multiple alignment of 63 SHMT sequences. A new weighting scheme was adopted to reduce the statistical noise due to the random divergence between sequences.15 The observed trends were further investigated through the analysis of three-dimensional models of thermophilic SHMTs, derived by homology from the known structures of SHMT from E. coli (eSHMT)16 and of the cytosolic isoenzymes from man (hcSHMT)17 and rabbit (rcSHMT).18 The same models were used to investigate the structural basis of the interaction of archaeal SHMTs with the modified folates and their quaternary assemblage.

MATERIALS AND METHODS

Data Collection

The three-dimensional structures of SHMT from E. coli (PDB code: 1DFO) and of the cytosolic isoenzymes from man (1BJ4) and rabbit (1CJO) were taken from the Brookhaven Protein Data Bank (PDB).19 The three structures were superposed and a sequence alignment was derived from the structural equivalencies. The initial alignment was manually refined to optimize the position of insertions and deletions (indels). An exhaustive search of SHMT sequences was conducted in the SWISS-PROT, TREMBL,20 and PIR21 databanks, by using PSI-BLAST22 and FASTA23 programs and eSHMT as query. Sequences found were retrieved with SRS (Sequence Retrieval System).24 Redundant sequences were removed, with a final yield of 63 sequences of SHMT, 22 of which belonged to Eukarya, 33 to Bacteria, and 8 to Archaea. To limit the comparisons to functional enzymatic units, only sequences of mature proteins were considered (e.g., signal peptides were removed). Sequences were then multiply aligned by using the program CLUSTALW25 and manually manipulated to optimize the matching of several characteristics, including the observed and predicted secondary structural elements, the hydrophobic regions in the three-dimensional structures, the structurally and functionally conserved residues, and indel regions in the structures. The HOMOLOGY package in INSIGHTII (Accelrys 2000, San Diego, CA) was used for the manipulation of structures and alignments.

The optimum growth temperature assigned to each protein corresponds to the normal living environmental temperature for ectothermic organisms and to body temperature for homeothermic organisms. These data were retrieved from the web sites of the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ; URL: http://www.gbf.de/dsmz/) and of the Department of Earth and Planetary Sciences (Washington University, Saint Louis, MO; URL: http://levee.wustl.edu/~chan).

Sequence numbering used throughout this article refers to the eSHMT sequence (Fig. 1), if not otherwise stated.

Evolutionary Analysis

Evolutionary relationships were analyzed by using the PROTDIST, FITCH, DRAWGRAM, and DRAWTREE routines of the PHYLIP program, available at the Pasteur Institute server (http://bioweb.pasteur.fr).26

Preferred Amino Acid Substitutions

Favored amino acid substitutions between thermophilic and mesophilic sequences were calculated from the multiple alignments by using a modified version of the method by Argos et al.27 The noise resulting from the phylogenetic distance and the diversity of lifestyles among the compared organisms may bias the statistics of the residue mutations. We therefore propose a novel weighting scheme that, besides the $\Delta T$ used by Argos et al.,27 incorporates corrections for sequence evolutionary distance and residue frequency. Given two aligned sequences (A, B), we define $a_{ij}$ as the number of times a residue of type i in A changed to type j in B. This number can be normalized by the number of residues of type i ($n_i$) in A, to obtain the proportion of residues of type i of A substituted with j in its counterpart B ($p^{A}_{ij} = a_{ij}/n_i$). Likewise, $p^{B}_{ij}$ is defined. If $p^{A}_{ij} > p^{B}_{ij}$, the positive difference $p^{A}_{ij} - p^{B}_{ij}$ measures the proportion of residues of type i lost by sequence A to the advantage of residues of type j of B, after each comparison. Vice versa, if $p^{B}_{ij} > p^{A}_{ij}$, the negative difference $p^{A}_{ij} - p^{B}_{ij}$ measures the proportion of residues of type i gained by sequence B at the expense of residues of type j of A. A substitution matrix can be calculated by comparing each protein sequence in the multiple alignment with its thermophilic counterpart. If the possible pairwise sequence comparisons are n, the $c_{ij}$ elements of an average exchange matrix weighted by temperature and evolutionary distance can be calculated according to:

$$c_{ij} = \frac{\sum_{n} \left( \frac{\Delta T_n}{E_n^{2}} \right) \left( p^{A}_{ij} - p^{B}_{ij} \right)_{n}}{\sum_{n} \left( \frac{\Delta T_n}{E_n^{2}} \right)} \qquad (1)$$

where $\Delta T_n$ is the absolute difference ($T_A - T_B$) between the optimum growth temperatures of the two species, in degrees Celsius. Because A corresponds to the sequence of the extremophilic organism and B to the sequence of the mesophilic one, in Eq. 1 the weighting coefficient ($\Delta T_n / E_n^{2}$) is always positive, the sign of $c_{ij}$ being due exclusively to the difference $p^{A}_{ij} - p^{B}_{ij}$. E is the distance between the two sequences assigned by the FITCH program. According to Eq. 1, the longer the distance between two sequences, the lower their weight in the calculation of $c_{ij}$. The overall exchange matrix for k thermophilic sequences was calculated according to:

$$C_{ij} = \frac{\sum_{k} \left( c_{ij} \right)_{k}}{k} \qquad (2)$$

Fig. 1. Multiple-sequence alignment of SHMT models and templates. Amino acid one-letter code is used. Dashes represent insertions and deletions; numbers above and below the sequences represent sequence numbering of eSHMT and absolute position of the alignment, respectively. Invariant positions are boxed in black; alignment columns displaying an amino acid identity in more than half of the sequences are boxed, and the conserved residue is bolded. The sequences are labeled with their SWISS-PROT code. Secondary structures of eSHMT (GLYA_ECOLI) and hcSHMT (GLYC_HUMAN) are reported in the first two lines of each block: α-helices and β-strands are rendered as squiggles and arrows, respectively. Blanks denote irregular conformations. ESPript50 was used to render this figure.
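The weighting scheme of Eqs. 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and toy sequences are hypothetical, alignment columns containing a gap are skipped, and $n_i$ is counted over the remaining aligned columns.

```python
from collections import Counter, defaultdict


def exchange_proportions(seq_a, seq_b):
    """Return {(i, j): a_ij / n_i} for residues i of seq_a aligned to j in seq_b."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    n_i = Counter(a for a, _ in pairs)
    a_ij = Counter((a, b) for a, b in pairs if a != b)
    return {(i, j): count / n_i[i] for (i, j), count in a_ij.items()}


def weighted_exchange(comparisons):
    """Eq. 1. comparisons: list of (thermophile, mesophile, delta_t, distance)."""
    numerator = defaultdict(float)
    denominator = 0.0
    for seq_a, seq_b, delta_t, distance in comparisons:
        weight = delta_t / distance ** 2  # distance must be non-zero
        p_a = exchange_proportions(seq_a, seq_b)
        p_b = exchange_proportions(seq_b, seq_a)
        for ij in set(p_a) | set(p_b):
            numerator[ij] += weight * (p_a.get(ij, 0.0) - p_b.get(ij, 0.0))
        denominator += weight
    return {ij: value / denominator for ij, value in numerator.items()}


def overall_exchange(per_thermophile):
    """Eq. 2: average the c_ij dictionaries obtained for k thermophilic sequences."""
    k = len(per_thermophile)
    keys = set().union(*per_thermophile)
    return {ij: sum(c.get(ij, 0.0) for c in per_thermophile) / k for ij in keys}


# Toy data (hypothetical): one thermophile compared with two mesophiles.
comparisons = [("EEK", "DEK", 60.0, 0.5), ("EEK", "EEK", 40.0, 1.0)]
c = weighted_exchange(comparisons)
C = overall_exchange([c, {("E", "D"): 0.0}])
```

In the toy data the closer, more temperature-divergent pair carries a weight of 240 against 40 for the other pair, so it dominates the average, which is the behavior the text describes for Eq. 1.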

The mean and SD for the non-zero elements of the overall exchange matrix were determined; the significance $R_{ij}$ of the exchange ij was then calculated by dividing the difference between $C_{ij}$ and the overall matrix mean $\bar{C}$ by the standard deviation $\sigma$:

$$R_{ij} = \frac{C_{ij} - \bar{C}}{\sigma} \qquad (3)$$
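The standardization in Eq. 3 could be computed as follows. This is a hedged sketch: the dictionary representation of the matrix and the function name are illustrative, and the use of the population SD is an assumption, since the text does not say whether the population or the sample SD was used.

```python
import statistics


def significance(overall_matrix):
    """Eq. 3: R_ij = (C_ij - mean) / SD, computed over the non-zero elements."""
    nonzero = {ij: v for ij, v in overall_matrix.items() if v != 0.0}
    mean = statistics.mean(nonzero.values())
    # Assumption: population SD; statistics.stdev would give the sample SD.
    sd = statistics.pstdev(nonzero.values())
    return {ij: (v - mean) / sd for ij, v in nonzero.items()}


# Hypothetical overall exchange matrix with one zero element.
toy_matrix = {("D", "E"): 3.0, ("E", "D"): -1.0, ("K", "R"): 1.0, ("A", "G"): 0.0}
R = significance(toy_matrix)
```

Zero elements are left out of both the statistics and the result, mirroring the restriction to non-zero elements stated in the text.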

Amino Acid Composition

Differences in amino acid composition were measured by:

$$P = \frac{(n_i)_{ext} / (n_i)_{tot}}{(N)_{ext} / (N)_{tot}} \qquad (4)$$

where $n_i$ and $N$ represent, respectively, the number of residues of type i and the total number of residues present in the thermophiles (ext) and in the entire dataset (tot). P values > 1 or < 1, observed in the extremophilic sequences, indicate, respectively, a frequency of residue type i higher or lower than expected. P = 1 means neutrality. Equation 4 was also applied in the calculation of the differences in amino acid composition between each thermophilic sequence and the entire data set.
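As an illustration of Eq. 4, the ratio can be computed directly from sequence strings. The sketch below assumes that gap characters are excluded from the residue totals and uses made-up four-residue sequences; neither choice comes from the article.

```python
def composition_ratio(residue, extremophile_seqs, all_seqs):
    """Eq. 4: (n_i,ext / n_i,tot) / (N_ext / N_tot) for one residue type."""
    n_ext = sum(s.count(residue) for s in extremophile_seqs)
    n_tot = sum(s.count(residue) for s in all_seqs)
    # Assumption: gap characters do not count as residues.
    big_n_ext = sum(len(s.replace("-", "")) for s in extremophile_seqs)
    big_n_tot = sum(len(s.replace("-", "")) for s in all_seqs)
    if n_tot == 0:
        raise ValueError("residue type absent from the data set")
    return (n_ext / n_tot) / (big_n_ext / big_n_tot)


# Hypothetical data set: one extremophilic and one mesophilic sequence.
extremophiles = ["EEKK"]
dataset = ["EEKK", "DDKK"]
p_glu = composition_ratio("E", extremophiles, dataset)
p_lys = composition_ratio("K", extremophiles, dataset)
p_asp = composition_ratio("D", extremophiles, dataset)
```

Here glutamate occurs only in the extremophile (P above 1), lysine is evenly distributed (P equal to 1), and aspartate is absent from the extremophile (P equal to 0).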

Model Building

The three-dimensional crystal structures of hcSHMT, rcSHMT, and eSHMT were used as templates for the homology-derived construction of the dimeric models of 10 SHMTs from extremophilic organisms. Homology modeling was based on the multiple-sequence alignment obtained as described in Data Collection. Protein monomeric models were calculated with the MODELLER-4 package.28 Ten different models were built for each target protein by using the highest built-in refinement procedure, and the one displaying the lowest objective function,29 which measures the extent of violation of constraints from the templates, was taken as the representative model. Calculation of slightly different models of the same structure can be used to indicate the most variable and, therefore, less reliable regions in the fold. The PLP molecule and a structurally conserved water molecule, found in the active site of the enzyme,16 were taken from the eSHMT structure and included in the model as “block” residues (i.e., as rigid bodies). Model dimers were then built by superposing each monomer onto the dimeric template eSHMT structure. Steric clashes at the subunit interface were manually removed. Each dimeric model was optimized by energy minimization: only residues at the interface and active site were allowed to move, under a tethering force constant of 418 kJ · Å⁻¹ and a torsional force applied to angles with a force constant of 209 kJ · Å⁻¹. A distance-dependent dielectric constant, no Morse potential, no cross-terms, and charges “on” were used for 100 steepest-descent and 500 conjugate-gradient energy minimization steps. The interface was once more manually inspected, refined, and then subjected to a further 100 and 500 steps of steepest-descent and conjugate-gradient minimization, respectively, run in the same conditions. PROCHECK30 was used to monitor the stereochemical quality of the final models, whereas PROSAII31 was used to measure the overall protein quality in packing and solvent exposure.

Secondary structures were determined by using the program DSSP,32 and solvent accessibility was calculated from atom coordinates with NACCESS.33 Structural regions displaying no more than 0.05 and no less than 0.25 fractional accessibility were considered buried and exposed, respectively. Residues losing at least 15% of their accessible surface area on subunit association and lying no farther than 6 Å from the other subunit34 were considered to be at the interface. The charge distribution on the surface of archaeal SHMTs was calculated and displayed with GRASP.35
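The burial and interface criteria above translate into two small predicates. The following sketch is illustrative only: the function names, the "intermediate" label for residues between the two cutoffs, and the input values are assumptions, and the accessibilities are taken as already computed (for instance by NACCESS).

```python
def burial_class(fractional_accessibility):
    """Classify a residue by its fractional solvent accessibility (0 to 1)."""
    if fractional_accessibility <= 0.05:
        return "buried"
    if fractional_accessibility >= 0.25:
        return "exposed"
    # Assumption: the article does not name the residues in between.
    return "intermediate"


def at_interface(asa_monomer, asa_dimer, distance_to_partner):
    """True if >= 15% of the accessible area is lost and the partner is within 6 A."""
    if asa_monomer <= 0.0:
        return False  # a fully buried residue cannot lose accessible area
    fraction_lost = (asa_monomer - asa_dimer) / asa_monomer
    return fraction_lost >= 0.15 and distance_to_partner <= 6.0


# Hypothetical residues: areas in square angstroms, distances in angstroms.
examples = [
    burial_class(0.02),
    burial_class(0.10),
    burial_class(0.40),
    at_interface(100.0, 80.0, 4.0),
    at_interface(100.0, 90.0, 4.0),
    at_interface(100.0, 80.0, 7.0),
]
```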

Amino Acid Exchanges in Different Structural Environments

Propensities $P_{ij}$ for a residue exchange from type i to type j in different structural contexts were calculated according to:

$$P_{ij} = \frac{(C_{ij})_{env} / (C_{ij})_{tot}}{(N_a)_{env} / (N_a)_{tot}} \qquad (5)$$

The terms $(C_{ij})_{env}$ and $(C_{ij})_{tot}$ represent elements of the overall exchange matrices, calculated for residues observed in different structural environments (env) and for the whole sequences (tot). $(N_a)_{env}$ and $(N_a)_{tot}$ are the number of residues counted in the structural environments and in the entire dataset, respectively.
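Equation 5 normalizes an environment-specific exchange value by the share of residues found in that environment. A minimal sketch with made-up numbers follows; the function name and the input values are hypothetical.

```python
def exchange_propensity(c_env, c_tot, n_env, n_tot):
    """Eq. 5: (C_ij,env / C_ij,tot) / (N_a,env / N_a,tot)."""
    if c_tot == 0 or n_env == 0 or n_tot == 0:
        raise ValueError("c_tot, n_env and n_tot must be non-zero")
    return (c_env / c_tot) / (n_env / n_tot)


# Hypothetical exchange whose matrix element is twice as large in the protein
# core as overall, although the core holds only a quarter of the residues.
core_propensity = exchange_propensity(c_env=0.5, c_tot=0.25, n_env=100, n_tot=400)
```

A value well above 1, as in this toy case, would mark an exchange that is concentrated in the chosen structural environment.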

Modeling of the Interaction Between the Modified Folate and Pyrococcus horikoshii SHMT (phSHMT)

The pterin of Pyrococcus horikoshii13 was built by using the BUILDER module from the INSIGHTII suite (Accelrys, 2000, San Diego, CA), starting from the crystal structure of THF complexed with eSHMT (which is in the form of a PLP-glycine-5-formyl-H4PteGlu ternary complex),16 and manually positioned inside the active site of phSHMT. Only the ribitol and ribose moieties of the pterin tail were included in the molecule. Functional groups shared by THF and the archaeal pterin were oriented in a similar way to interact with the same functional residues. After manual adjustment, to optimize favorable interactions and relieve close contacts, the potential energy of the modeled complex was minimized with the DISCOVER 2.9 module (Accelrys, 2000, San Diego, CA) by using the Cff91 force field. Five hundred steps of steepest-descent minimization were carried out, followed by conjugate-gradient minimization, until the maximum Cartesian derivative of the energy was < 0.00418 kJ · Å⁻¹. The PLP-substrate complex was subjected to a tethering force of 418 kJ · Å⁻¹ during the minimization steps. The energy-minimized structure was then used to explore the potential energy surface within the active site pocket. To this purpose, the torsion angle between the phenyl group of the modified folate and the first carbon atom of the tail was rotated by 360° in 10 steps of 36°; each different conformation of the complex was subjected to energy minimization and molecular dynamics. During the rotation, all phSHMT atoms were fixed, except those interacting with the tail within a cutoff distance of 6 Å. Dynamic simulation was performed for 10,000 steps at 300 K, after a 100-step equilibration at the same temperature. The total energy of the system was monitored for the entire simulation. The Cff91 force field, a distance-dependent dielectric constant, a cutoff distance of 40 Å, and a 1-fs timestep were used during the simulation.

All the programs were run under IRIX 6.5 on Silicon Graphics O2 workstations.

RESULTS

The data collection used in this work included 63 SHMT sequences from several sources (Table I and Fig. 1). Ten sequences, distributed among the Bacteria and Archaea domains of life, are from thermophilic and hyperthermophilic organisms (able to grow above 60°C and 80°C, respectively).11 Equation 1 was used to derive the exchange matrix shown in Figure 2. Only the residue exchanges scoring at a significance |R| ≥ 3.0 SDs from the mean value (see Materials and Methods), corresponding to a p value of about 0.001, were considered statistically significant, whereas exchanges with 2 ≤ |R| < 3 were discussed only if supported by propensity-based amino acid composition analysis. Values |R| < 2.0 were rejected as not significant. Eleven exchanges between thermophilic and mesophilic sequences scored at a significance |R| ≥ 3.0 (Table II). Positive R values denote exchanges in the direction mesophile → extremophile and vice versa for negative values.36 Average and individual variation of amino acid composition are displayed in Table III.
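The link between the 3.0 SD cutoff and a p value of about 0.001 can be checked numerically, and the three significance bands can be written as a small classifier. Both parts of this sketch rest on assumptions that are not stated in the article: the p value is read as the one-sided tail of a normal distribution, and the band boundaries are treated as inclusive at 2.0 and 3.0.

```python
import math


def upper_tail_p(z):
    """One-sided tail probability of a standard normal variable beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def classify_exchange(r):
    """Assign an exchange to one of the three bands used in the text."""
    magnitude = abs(r)
    if magnitude >= 3.0:
        return "significant"
    if magnitude >= 2.0:
        return "discussed only with compositional support"
    return "rejected"


p_at_cutoff = upper_tail_p(3.0)  # roughly 0.00135 under the normal assumption
bands = [classify_exchange(r) for r in (6.3, -3.2, 2.5, 1.0)]
```

The sign of R is ignored by the classifier because it only encodes the direction of the exchange, not its significance.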

The structural meaning of the most significant exchanges observed was investigated through the systematic application of homology modeling to the 10 extremophilic SHMTs, using the structures of SHMT from E. coli, man, and rabbit as templates (Fig. 1). The level of sequence identity between templates and modeled sequences, ranging from 32% (apSHMT aligned with eSHMT) to 61% (aaSHMT aligned with eSHMT), guaranteed a sufficiently accurate structural homology.28 The reliability of the structural models was assessed by PROCHECK and PROSAII. All the model parameters were within the accepted ranges and were comparable with those calculated for the structural templates (Table IV). The superposition of the 10 models built for each of the 10 target sequences suggested that the least accurate regions are invariably loops located far from the active site and from the subunit interface.

The examination of the 10 structural models, with particular attention to the eight archaeal SHMTs, highlighted structural features possibly related to the adaptation to extreme environments, which could not be inferred solely by sequence analysis. The interaction of a variant folate (phF) with an archaeal SHMT was analyzed in the sample case of the model of Pyrococcus horikoshii, because it displays the highest optimal temperature and shares 37% sequence identity with the closest template. The initial position of the phF was set according to the crystal structure of THF complexed with eSHMT (which is in the form of a PLP-glycine-5-formyl-H4PteGlu ternary complex).16 Local bumps were manually removed, in particular the collision between the methyl group bound at C7 of the pterin ring and Asn 345 (sequence numbering refers to eSHMT, if not otherwise stated); alternative conformations of the complex corresponding to different potential energy minima were tested. The orientation of the “tail”, corresponding to the local potential energy minimum of the system, is shown in Figure 3.

Thermozymes need to maintain the stability of their quaternary structure especially if, as in the PLP enzymes, the active site is formed by residues contributed by adjacent subunits. Several cases are known in which more extensive contacts between adjacent subunits, also achieved through protein oligomerization, are essential for the achievement of thermostability.37,38 The calculation of the preferred amino acid exchanges and of the fraction of total and hydrophobic surface at the subunit interface between monomers (data not shown) does not indicate any clear difference except for a slight increase of the hydrophobic surface in SHMTs from hyperthermophilic organisms. Indeed, the exchange Trp→Gly in the direction thermophile → mesophile is the only favored replacement at the monomer interface. This analysis also showed no significant correlation between amino acid replacements/composition and domain structure (data not shown). A comparison between the crystal structures and the archaeal models of SHMTs revealed the additional presence in the latter of a conserved cluster of six to eight His residues (three or four for each monomer; see absolute positions 296, 297, 299, and 300 in the alignment in Fig. 1), facing each other and surrounding the C2 symmetry axis at the center of the dimer (Fig. 4). These residues can establish van der Waals interactions and hydrogen bonds, which contribute to the thermostability of the archaeal oligomer and may also suggest the presence of an ion-coordination site.

Comparison of the active sites of the archaeal SHMTs with their bacterial and eukaryotic counterparts showed no significant structural differences, apart from the substitution of the highly conserved Gly 262 with a Ser, whose function seems to be the replacement of a structural water molecule present in the crystal structures of SHMTs and likely involved in catalysis,16 with the hydroxyl group of the Ser side-chain.

126 A. PAIARDINI ET AL.

TABLE I. SHMT Data Set

Organism Domain of life* Data bank code Temperature of growth (°C) Sequence length

Candida albicans cytosolic E GLYC_CANAL 25 468Saccaromyces cerevisiae E GLYC_YEAST 27 468Neurospora crassa E GLYC_NEUCR 24 477Schizosaccaromyces pombe 1 E GLY2_SCHPO 30 468Schizosaccaromyces pombe 2 E GLY1_SCHPO 30 470Candida albicans mitochondrial E GLYM_CANAL 25 470Saccaromyces cerevisiae mitochondrial E GLYM_YEAST 27 473Flaveria pringlei 1 E GLYM_FLAPR 25 477Flaveria pringlei 2 E GLYN_FLAPR 25 477Solanum tuberosum E GLYM_SOLTU 25 478Arabidopsis thaliana E GLYA_ARATH 25 478Pisum sativum E GLYM_PEA 25 478Homo sapiens cytosolic E GLYC_HUMAN 37 473Orictolagus cuniculus cytosolic E GLYM_RABIT 37 474Ovis aries cytosolic E GLYC_SHEEP 37 474Mus musculus cytosolic E GLYC_MOUSE 37 477Homo sapiens mitochondrial E GLYM_HUMAN 37 471Orictolagus cuniculus mitochondrial E GLYM_RABIT 37 471Caenorabditis elegans E GLYC_CAEEL 37 469Drosophila melanogaster E Q9W457 37 472Leishmania major E Q9NEF2 30 460Encephalitozoon lepreae E GLYC_ENCCU 30 460Mycoplasma genitalium B GLYA_MYCLE 37 426Mycoplasma pneumoniae B GLYA_MYCPN 37 426Borrelia burgdorferi B GLYA_BORBU 35 418Deinococcus radiodurans B F75567 30 415Bacillus subtilis B GLYA_BACSU 30 416Bacillus halodurans B E84120 37 414Synechocystis sp. B GLYA_SYNY3 37 428Acinetobacter radioresistens B GLYA_ACIRA 37 418Pseudomonas aeruginosa B C83341 37 419Actinobacillus actinomytemcomitans B GLYA_ACTAC 37 421Haemophilus influenzae B GLYA_HAEIN 37 422Xylella fastidiosa B E82743 37 417Neisseria meningitis B GLYA_NEIME 26 417Neisseria meningitis sier. B B GLYA_NEIMBH 37 417Vibrio cholerae B H82258 37 416Neisseria gonorrhoeae B GLYA_NEIGO 37 417Buchnera aphidicola B GLYA_BUCAI 37 418Bradjrhizobium japonicum B GLYA_BRAJA 30 430Methylobacterium extorquens B GLYA_METEX 30 430Salmonella typhimurium B GLYA_SALTY 30 418Hyphomicrobium sp. 
B GLYA_HYPME 37 430Ricketia prowazekii B GLYA_RICPR 37 420Campylobacter jejunii B GLYA_CAMJE 37 415Helicobacter pilori B GLYA_HELPJ 37 417Mycobacterium tuberculosis B GLA1_MYCTU 37 427Mycobacterium lepreae B GLYA_MYCLE 35 427Streptomyces coelicolor B GLYA_STRCO 30 421Chlamydia pneumoniae B GLYA_CHLPN 37 481Chlamydia trachomatis B GLYA_CHLTR 37 481Treponema pallidum B GLYA_TREPA 37 491Escherichia coli B GLYA_ECOLI 37 418Aquifex aeolicus B GLYA_AQUAE 95 429Thermotoga maritima B GLYA_THEMA 80 428Aeropyrum pernix A GLYA_AERPE 90 441Sulfolobus solfataricus A GLYA_SULSO 90 417Pyrococcus horikoshii A GLYA_PYRHO 100 428Pyrococcus abyssi A GLYA_PYRAB 100 421Methanobacterium thermoautotrophicum A GLYA_METTH 65 429Methanothermobacter marburgensis A GLYA_METTM 65 425Methanococcus jannaschii A GLYA_METJA 85 430Archaeoglobus fulgidus A GLYA_ARCFU 80 438

*E = Eukarya; B = Bacteria; A = Archaea; symbols referring to thermophiles are in bold.


DISCUSSION

The adaptation of enzymes to extreme environments has been analyzed during the past decades by two different methodological approaches: (i) experimentally, by application of random or knowledge-based site-directed mutagenesis or in vitro evolution39–41; (ii) theoretically, by comparative analysis among homologous three-dimensional structures or sequences from mesophiles and extremophiles. However, the latter approach has a few potential drawbacks15: (i) quite significant phylogenetic biases may exist within these data, which have little or nothing to do with adaptation to function at any given “extreme” physicochemical condition, but reflect instead the phylogenetic history and a wide range of different evolutionary pressures on protein structure and function experienced by the species; (ii) the sample sets used in these studies are often too small. The great number of protein sequences retrieved from the databanks and a careful intrafamily statistical and structural comparison associated with a new weighting scheme permitted the above-mentioned limitations to be overcome.42 Despite the variability of behaviors adopted by the individual members of the SHMT family, some common trends have been highlighted, which indicated a few significantly different features possibly related to adaptation to the extreme environment considered.

Fig. 2. A: Significance R of the C matrix exchanges and (B) counts of residue substitutions observed in thermophilic SHMTs. Amino acid residues are indicated with the one-letter code. Positive R values denote exchanges in the direction mesophile→thermophile and vice versa for negative values.

Thermostability of SHMT

Salt bridges

The most significant exchange observed in our sample is Asp→Glu (R = 6.3) in the direction mesophile→thermophile, which is reflected in the opposite exchange Glu→Asp (R = −3.2; the negative sign of R indicates the direction thermophile→mesophile). These substitutions occur mainly at exposed sites, within α-helices or coil regions (Table II). Other exchanges involving Glu and scoring at a significance between 2 and 3 were observed (Gln→Glu R = 2.6, Ala→Glu R = 2.2; Glu→Ala R = −2.3). Indeed, an increase of Glu can be appreciated in all the thermophilic proteins, irrespective of their evolutionary origin (bacterial or archaeal) and their optimum growth temperature (Table III). On the contrary, Asp is involved in only one other exchange, Asp→Asn (R = −2.3), which suggests a significant proportion of Asp gained by thermophiles at the expense of Asn. This is confirmed by the compositional analysis (Table III), which shows Glu to be significantly more represented in the thermophiles than in the mesophiles (P = 1.37), whereas Asp is evenly distributed among the

TABLE II. Most Significant Amino Acid Substitutions in Thermophilic SHMTs

              Significance(a)   Propensity(c)           Favoured      Propensity(d)      Favoured
Thermophiles  R(b)              Helix   Sheet   Coil    regions       Buried   Exposed   accessibility state
Asp→Glu        6.3              1.63    0.01    0.66    Helix         0.02     2.44      Exposed
Phe→Met        4.7              1.14    1.14    0.82    Helix/sheet   2.22     0.00      Buried
Gln→Lys        3.1              1.81    0.46    0.34    Helix         0.03     2.71      Exposed
Tyr→Phe        3.0              0.69    0.33    1.52    Coil          0.47     0.18      Buried
Met→Phe       −4.4              0.79    3.71    0.39    Sheet         2.21     0.17      Buried
Leu→Ile       −3.8              1.20    2.02    0.48    Sheet         1.79     0.11      Buried
Trp→Tyr       −3.8              1.34    0.16    0.92    Helix         0.00     1.42      Exposed
Glu→Asp       −3.2              1.23    0.00    1.07    Helix         0.10     2.22      Exposed
Trp→Gly       −3.1              0.00    0.00    2.32    Coil          0.00     0.00      (e)
Phe→Tyr       −3.1              0.73    0.16    1.53    Coil          1.07     0.92      Buried
Gln→Pro       −3.1              0.32    0.41    1.87    Coil          0.00     0.99      Exposed

(a) A positive sign indicates an amino acid replacement in the direction mesophile→thermophile; a negative sign indicates the opposite direction.
(b) The significance of residue substitutions is reported as the ratio between the value and the SD of the non-zero elements of the matrix C. Only substitutions with a significance ≥3.0 are reported.
(c) Propensity is defined as the ratio between the fraction of an amino acid exchange at each type of secondary structure and the fraction of secondary structural content in all the three-dimensional structures.
(d) Propensity is defined as the ratio between the fraction of an amino acid exchange at a buried or exposed position and the fraction of residues found at the respective position in all the three-dimensional structures.
(e) This exchange is observed preferably (p = 4.36) in structural regions displaying a fractional accessibility between 0.05 and 0.25 (see Materials and Methods).
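The two quantities defined in the footnotes of Table II (the significance R and the structural propensity) can be illustrated with a short Python sketch. It is not the program used in this work: the function names, the choice of the population SD, and the toy counts are assumptions introduced here only to make the definitions concrete.

```python
import statistics


def significance(matrix):
    """Divide each element of an exchange matrix C by the SD of its non-zero elements."""
    nonzero = [v for row in matrix for v in row if v != 0]
    if not nonzero:
        raise ValueError("matrix has no non-zero elements")
    # Assumption: population SD; the text does not say which estimator was used.
    sd = statistics.pstdev(nonzero)
    if sd == 0:
        raise ValueError("SD of the non-zero elements is zero")
    return [[v / sd for v in row] for row in matrix]


def propensity(exchange_counts, background_counts):
    """Fraction of an exchange in each structural class over the class's background fraction.

    Both arguments map a class name (e.g. 'helix') to a count.
    Every background count is assumed to be positive.
    """
    total_exchange = sum(exchange_counts.values())
    total_background = sum(background_counts.values())
    if total_exchange == 0:
        raise ValueError("the exchange was never observed")
    result = {}
    for cls, background in background_counts.items():
        fraction_exchange = exchange_counts.get(cls, 0) / total_exchange
        fraction_background = background / total_background
        result[cls] = fraction_exchange / fraction_background
    return result


# Toy numbers (hypothetical, not taken from the SHMT data set).
toy_r = significance([[0, 2], [-2, 0]])
toy_p = propensity({"helix": 8, "sheet": 0, "coil": 2},
                   {"helix": 40, "sheet": 20, "coil": 40})
```

Read this way, a helix propensity of 1.63 for Asp→Glu means that the exchange falls in helices 1.63 times more often than expected from the helical content of the structures.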

TABLE III. Residue Composition (%), Overall and Individual Propensities (P) in Thermophilic SHMTs

Code(a)    GLYA_AERPR  GLYA_AQUI   GLYA_ARCFU  GLYA_METJA  GLYA_METTH  GLYA_METTM  GLYA_PYRAB  GLYA_PYRHO  GLYA_SULSO  GLYA_THEMA  Overall*
Residue       %    P      %    P      %    P      %    P      %    P      %    P      %    P      %    P      %    P      %    P      P
Trp         1.1 2.25    0.5 0.95    0.2 0.47    0.7 1.41    0.5 0.95    0.7 1.43    1.4 2.82    1.4 2.77    0.7 1.46    0.5 0.95   1.45
Ile         6.3 1.16    5.1 0.95    6.6 1.21    7.0 1.28    7.2 1.32    7.3 1.34    5.7 1.05    5.6 1.03    6.2 1.15    6.8 1.24   1.15
Phe         2.9 0.81    3.7 1.02    4.3 1.18    5.1 1.39    4.2 1.14    3.8 1.03    5.0 1.35    5.2 1.39    4.3 1.11    3.5 0.96   1.12
Leu         9.8 1.10    7.9 0.90    7.8 0.88    7.9 0.90    8.2 0.92    7.8 0.88    7.9 0.89    7.5 0.85    7.9 0.90    6.8 0.77   0.91
Met         3.2 1.39    3.3 1.43    3.0 1.30    2.8 1.23    4.2 1.82    3.5 1.54    3.1 1.35    3.3 1.43    3.4 1.47    3.3 1.43   1.36
Val         8.8 1.21    8.4 1.15    9.1 1.25    5.8 0.80    6.5 0.90    6.6 0.91    8.1 1.11    8.4 1.15    8.4 1.15    8.4 1.15   1.07
Cys         0.2 0.21    0.5 0.43    0.7 0.63    1.2 1.07    0.9 0.86    1.2 1.08    0.2 0.22    0.2 0.23    0.2 0.22    0.9 0.86   0.62
Tyr         3.8 0.78    4.4 1.16    3.2 0.84    3.3 0.86    3.3 0.86    2.6 0.68    4.0 1.06    4.2 1.10    3.6 0.95    3.5 0.92   0.93
Pro         4.5 1.01    4.4 0.99    3.0 0.67    4.2 0.94    3.3 0.73    3.8 0.84    5.0 1.11    5.2 1.14    4.3 0.96    4.4 0.99   0.95
Ala         8.8 0.84    8.8 0.93    8.9 0.85    9.6 0.91    9.3 0.89    8.3 0.79    8.8 0.84    8.4 0.80    7.9 0.76    9.1 0.87   0.87
Thr         4.5 0.90    5.6 1.11    3.2 0.64    3.0 0.60    3.3 0.65    4.2 0.84    3.1 0.62    3.3 0.65    5.3 1.04    5.4 1.06   0.83
His         3.2 1.08    2.8 0.96    3.0 1.01    4.0 1.34    4.0 1.34    3.5 1.20    4.5 1.53    4.4 1.50    2.9 0.98    2.8 0.96   1.16
Gly         7.9 0.94    8.2 0.97    7.3 0.87    6.8 0.81    7.2 0.86    7.5 0.90    8.6 1.02    8.7 1.03    7.0 0.83    8.2 0.98   0.93
Ser         5.4 0.97    3.5 0.63    6.6 1.18    5.4 0.96    6.1 1.08    7.3 1.30    4.0 0.73    4.0 0.72    5.0 0.90    3.7 0.67   0.93
Gln         2.3 0.64    2.1 0.59    2.1 0.58    2.6 0.72    2.3 0.66    1.9 0.53    2.9 0.80    2.8 0.79    2.9 0.81    2.1 0.59   0.70
Asn         3.6 0.73    2.9 1.03    3.7 0.90    3.5 0.86    4.2 1.03    4.5 1.10    2.9 0.71    2.6 0.64    4.6 1.12    3.5 0.86   0.91
Glu         7.5 1.18    8.4 1.33     10 1.58   10.3 1.61   10.5 1.65    9.7 1.52    8.6 1.35    8.7 1.37    8.7 1.37   10.1 1.58   1.37
Asp         5.4 1.05    4.7 0.90    4.8 0.92    5.6 1.07    5.6 1.08    5.7 1.09    5.2 1.01    5.2 0.99    4.8 0.93    3.7 0.72   0.98
Lys         7.0 1.12    7.0 1.30    5.9 0.95    8.4 1.33    3.7 0.60    4.7 0.75    7.6 1.21    7.7 1.23    8.7 1.37    8.0 1.26   1.09
Arg         5.0 1.08    4.2 0.91    6.6 1.42    3.0 0.66    5.4 1.16    5.4 1.17    3.3 0.73    3.3 0.71    3.4 0.73    5.2 1.11   0.98

*Propensity calculated over all the SHMT thermophilic sequences.
(a) Code refers to Table I.


two (P = 0.98). An increased proportion of Glu in thermophiles was already noted by several authors27,43,44 and explained by the presence of additional salt bridges. On average, only one residue exchange involves a basic residue: Gln→Lys (R = 3.1). However, an increased population of positively charged residues can be observed from the compositional analysis: in particular, Lys increases in bacterial (P = 1.30 for GLYA_AQUAE and P = 1.26 for GLYA_THEMA) and hyperthermophilic SHMTs, whereas Arg is more frequent in thermophilic archaeal SHMTs (Table III). Scrutiny of the exchange matrices calculated for the individual thermo- and hyperthermophilic sequences reveals several replacements involving either Lys or Arg (for example, Gln→Lys R = 3.3 for GLYA_AQUAE; Lys→Arg R = 2.8 for GLYA_ARCFU; Arg→Lys R = 2.6 and Trp→Arg R = 2.8 for GLYA_METJA; Arg→Lys R = −2.5 for GLYA_METTH; Asp→Lys R = 2.6 and Arg→Asp R = −2.7 for GLYA_PYRHO; Asp→Lys R = 2.8 for GLYA_SULSO; and Lys→Ala R = −2.6 for GLYA_THEMA).

TABLE IV. Percentage of Sequence Identity With Structural Templates and Overall Quality of the Models

         % Sequence identity with
         structural templates        Ramachandran   PROCHECK overall   PROSAII overall
Model    eSHMT   hcSHMT   rcSHMT     plot*          quality(a)         quality(b)
apSHMT   32      28       27         98.9           −0.24              0.83
aaSHMT   61      48       48         99.2           −0.23              1.06
afSHMT   35      34       33         99.5           −0.20              0.89
mtSHMT   37      33       33         99.7           −0.27              0.89
mmSHMT   37      32       32         99.4           −0.31              0.80
mjSHMT   37      34       33         99.5           −0.20              0.79
phSHMT   37      31       30         99.2           −0.28              0.90
paSHMT   37      31       30         99.3           −0.26              0.90
ssSHMT   32      29       29         99.7           −0.30              0.86
tmSHMT   58      47       47         99.5           −0.15              0.96

*% Residues in most favored, additional allowed, and generously allowed regions of the Ramachandran plot.
(a) Overall score as listed in the “PROCHECK summary” output. Recommended value for good structures is > −0.50.
(b) Normalized z score. Recommended value for good structures is > 0.70; values obtained for eSHMT, rcSHMT, and hcSHMT are 1.03, 0.93, and 0.89, respectively.

Fig. 3. Model structure of the bound phF at the active site of phSHMT. phSHMT is represented as blue ribbons; phF is shown as yellow sticks and meshes and superposed to formyl-H4PteGlu, for reference. PLP is represented as pink sticks. Oxygen atoms are colored red, nitrogen atoms blue, and phosphorus purple. The hydrophobic cavity facing the two methyl groups at C7 and C9 of the modified folate is shown as purple CPKs. This figure was rendered by using pyMOL.51


Examination of additional charged residues on the surface of thermophilic SHMTs was thus conducted by visual inspection of the 10 structural models. In several cases, we found potential new ion pairs, some of which are involved in the formation of networks (in this work, “strong” and “weak” ion pairs are considered to be formed if any side-chain oxygen atom of Glu or Asp is within 4.0 Å and 6.0 Å, respectively, of any side-chain nitrogen atom of Arg or Lys).44,45
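The distance criterion for “strong” and “weak” ion pairs can be written down as a small Python sketch. It assumes that each residue is available as a dictionary holding its three-letter name and its atom coordinates keyed by standard PDB atom names; this data layout and the function names are illustrative choices, not part of the procedure described here, which relied on visual inspection of the models.

```python
import math

# Side-chain oxygen and nitrogen atoms, by standard PDB atom names.
ACID_OXYGENS = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASE_NITROGENS = {"LYS": ("NZ",), "ARG": ("NE", "NH1", "NH2")}


def side_chain_atoms(residue, table):
    """Return the coordinates of the charged side-chain atoms present in a residue."""
    names = table.get(residue["name"], ())
    return [residue["atoms"][n] for n in names if n in residue["atoms"]]


def classify_ion_pair(acid, base, strong=4.0, weak=6.0):
    """Return 'strong', 'weak', or None from the shortest O...N distance (in Å)."""
    oxygens = side_chain_atoms(acid, ACID_OXYGENS)
    nitrogens = side_chain_atoms(base, BASE_NITROGENS)
    if not oxygens or not nitrogens:
        return None  # wrong residue types, or side-chain atoms missing from the file
    shortest = min(math.dist(o, n) for o in oxygens for n in nitrogens)
    if shortest <= strong:
        return "strong"
    if shortest <= weak:
        return "weak"
    return None


# Hypothetical coordinates, for illustration only.
glu = {"name": "GLU", "atoms": {"OE1": (0.0, 0.0, 0.0), "OE2": (2.0, 0.0, 0.0)}}
lys_near = {"name": "LYS", "atoms": {"NZ": (5.0, 0.0, 0.0)}}
lys_mid = {"name": "LYS", "atoms": {"NZ": (7.5, 0.0, 0.0)}}
lys_far = {"name": "LYS", "atoms": {"NZ": (10.0, 0.0, 0.0)}}
```

Returning `None` when side-chain atoms are absent matters in practice, since incomplete side chains occur in deposited coordinate files.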

An example is shown in Figure 5, which displays the comparison between the C-terminal portion of eSHMT folded as a helix-turn-helix motif (residues 375–412) and the corresponding part in the model of aaSHMT. A few changes involving charged residues can be detected: (i) the replacement of Ala 397 in eSHMT with Lys can lead to the formation of a new “strong” salt bridge with Glu 396 in the thermophilic enzyme; (ii) several amino acid exchanges (namely, Asp388Glu, Lys405Glu, Glu381Lys, and Asp408Glu) contribute to the formation of a large ion pair network, whose presence could enhance the thermostability of the two α-helices of aaSHMT. It is worth mentioning that the replacement Asp388Glu (it should be recalled that Asp→Glu is the most significant exchange observed) was responsible for the formation of a “strong” ion pair, bringing two opposite charges closer together. Structural consistency of this model is ensured by the high similarity to the template sequence accounting for

Fig. 4. Ribbon representation of the interface between monomers of phSHMT. The two monomers are colored light and dark gray, respectively. For clarity, only the cluster-forming His are displayed as stick models. Residues are labeled by absolute sequence position; the prime indicates that the residues are contributed from the other subunit. PLP are displayed as CPKs.


61% identity (Table IV). Moreover, each salt bridge is conserved in at least 6 of the 10 different models built for aaSHMT.

Such observations suggest that formation of ion pairs is an important component of thermal adaptation of SHMT.

Hydrophobic residues

Several exchanges involve hydrophobic aromatic residues (Table II) such as Tyr→Phe (R = 3.0; direction mesophile→thermophile), Trp→Tyr (R = −3.8), and Trp→Gly (R = −3.1), the latter two in the direction thermophile→mesophile. These exchanges are reflected in the variation of average amino acid composition of thermophiles (Table III), where a marked increase of aromatic residue content can be detected: a significant increase of Trp (P = 1.45) and, to a lesser extent, Phe (P = 1.12) contents, at the expense of Tyr (P = 0.93) and Gly (P = 0.93). This is particularly evident in phSHMT (P = 2.77 and 1.39 for Trp and Phe, respectively) and paSHMT (P = 2.82 and 1.35), whose optimal growth temperature is the highest in our data set. Indeed, the Phe+Trp content in phSHMT and paSHMT (28 residues) is doubled, compared to eSHMT (14 residues). The exchange Trp→Tyr (thermophile→mesophile), which occurs mainly at the pa and phSHMT surfaces, may also explain why a slight increase in the polar/hydrophobic surface ratio was found systematically in all the thermostable SHMTs, in comparison with their mesophilic homologues, with the remarkable exception of phSHMT and paSHMT.

The clustering of aromatic residues was previously suggested to be an important feature contributing to the enhanced thermostability46,47 and barostability10,48 of extremophilic proteins. A structural scrutiny of the SHMT models to examine the distribution of aromatic residues (according to Chang and Loew,48 for an aromatic residue to be considered in a cluster, it must find at least one other aromatic residue within a 7 Å C-C distance cutoff) detected the largest cluster in the hyperthermophilic phSHMT and paSHMT. A similar side-chain orientation of the residues forming the cluster was present in all of the 10 alternative models built for ph and paSHMTs (the mean of the absolute deviation from the χ1 mean value of each residue is 9 ± 5 degrees). Four Phe and one Trp residues replace five nonaromatic residues of eSHMT (namely, Ile 147, Ala 149, Gly 151, Gly 172, and Asn 351), whereas three Tyr residues (145, 177, and 213) were substituted by one Trp and two Phe, respectively. These results suggest a crucial role of the aromatic residues in the thermal adaptation of SHMTs. It has been suggested that aromatic interactions may also play a role in the piezophilic enzymes.10 We recalculated the exchanges and amino acid composition variations in the subset containing

Fig. 5. Comparison between the C-terminal portion of eSHMT (dark gray) and the equivalent part in the model aaSHMT (light gray). Several amino acid exchanges (see Discussion) contribute to the formation of additional ion pairs, whose presence could enhance the stability of the two α-helices of aaSHMT at high temperatures. Residues involved in the exchanges are shown as sticks and labeled according to eSHMT sequence position. Distance between the atoms involved in the formation of new ion pairs is also shown. Side-chains of Glu 396 and Arg 401 were missing from the eSHMT PDB file.


thermophilic piezophilic organisms (i.e., mj, pa, ph, and tmSHMTs). No significant difference was found (data not shown). Therefore, it is difficult to assess in our sample the role, if any, of aromatic interactions in the adaptation to high hydrostatic pressure.
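The 7 Å criterion used above to detect aromatic clusters can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements about the original analysis: each aromatic residue is reduced to a single representative coordinate (which atom that should be is left to the user), and residues are grouped into clusters as connected components of the "within cutoff" relation. The function name and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def aromatic_clusters(points, cutoff=7.0):
    """Group aromatic residues whose representative points lie within the cutoff (Å).

    points maps a residue label to an (x, y, z) tuple. Residues with no
    aromatic neighbour within the cutoff are not reported.
    """
    neighbours = {label: set() for label in points}
    for a, b in combinations(points, 2):
        if math.dist(points[a], points[b]) <= cutoff:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for start in points:
        if start in seen or not neighbours[start]:
            continue
        # Depth-first walk over the neighbour graph collects one cluster.
        members, stack = set(), [start]
        while stack:
            label = stack.pop()
            if label in members:
                continue
            members.add(label)
            stack.extend(neighbours[label] - members)
        seen |= members
        clusters.append(sorted(members))
    return sorted(clusters, key=len, reverse=True)


# Hypothetical coordinates along one axis, for illustration only.
toy_points = {
    "F1": (0.0, 0.0, 0.0), "W2": (5.0, 0.0, 0.0), "F3": (10.0, 0.0, 0.0),
    "Y4": (30.0, 0.0, 0.0),
    "F5": (100.0, 0.0, 0.0), "W6": (104.0, 0.0, 0.0),
}
toy_clusters = aromatic_clusters(toy_points)
```

With this grouping, the first element of the returned list is the largest cluster, which is the quantity compared between models in the text.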

Besides the increased content of aromatic residues, other amino acid exchanges involving hydrophobic residues are statistically significant. In particular, Phe→Met (R = 4.7) and Ile→Leu (R = −3.8), observed in buried sites (Table II), and Ala→Glu (R = 2.2) at exposed sites, may explain the increased content of Met (overall P = 1.36) and Ile (P = 1.15) and the decreased content of Ala (P = 0.87). These results, already described in the literature,48 suggest that a different distribution of hydrophobic residues within the protein structure and an improved packing of the protein interior are another strategy adopted by SHMT to achieve thermal stability.

Thermolabile residues

It was observed that thermophilic proteins tend to be depleted of residues that undergo degradation at high temperatures with consequent inactivation of the enzyme. Gln, Asn, Asp, and Cys are considered thermolabile residues.48,49 Indeed, the frequent substitution of Gln with Lys (R = 3.1), Glu (R = 2.6), and Val (R = 2.3) in the direction mesophile→thermophile, the amino acid replacement Asp→Asn (R = −2.3) in the opposite direction, and a large decrease in Gln (P = 0.70), Cys (P = 0.62) and, to a lesser extent, Asn (P = 0.91) content (Table III) prompted us to further investigate the variation of occurrence of potential thermolabile residues and deamidation/isoaspartate formation sites. We calculated the propensity of the occurrence of the deamidation and isoaspartate formation sites Asn-Gly, Asn-Ser, and Asp-Gly in thermophilic versus mesophilic SHMTs, obtaining P scores of 0.2, 0.6, and 1.02, respectively. It is interesting that such sites were completely absent in ph, pa, and apSHMTs. This analysis suggests an inverse correlation between thermostability of SHMT and the presence in the sequence of deamidation sites and thermolabile residues in general.
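A comparison of dipeptide site occurrence between two sequence sets, as done above for Asn-Gly, Asn-Ser, and Asp-Gly, can be sketched in Python. The normalisation is an assumption of this sketch (occurrences per possible window, pooled over each set, thermophiles over mesophiles); the exact definition of P used in this work is given in Materials and Methods and may differ. Function names and toy sequences are hypothetical.

```python
def motif_frequency(sequences, motif):
    """Occurrences of a motif per possible window, pooled over a set of sequences."""
    hits = windows = 0
    for seq in sequences:
        n = len(seq) - len(motif) + 1
        if n <= 0:
            continue  # sequence shorter than the motif
        windows += n
        hits += sum(1 for i in range(n) if seq[i:i + len(motif)] == motif)
    return hits / windows if windows else 0.0


def motif_propensity(thermophiles, mesophiles, motif):
    """Ratio of motif frequencies (thermophiles over mesophiles), or None if undefined."""
    reference = motif_frequency(mesophiles, motif)
    if reference == 0:
        return None  # motif never seen in the reference set
    return motifs_ratio(motif_frequency(thermophiles, motif), reference)


def motifs_ratio(observed, reference):
    """Plain ratio, kept separate so the normalisation is easy to swap."""
    return observed / reference


# Toy one-letter sequences (hypothetical): "NG" is an Asn-Gly site.
toy_thermo = ["ANGA"]
toy_meso = ["NGNG"]
toy_propensity = motif_propensity(toy_thermo, toy_meso, "NG")
```

A value below 1 indicates that the site is rarer in the thermophilic set than in the mesophilic one, which is the direction reported here for Asn-Gly and Asn-Ser.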

Interaction With Modified Folates and Quaternary Assemblage

The position of phF inside the active site pocket of phSHMT is very similar to that of THF in eSHMT. The energy of the optimal conformation is −423.72 kJ/mol, compared with 1097.73 kJ/mol for the initial structure. The two methyl groups bound at C7 and C9 of the modified folate are accommodated inside a hydrophobic cleft, mainly defined by the side-chain carbons of Ser 35, Tyr 64′, Tyr 65′, Lys 346, and Asn 345 (the prime indicates that the residues are contributed from the other subunit). The terminal portion of the molecule interacts with the loop and the α-helix defined by residues 57′–65′ and 246–256, respectively.

Bacterial and eukaryotic SHMTs have different quaternary assemblages: although eSHMT is a homodimer, rcSHMT exists entirely as a homotetramer under physiological conditions. This pattern can be related to the different charge surface distribution required for binding the chemically distinct polyglutamate tails of folates: in eukaryotes, the polyglutamate tail is entirely γ-linked, whereas in bacteria only the first three glutamates are γ-linked, the following being α-linked.16,17 The different oligomeric structure corresponds to different sequence features. The tetrameric quaternary structure of the eukaryotic SHMTs seems to be linked to the presence of His 113 (which establishes H-bonds and stacking interactions with the symmetry-related His from the other dyads) and of the ion pairs formed by Glu 141 and Arg 115 on opposite dimers. These residues are replaced in eSHMT by Pro, Gln, and Thr, respectively. Examination of the archaeal sequences highlighted E. coli-like features: (i) His 113 is replaced by a Gly in all the archaeal enzymes; (ii) Glu 141 is not conserved; (iii) Arg 115 is replaced by Lys, Pro, or Thr. Moreover, the GRASP electrostatic field of the surface of the archaeal enzymes is more similar to eSHMT than to hc or rcSHMT. Consequently, it is anticipated that archaeal SHMTs have a dimeric biological unit.

CONCLUSION

Enzyme comparative analysis along orthologous lineages can suggest possible strategies for molecular adaptation to extreme environments. Although this work was focused on SHMT, the methodological approach can be applied to other enzyme systems for which suitable three-dimensional and sequential postgenomic information is available. Moreover, the conclusions reached can be experimentally tested.

ACKNOWLEDGMENTS

We thank Professor Donatella Barra for support and helpful advice. We are grateful to Francesco Serci for his initial contribution to the research. This work will be submitted by A.P. in partial fulfillment of the requirements of the degree of Dottorato di Ricerca at the Università di Roma “La Sapienza.”

REFERENCES

1. Sanchez R, Pieper U, Mirkovic N, De Bakker PIW, Wittenstein E, Sali A. MODBASE: a database of annotated comparative protein structure models. Nucleic Acids Res 2002;28:255–259.

2. De Long R. Extreme genomes. Genome Biol 2000;1:Review 1029.

3. Jaenicke R, Bohm G. The stability of proteins in extreme environments. Curr Opin Struct Biol 1998;8:738–748.

4. Vogt G, Argos P. Protein thermal stability: hydrogen bonds or internal packing? Fold Des 1997;2:540–546.

5. Kumar S, Nussinov R. How do thermophilic proteins deal with heat? Cell Mol Life Sci 2001;58:1216–1233.

6. Gerday C, Aittaleb M, Bentahir M, Chessa JP, Claverie P, Collins T, D’Amico S, Dumont J, Garsoux G, Georlette D, Hoyoux A, Lonhienne T, Meuwis MA, Feller G. Cold-adapted enzymes: from fundamentals to biotechnology. Trends Biotechnol 2000;18:103–107.

7. Gianese G, Bossa F, Pascarella S. Comparative structural analysis of psychrophilic and meso- and thermophilic enzymes. Proteins 2002;47:236–249.

8. Kennedy SP, Ng WV, Salzberg SL, Hood L, DasSarma L. Understanding the adaptation of Halobacterium species NRC-1 to its extreme environments through computational analysis of its genome sequence. Genome Res 2001;11:1641–1650.

9. Richard SB, Madern D, Garcin E, Zaccai G. Halophilic adaptation: novel solvent-protein interactions observed in the 2.9 and 2.6 Å resolution structures of the wild type and a mutant of malate dehydrogenase from Haloarcula marismortui. Biochemistry 2000;39:992–1000.

10. Gross M, Jaenicke R. Proteins under pressure. Eur J Biochem 1994;221:617–630.

11. Vieille C, Burdette DS, Zeikus JG. Thermozymes. Biotechnol Annu Rev 1996;2:1–83.

12. White RH. Distribution of folates and modified folates in extremely thermophilic bacteria. J Bacteriol 1991;173:1987–1991.

13. Maden BEH. Tetrahydrofolate and tetrahydromethanopterin compared: functionally distinct carriers in C1 metabolism. Biochem Soc 2000;350:609–629.

14. Contestabile R, Angelaccio S, Bossa F, Wright HT, Scarsdale N, Kazanina G, Schirch V. Role of tyrosine 65 in the mechanism of serine hydroxymethyltransferase. Biochemistry 2000;39:7492–7500.

15. Sheridan PP, Panasik N, Coombs JM, Brenchley JE. Approaches for deciphering the structural basis of low temperature enzyme activity. Biochim Biophys Acta 2000;1543:417–433.

16. Scarsdale JN, Radaev S, Kazanina G, Schirch V, Wright HT. Crystal structure at 2.4 Å resolution of E. coli serine hydroxymethyltransferase in complex with glycine substrate and 5-formyl tetrahydrofolate. J Mol Biol 2000;296:155–168.

17. Renwick SB, Snell K, Baumann U. The crystal structure of human cytosolic serine hydroxymethyltransferase: a target for cancer chemotherapy. Structure 1998;6:1105–1116.

18. Scarsdale JN, Kazanina G, Radaev S, Schirch V, Wright HT. Crystal structure of rabbit cytosolic serine hydroxymethyltransferase at 2.8 Å resolution: mechanistic implications. Biochemistry 1999;38:8347–8358.

19. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein data bank. Nucleic Acids Res 2000;28:235–242.

20. Junker V, Contrino S, Fleischmann W, Hermjakob H, Lang F, Magrane M, Martin MJ, Mitaritonna N, O’Donovan C, Apweiler R. The role SWISS-PROT and TrEMBL play in the genome research environment. J Biotechnol 2000;78:221–234.

21. Barker WC, Garavelli JS, Huang H, McGarvey PB, Orcutt BC, Srinivasarao GY, Xiao C, Yeh LS, Ledley RS, Janda JF, Pfeiffer F, Mewes HW, Tsugita A, Wu C. The protein information resource (PIR). Nucleic Acids Res 2000;28:41–44.

22. Friedberg I, Kaplan T, Margalit H. Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments. Protein Sci 2000;9:2278–2284.

23. Pearson WR, Lipman DJ. Improved tools for biological sequence comparisons. Proc Natl Acad Sci USA 1988;85:2444–2448.

24. Etzold T, Argos P. SRS-An indexing and retrieval tool for flat file data libraries. Comput Appl Biosci 1993;9:49–57.

25. Thompson JD, Higgins DG, Gibson TJ. CLUSTALW: improving the sensitivity of progressive multiple alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673–4680.

26. Felsenstein J. Distance methods for inferring phylogenies: a justification. Evolution 1984;38:16–24.

27. Argos P, Rossman MG, Grau UM, Zuber H, Frank G, Tratschin JD. Thermal stability and protein structure. Biochemistry 1979;25:5698–5703.

28. Sali A, Potterton L, Yuan F, Van Vlijmen H, Karplus M. Evaluation of comparative protein modeling by MODELLER. Proteins 1995;23:318–326.

29. Burke DF, Deane CM, Nagarajaram HA, Campillo N, Martin-Martinez M, Mendez J, Molina F. An iterative structure approach to sequence alignment and comparative modelling. Proteins 1999;3:55–60.

30. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr 1993;26:283–291.

31. Sippl MJ. Recognition of errors in three-dimensional structures of proteins. Proteins 1993;17:355–362.

32. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983;22:2577–2637.

33. Hubbard SJ, Thornton JM. NACCESS, computer program. Department of Biochemistry and Molecular Biology, University College, London, 1993.

34. McPhalen CA, Vincent MG, Picot D, Jansonius JN, Lesk AM, Chothia C. Domain closure in mitochondrial aspartate aminotransferase. J Mol Biol 1992;227:193–217.

35. Nicholls A, Honig B. A rapid finite difference algorithm, utilizing successive over-relaxation to solve the Poisson-Boltzmann equation. J Comp Chem 1991;12:435–445.

36. Haney PJ, Badger JH, Buldak GL, Reich CI, Woese CR, Olsen GJ. Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species. Proc Natl Acad Sci USA 1999;96:3578–3583.

37. Kirino H, Aoki M, Aoshima M, Hayashi Y, Ohba M, Yamagishi A, Wakagi T, Oshima T. Hydrophobic interaction at the subunit interface contributes to the thermostability of 3-isopropylmalate dehydrogenase from an extreme thermophile, Thermus thermophilus. Eur J Biochem 1994;220:275–281.

38. Villeret V, Clantin B, Tricot C, Legrain C, Roovers M, Stalon V, Glansdorff N, Van Beeumen JF. The crystal structure of Pyrococcus furiosus ornithine carbamoyltransferase reveals a key role for oligomerization in enzyme stability at extremely high temperatures. Proc Natl Acad Sci USA 1998;17:2801–2806.

39. Arnold FH, Wintrode PL, Miyazaki K, Gershenson A. How enzymes adapt: lessons from directed evolution. Trends Biochem Sci 2001;26:100–106.

40. Vetriani C, Maeder DL, Tolliday N et al. Protein thermostability above 100°C: a key role for ionic interactions. Proc Natl Acad Sci USA 1998;95:12300–12305.

41. Sun DP, Sauer U, Nicholson H, Matthews BW. Contributions of engineered surface to the stability of T4 lysozyme determined by directed mutagenesis. Biochemistry 1991;30:7142–7153.

42. Karshikoff A, Ladenstein R. Ion pairs and thermotolerance of proteins from hyperthermophiles: a “traffic rule” for hot roads. Trends Biochem Sci 2001;26:550–556.

43. Britton KL, Baker PJ, Borges KMM et al. Insights into thermal stability from a comparison of the glutamate dehydrogenases from Pyrococcus furiosus and Thermococcus litoralis. Eur J Biochem 1995;229:688–695.

44. Szilagyi A, Zavodszky P. Structural differences between mesophilic, moderately thermophilic and extremely thermophilic protein subunits: results of a comprehensive survey. Structure 2000;8:493–504.

45. Barlow DJ, Thornton JM. Ion-pairs in proteins. J Mol Biol 1983;168:867–885.

46. Connerton I, Cummings N, Harris GW, Debeire P, Breton C. A single domain thermophilic xylanase can bind insoluble xylan: evidence for surface aromatic clusters. Biochim Biophys Acta 1999;1433:110–121.

47. Kannan N, Vishveshwara S. Aromatic clusters: a determinant of thermal stability of thermophilic proteins. Protein Eng 2000;13:753–761.

48. Chang YT, Loew G. Homology modeling, molecular dynamics simulations, and analysis of CYP119, a P450 enzyme from extreme acidothermophilic archaeon Sulfolobus solfataricus. Biochemistry 2000;39:2484–2498.

49. Tomazic SJ, Klibanov AM. Mechanisms of irreversible thermal inactivation of Bacillus licheniformis α-amylases. J Biol Chem 1988;263:3086–3091.

50. Gouet P, Courcelle E, Stuart DI, Metoz F. ESPript: analysis of multiple sequence alignments using PostScript. Bioinformatics 1999;15:305–308.

51. De Lano WL. The pyMOL molecular graphics system. DeLano Scientifics, San Carlos, CA, USA, 2002.
