Analysis of zinc-ligand bond lengths in metalloproteins: Trends and patterns



PROTEINS: Structure, Function, and Bioinformatics

Bruno Tamames, Sérgio Filipe Sousa, Juan Tamames, Pedro Alexandrino Fernandes, and Maria João Ramos*

REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal

INTRODUCTION

Zinc, the second-most abundant transition element in living organisms after iron and the only metal known to be represented in enzymes from each of the six classes established by the International Union of Biochemistry [1], is an essential component of a very large number of enzymes involved in virtually all aspects of metabolism, and is present in a broad array of species of all phyla [1–3]. In fact, a recent bioinformatics survey of the Zn proteins encoded in the human genome has proposed that about 2800 human proteins are potentially zinc-binding in vivo, a number that represents roughly 10% of the total human proteome [4]. Notable examples of Zn metalloenzymes include the carbonic anhydrases I and II, the carboxypeptidases A, B, and T, alcohol dehydrogenase, and thermolysin [3,5]. RNA polymerase II [6–10], the matrix metalloproteinases [11–13], protein farnesyltransferase [14–19], and the metallo-β-lactamases [20,21] are among the most discussed examples in recent years.

Zinc has particular characteristics that set it apart from the other first-row transition metals and that make it very attractive for biological systems, both in enzymatic catalysis and in maintaining structure. The flexible coordination geometry, the fast ligand exchange, the strong binding to suitable sites, the lack of redox activity (no generation of free radicals), and its role as a Lewis acid are just a few examples, which add to a high bioavailability [3,22]. Zn proteins typically adopt a coordination number of 4 or 5 in the metal coordination sphere. Histidine is the most common amino acid ligand, followed by glutamate, aspartate, and cysteine [2,23]. Despite the important roles that many Zn proteins play in living organisms, several significant structural and mechanistic questions remain unanswered.

The RCSB Protein Data Bank [24] is a trove of raw information that, with the right treatment, can provide answers to some of the questions involving the coordination of zinc, and one that has been relatively neglected in its full scope. The growing number of zinc metalloproteins in the PDB, and the improving quality of these structures, make a systematic survey of all zinc binding sites worthwhile, also enabling a fresh approach to elusive structural aspects characteristic of zinc systems, such as the ones that arise, for example, from the carboxylate-shift mechanism [18,25].

In this study, the zinc coordination spheres of 994 PDB structures were carefully analyzed, comprising a total of 7547 zinc ligands evaluated and 10,776 bond lengths measured. The insight gained from this analysis was subsequently complemented with theoretical calculations at the DFT level of theory on several Zn model systems. The unveiled trends provide valuable guidelines for the study of biological zinc systems, also allowing a reinterpretation of some old mechanistic paradigms. Furthermore, the patterns found and the results obtained can be helpful in improving the atomic models used in theoretical and computational mechanistic studies of zinc enzymes at the quantum mechanical level, and in the development of molecular mechanical parameters for the treatment of the zinc coordination spheres with molecular mechanics or molecular dynamics in studies with the full enzyme.

ABSTRACT

Zinc is one of the biologically most abundant and important metal elements, present in a plethora of enzymes from a broad array of species of all phyla. In this study we report a thorough analysis of the geometrical properties of zinc coordination spheres performed on a dataset of 994 high-quality protein crystal structures from the Protein Data Bank, complemented with quantum mechanical calculations at the DFT level of theory (B3LYP/SDD) on mononuclear model systems. The results allowed us to draw interesting conclusions on the structural characteristics of Zn centres and to evaluate the importance of such effects as the resolution of X-ray crystallographic structures, the enzyme class in which the Zn centre is included, and the identity of the ligands at the Zn coordination sphere. Altogether, the set of results obtained provides useful data for the enhancement of the atomic models normally applied to the theoretical and computational study of zinc enzymes at the quantum mechanical level (in particular enzymatic mechanisms), and for the development of molecular mechanical parameters for the treatment of zinc coordination spheres with molecular mechanics or molecular dynamics in studies with the full enzyme.

Proteins 2007; 69:466–475. © 2007 Wiley-Liss, Inc.

Key words: zinc enzymes; metalloproteins; PDB; DFT; B3LYP.

Grant sponsor: FCT (Fundação para a Ciência e a Tecnologia); Grant numbers: POCI/QUI/61563/2004, SFRH/BD/25457/2005, SFRH/BD/12848/2003.

*Correspondence to: Maria João Ramos, REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal. E-mail: [email protected]

Received 5 February 2006; Accepted 23 March 2007

Published online 10 July 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.21536

METHODOLOGY

Statistical analysis

All the structures available in the PDB [24] with a resolution value lower than 2.5 Å and containing at least one Zn ion were retrieved and analyzed, comprising a total of 994 structures. A total of 10,776 bond lengths involving zinc atoms were measured and quantified using the program "ProtPDBDist", developed in our research group. This program measures the distance between the zinc atom and every other atom within a 3.0-Å radius, ignoring carbon atoms (which are not involved in coordination with the metal). If it finds within that radius any oxygen atom belonging to the carboxyl group of an Asp or Glu residue (aspartic and glutamic acid, respectively), the distance between the zinc atom and the other oxygen atom of that carboxyl group is also measured and recorded. A resolution value lower than 2.5 Å was chosen as the criterion taking into consideration, on the one hand, that the accuracy of metal-ligand bond lengths for structures with higher resolution values (i.e., worse resolution) is typically too low and, on the other hand, that requiring a better resolution could limit the statistical significance of some of the studies initially planned.

In the coordinate file of each structure, the Zn ions were located and the Zn-coordinating residues were identified. The distances to Zn of every noncarbon atom within a 3-Å radius of the metal atom were measured using the program "ProtPdbDist," written in C++ and developed in our research group. In the cases where coordination occurred through an oxygen atom that was part of a carboxylate group (the side chain of an Asp or Glu residue), both Zn-oxygen bond lengths were measured. The Zn environment of all structures presenting Zn-ligand bond lengths distinctly above or below average was visualized computationally and inspected in detail. The raw data obtained (Zn bond lengths, residues present, and atoms coordinating the zinc ion) were refined and further analyzed.
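The measurement procedure described above can be illustrated with a short sketch. This is not the authors' program (which is written in C++ and not reproduced in the text); it is a minimal Python approximation that assumes standard fixed-column PDB records with the element symbol present in columns 77-78. The function names `parse_atoms` and `zinc_distances` are hypothetical, and the choice to record the second carboxylate oxygen only once, when it lies outside the cutoff, is an assumption made to avoid double counting.

```python
import math

# Partner oxygen within the same carboxylate group (standard PDB atom names).
CARBOXYLATE_PARTNER = {
    ("ASP", "OD1"): "OD2", ("ASP", "OD2"): "OD1",
    ("GLU", "OE1"): "OE2", ("GLU", "OE2"): "OE1",
}


def parse_atoms(pdb_text):
    """Read ATOM/HETATM records from fixed-column PDB text."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "res": line[17:20].strip(),
            "chain": line[21],
            "resseq": line[22:26].strip(),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            # Assumption: the element symbol is present in columns 77-78.
            "element": line[76:78].strip().upper(),
        })
    return atoms


def zinc_distances(atoms, cutoff=3.0):
    """Return (residue, atom name, distance in Å) for each Zn neighbour."""
    records = []
    for zn in atoms:
        if zn["element"] != "ZN":
            continue
        for atom in atoms:
            if atom is zn or atom["element"] == "C":
                continue  # skip the ion itself and all carbon atoms
            d = math.dist(zn["xyz"], atom["xyz"])
            if d > cutoff:
                continue
            records.append((atom["res"], atom["name"], round(d, 2)))
            partner = CARBOXYLATE_PARTNER.get((atom["res"], atom["name"]))
            if partner is None:
                continue
            # Also record the other oxygen of the same Asp/Glu carboxylate.
            for other in atoms:
                same_residue = (other["chain"], other["resseq"], other["res"]) == (
                    atom["chain"], atom["resseq"], atom["res"])
                if same_residue and other["name"] == partner:
                    d2 = math.dist(zn["xyz"], other["xyz"])
                    if d2 > cutoff:  # partners inside the cutoff are recorded directly
                        records.append((other["res"], other["name"], round(d2, 2)))
    return records
```

Calling `zinc_distances(parse_atoms(text))` on the contents of a coordinate file would give one record per Zn neighbour; a linear scan over all atoms is used here for clarity rather than speed.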

Quantum chemical calculations

Starting from the Zn-containing crystallographic structures with the best resolution for each of the combinations His-His-His-Carboxylate, His-His-Cys-Carboxylate, His-Cys-Cys-Carboxylate, and Cys-Cys-Cys-Carboxylate (1MFM, 1RN5, 1DDZ, and 1P3J) [26–29], four models were prepared and optimized. Conventional modeling of the amino acid side chains was used, that is, the zinc ligands aspartate (or glutamate), cysteinate, and histidine were modeled by acetate, methylthiolate, and imidazole, respectively. The validity of this type of approach has been demonstrated before in the mechanistic study of FTase [16,18,30,31] and of several different enzymes [32–36].

The geometry of each model was first freely optimized. Several starting structures were tested to avoid the risk of being trapped in local minima. These starting structures were prepared by moderately changing the orientation of the several ligands in relation to the zinc atom while keeping acceptable initial Zn-ligand distances. No geometric constraints were imposed on any of the calculations.

In all the calculations described above, density functional theory (DFT) with the B3LYP functional [37,38] was used. DFT calculations have been shown to give accurate results for systems involving transition metals [39], particularly when using the B3LYP functional [25,40–42]. For zinc complexes, the superior accuracy of the B3LYP functional in comparison with Hartree–Fock, second-order Møller–Plesset perturbation theory, and several other DFT functionals has also been previously demonstrated [25,34]. Optimizations were carried out using the SDD basis set, as implemented in Gaussian 03 [43]. This basis set uses the small-core quasi-relativistic Stuttgart/Dresden effective core potentials (also known as Stoll–Preuss, or simply SP) [44,45] for transition elements. For zinc, the outer electrons are described by a (311111/22111/411) valence basis specifically optimized for this metal and for use with the SP pseudopotentials. C, N, and O atoms are described by a (6111/41) quality basis set, whereas S and H atoms are treated by (531111/4211) and (31) quality basis sets, respectively. The high performance of SP pseudopotentials in calculations involving transition metal compounds, particularly within closed-shell systems, has been previously demonstrated [46]. The SDD basis set has also been used successfully with B3LYP in the study of the Zn metalloenzyme farnesyltransferase [16,18,30,31] and of other more general Zn systems [25]. To focus further on Zn-His and Zn-Cys coordination, for each of the four models initially prepared a scan along a Zn-His and a Zn-Cys distance was performed by successively varying this distance at 0.05-Å intervals and reoptimizing the model with B3LYP/SDD. A total of around 100 geometries were considered in each of the six scans performed. The energy of each transition state was estimated from the maximum in the corresponding scan.
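The post-processing of such a distance scan can be sketched as follows. The snippet only illustrates how a 0.05-Å grid might be generated and how maxima along a scan could be located once the constrained optimizations have produced energies; the function names are hypothetical, taking the barrier relative to the lowest energy along the scan is an assumption, and no quantum chemical calculation is performed here.

```python
def scan_points(start, stop, step=0.05):
    """Distances (in Å) visited by a scan from start to stop, inclusive."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


def local_maxima(distances, energies):
    """Interior points whose energy exceeds that of both neighbours."""
    peaks = []
    for i in range(1, len(energies) - 1):
        if energies[i] > energies[i - 1] and energies[i] >= energies[i + 1]:
            peaks.append((distances[i], energies[i]))
    return peaks


def barrier_estimates(distances, energies):
    """Peak heights relative to the lowest energy found along the scan."""
    reference = min(energies)
    return [(d, e - reference) for d, e in local_maxima(distances, energies)]
```

Because the grid spacing is 0.05 Å, a maximum located this way brackets the true stationary point only to within that spacing, which is consistent with the text describing the transition-state energies as estimates.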

Analysis of Zinc Ligand Bonds in Metalloproteins

DOI 10.1002/prot PROTEINS 467

RESULTS AND DISCUSSION

Statistical analysis

Figure 1 shows the percentage of the most frequent amino acid residues interacting with zinc. The results show that these interactions between the metal atom and the rest of the protein involve mainly four amino acid residues: histidine (His) (48%), cysteine (Cys) (27%), aspartate (Asp) (15%), and glutamate (Glu) (7%). The other amino acids make only a residual contribution to direct Zn interactions. Of these, serine (0.9%), threonine (0.5%), and lysine (0.4%) are the most significant. The relative proportion of each amino acid differs somewhat from the values found in the literature [23]. However, these differences are probably due to the growth in the number of structures in the PDB in the last few years, and to the fact that our study focused on the structures available in the PDB and not on the proteins themselves.

These interactions, along with the average length of each interaction type and the corresponding standard deviation, are summarized in Table I. Additionally, we have plotted the percentage of occurrence against the distance between the zinc atom and the nonmetallic atoms. These plots are analyzed individually below.

Figure 2, representing the distances between a zinc atom and an oxygen atom in Glu and Asp residues, can be divided into several regions to facilitate the analysis. The first region, up to 1.5 Å, is not shown on the plot as there are no structures with such interactions present. This region represents the repulsive area where the electronic density of the metallic atom overlaps with that of the oxygen (nonmetallic) atom. This happens with every element with little variation.

The second region, the coordination region from 1.5 to 2.5 Å, is the most populated area and represents all the bonds established between zinc and the oxygen atoms. This region has a large dispersion and a single peak. However, we believe this main peak to be an overlap of two peaks: one peak representing mono-coordinated carboxylates and most other ligands in which a single oxygen atom interacts with the zinc atom; and a second peak accounting for the bi-coordinated carboxylates, which present bond lengths longer than normal, and for intermediate cases where a carboxylate shift occurs, that is, a situation in which a mono-coordinated carboxylate (normally from a Glu or Asp residue) is transformed into a bi-coordinated carboxylate (Fig. 3).

Figure 1. Percentage of most frequent amino acid residues interacting with zinc.

Table I. Interaction Distances in the Zinc Coordination Sphere

Interaction                      Number of interactions   Average (Å)   Standard deviation (Å)
Zn–N
  Total                          3728                     2.11          0.15
  His, total                     3462                     2.11          0.14
  His, Nδ                        1115                     2.09          0.14
  His, Nε                        2347                     2.12          0.15
  Other                          266                      2.17          0.24
Zn–S
  Total                          1915                     2.32          0.12
  Cys                            1877                     2.32          0.11
  Other                          38                       2.31          0.21
Zn–O
  Total, Z1 (0–2.5 Å)            3233                     2.10          0.17
  Total, Z2 (2.5–3.5 Å)          1664                     2.99          0.37
  Total, Z3 (>3.5 Å)             218                      4.18          0.18
  Glu & Asp, Z1 (0–2.5 Å)        1672                     2.09          0.16
  Glu & Asp, Z2 (2.5–3.5 Å)      1344                     3.09          0.34
  Glu & Asp, Z3 (>3.5 Å)         218                      4.18          0.18
  Glu, Z1 (0–2.5 Å)              561                      2.11          0.17
  Glu, Z2 (2.5–3.5 Å)            386                      3.01          0.35
  Glu, Z3 (>3.5 Å)               71                       4.20          0.18
  Asp, Z1 (0–2.5 Å)              1111                     2.07          0.16
  Asp, Z2 (2.5–3.5 Å)            958                      3.13          0.34
  Asp, Z3 (>3.5 Å)               147                      4.17          0.18
  H2O                            935                      2.22          0.24
  Other                          956                      2.19          0.24

Figure 2. Percentage of occurrence of distances between Zn and O atoms in Glu and Asp residues.

B. Tamames et al.

468 PROTEINS DOI 10.1002/prot

A third region, from 2.5 to 3.8 Å, is a nonbonding region that includes the O2 atom (the oxygen atom farther from the metallic atom) of the mono-coordinated carboxylates with a syn coordination type (Fig. 4). Again, this peak can be divided into two: the first, with a maximum in the 2.85-Å region, represents structures where the carboxylate shift is still possible, and the second, with a maximum in the 3.10-Å region, includes structures where the carboxylate shift will not occur [25].

Finally, the region from 3.8 to 4.5 Å represents only the O2 atoms of the mono-coordinated carboxylates with an anti coordination type (Fig. 4).
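The four distance regions described for Zn–O contacts in Glu and Asp residues can be turned into a simple classifier, sketched below. The region boundaries are those given in the text; how a distance falling exactly on a boundary is assigned, the region labels, and the function names are assumptions made for illustration.

```python
from collections import Counter


def classify_zn_o(distance):
    """Assign a Zn-O distance (in Å) to one of the regions discussed in the text."""
    if distance < 1.5:
        return "repulsive"
    if distance < 2.5:
        return "coordination"
    if distance < 3.8:
        return "nonbonding (syn)"
    if distance <= 4.5:
        return "nonbonding (anti)"
    return "outside surveyed range"


def region_percentages(distances):
    """Percentage of distances that fall in each region."""
    counts = Counter(classify_zn_o(d) for d in distances)
    total = len(distances)
    return {region: 100.0 * n / total for region, n in counts.items()}
```

Note that these regions follow the discussion of Figure 2 and differ slightly from the zones Z1 to Z3 used in Table I, whose boundaries are 2.5 and 3.5 Å.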

Figure 5 represents the frequency as a function of the bond lengths of both oxygen atoms present in the carboxylate ligands. In this figure the higher-frequency peaks described earlier can be seen clearly. Furthermore, in the surface map we can observe several frequency maxima with similar Zn–O1 bond lengths but different Zn–O2 bond lengths, illustrating the several possible coordination alternatives for carboxylate ligands.

Figures 6 and 7 represent, respectively, the percentage of occurrence of Zn–N and Zn–S bond lengths. Both plots show a high dispersion, with values between 1.7 and 2.4 Å for Zn–N bonds and between 2.0 and 2.5 Å for Zn–S bonds, and with a percentage of occurrence of at least 1% for all the values within both intervals. The distribution of the bond lengths between zinc and sulfur or nitrogen is much more homogeneous than in the case of oxygen because N- and S-ligands interact with zinc in basically only one way, whereas for oxygen ligands, particularly those encompassing carboxylate groups, there are several ways in which the atoms can interact.

Enzyme classes

According to their function, all enzymes can be grouped into six different classes, as established by the International Union of Biochemistry [1]. This structure of six classes and the naming of the enzymes are based essentially on the perceived functions of enzymes. Hence, in several cases both the class and the name were subject to alteration when a new function was discovered, or when some new insight into the mechanism or function of the enzyme was gained. Although fallible, this is the most logical way, and the only one in use, to separate enzymes into groups according to their function. Table II and Figure 8 show the bond lengths for these six classes of enzymes.

Figure 3. Carboxylate-shift mechanism: one of the residues coordinated to the Zn atom moves away from it while the second oxygen of a carboxylate moves toward it, shifting from a monodentate to a bidentate coordination.

Figure 4. Syn and anti coordination modes of carboxylate ligands (Asp and Glu) to metal atoms.


These statistical data unveil some clear differences between the six classes. The observed differences are probably related to the different functions that the zinc centers play in the global reaction mechanism of the enzyme, depending on the corresponding class. These data are in agreement with previous work on grouping the enzymes into different classes, and more detailed work such as this, covering all the metalloenzymes, can perhaps help in the reconsideration of some enzyme classes. Figure 9 shows the percentage of zinc bonds involving oxygen, sulfur, and nitrogen atoms in each class. Again, the difference between the various enzyme classes is visible, and each of the three elements is predominant in at least one of the classes.
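As a consistency check on the per-class statistics, the class averages can be pooled by weighting each one with its number of bonds. The sketch below does this for the Zn–N (His) column of Table II; the dictionary and function names are hypothetical, and the calculation is an illustration rather than part of the original analysis.

```python
# Zn-N (His) averages (Å) and bond counts per enzyme class, taken from Table II.
TABLE_II_ZN_N_HIS = {
    "Oxidoreductases": (2.08, 760),
    "Transferases": (2.14, 128),
    "Hydrolases": (2.13, 1814),
    "Lyases": (2.08, 637),
    "Isomerases": (2.16, 73),
    "Ligases": (2.19, 19),
}


def pooled_mean(groups):
    """Count-weighted mean over groups given as {name: (mean, count)}."""
    total = sum(count for _, count in groups.values())
    weighted = sum(mean * count for mean, count in groups.values())
    return weighted / total, total
```

Pooling the six classes this way gives a mean of about 2.11 Å over 3431 bonds, in line with the overall Zn–N (His) average of 2.11 Å reported in Table I for 3462 interactions.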

Resolution of the PDB structures

All the structures considered in this work had a resolution value lower than 2.5 Å, because structures with resolution values higher than that often present large inaccuracies in the bond-length values. On this basis, we repeated the statistical studies for several different resolution intervals, ranging from 0 to 2.5 Å. The results obtained are shown in full in Table III. Figure 10 shows, as an example, the percentage of occurrence of Zn–N interaction distances as a function of X-ray crystallography resolution.

Figure 5. Frequency of occurrence of Zn–O1 bond lengths versus Zn–O2 bond lengths. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Figure 6. Percentage of occurrence of distances between Zn and N atoms.

Figure 7. Percentage of occurrence of distances between Zn and S atoms.

In these results we can observe a considerable reduction of the dispersion (and of the standard deviation) for each of the bond types studied when we consider lower resolution values, that is, higher-quality structures. The values for the central peak in the bond lengths for each element remain mostly unchanged across the several resolution intervals, but the height of the peak decreases for the lower-quality structures. The exception is the Zn-His interaction, for which a considerable decrease in the bond length was observed for the higher-quality structures. In addition, it is important to point out that in better-resolution structures there is a much lower frequency of bond lengths with unnaturally small values, such as distances between Zn and S of only 1.5 Å. Furthermore, for the specific case of Zn–N, the percentage of structures with bond lengths longer than 2.1 Å greatly increases as the resolution value of the analyzed structures increases. These findings reflect some of the problems inherent to X-ray crystallography today.
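The resolution-dependent statistics discussed in this section amount to grouping bond lengths by the resolution of their parent structure and summarizing each group. A minimal sketch follows, using the four intervals of Table III. The function name is hypothetical, and two details are assumptions because the text does not specify them: intervals are treated as closed on the left and open on the right, and the sample standard deviation is used.

```python
from statistics import mean, stdev

# Resolution intervals (Å) as used in Table III.
RESOLUTION_BINS = [(0.5, 1.0), (1.0, 1.5), (1.5, 2.0), (2.0, 2.5)]


def summarize_by_resolution(measurements):
    """Summarize (resolution, bond_length) pairs per resolution interval.

    Returns {interval: (mean, standard deviation, count)}, with None for
    empty intervals and a standard deviation of None when only one value
    is available.
    """
    grouped = {interval: [] for interval in RESOLUTION_BINS}
    for resolution, length in measurements:
        for low, high in RESOLUTION_BINS:
            if low <= resolution < high:
                grouped[(low, high)].append(length)
                break
    summary = {}
    for interval, lengths in grouped.items():
        if not lengths:
            summary[interval] = None
        else:
            spread = stdev(lengths) if len(lengths) > 1 else None
            summary[interval] = (mean(lengths), spread, len(lengths))
    return summary
```

Structures at exactly 2.5 Å fall outside every interval here, which matches the stated criterion of a resolution value lower than 2.5 Å.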

Quantum chemical calculations

To explore in more detail the geometrical properties characteristic of Zn coordination spheres in metalloproteins, we complemented the statistical analysis with a full set of quantum chemical calculations on Zn model systems. In particular, we aimed to unveil and explain in detail some of the trends disclosed by the statistical analysis. Hence, models were carefully prepared so as to target the most representative aspects of Zn coordination.

Even though in solution this metal coordinates to six water molecules, in small-molecule complexes Zn tends to adopt coordination numbers of 4 (50%), 5 (19%), or 6 (35%) [47]. A previous analysis of 111 X-ray crystallographic structures of Zn proteins in the PDB [23] has revealed that in the active sites of Zn enzymes a coordination number of 4 (48%), 5 (44%), or 6 (6%) is typically encountered, whereas in structural sites these percentages are 79, 6, and 12%, respectively [23]. Hence, the typical mononuclear Zn sphere is tetracoordinated. It is also known that histidine, glutamate, aspartate, and cysteine are by far the most common amino acid ligands present in the primary coordination sphere of Zn metalloproteins (representing 96% of the total Zn-binding amino acids in the PDB), as demonstrated in the statistical analysis performed (Fig. 1) and suggested in previous studies [2,23].

Table II. Interaction Distances in Zinc Coordination Spheres Separated by Enzyme Class

                                                                  Zn–O (Asp and Glu)
Enzyme class         Zn–N (His)           Zn–S (Cys)           Z1 (0–2.5 Å)         Z2 (2.5–3.5 Å)       Z3 (>3.5 Å)
1. Oxidoreductases   2.08 ± 0.12 (760)    2.31 ± 0.11 (1066)   2.03 ± 0.16 (281)    2.92 ± 0.21 (211)    4.06 ± 0.27 (68)
2. Transferases      2.14 ± 0.14 (128)    2.32 ± 0.10 (427)    2.17 ± 0.18 (51)     2.67 ± 0.21 (21)     3.96 ± 0.48 (4)
3. Hydrolases        2.13 ± 0.15 (1814)   2.34 ± 0.13 (222)    2.11 ± 0.17 (1186)   3.07 ± 0.28 (772)    3.89 ± 0.30 (245)
4. Lyases            2.08 ± 0.14 (637)    2.30 ± 0.08 (89)     2.09 ± 0.18 (86)     3.01 ± 0.29 (58)     3.98 ± 0.31 (27)
5. Isomerases        2.16 ± 0.15 (73)     2.41 (2)             2.04 ± 0.15 (91)     3.14 ± 0.20 (58)     4.19 ± 0.14 (30)
6. Ligases           2.19 ± 0.14 (19)     2.40 ± 0.13 (65)     2.34 (2)             0                    4.11 (2)

The values represented are the average distances (in Å) plus or minus the standard deviation (in Å); the number of bonds measured for that class/atom combination is given in brackets.

Figure 8. Percentage of occurrence of distances between Zn and N atoms by enzyme class.

Figure 9. Relative percentage of Zn–N, Zn–S, and Zn–O interactions by enzyme class.


With this in mind, and representing both glutamate and aspartate residues by a generic carboxylate group, we end up with four different tetracoordinated Zn coordination spheres if a single carboxylate group per Zn complex is considered. These are His-His-His-Carboxylate, His-His-Cys-Carboxylate, His-Cys-Cys-Carboxylate, and Cys-Cys-Cys-Carboxylate. Models for these complexes were prepared and energy minimized, as described in the methodology section. The resulting geometries are presented in Figure 11.
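That exactly four coordination spheres result from these choices can be checked by enumeration: with one fixed carboxylate, the remaining three positions are an unordered selection, with repetition, from His and Cys. The sketch below lists them and maps each ligand to the model fragment named in the methodology section; the function and variable names are hypothetical.

```python
from itertools import combinations_with_replacement

# Model fragments used for each ligand type, as stated in the methodology.
MODEL_FRAGMENT = {
    "His": "imidazole",
    "Cys": "methylthiolate",
    "Carboxylate": "acetate",
}


def tetracoordinated_spheres(variable=("His", "Cys"), fixed="Carboxylate", n_variable=3):
    """Enumerate spheres with n_variable His/Cys ligands plus one fixed ligand."""
    return [
        "-".join(combo + (fixed,))
        for combo in combinations_with_replacement(variable, n_variable)
    ]


def model_fragments(sphere):
    """Translate a sphere label into the list of model fragments."""
    return [MODEL_FRAGMENT[ligand] for ligand in sphere.split("-")]
```

Here `tetracoordinated_spheres()` yields the four labels used in the text, and `model_fragments("His-His-Cys-Carboxylate")` gives two imidazoles, one methylthiolate, and one acetate.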

A comparison of the Zn-ligand bond lengths from the energy-minimized structures in the four coordination spheres studied reveals interesting trends, further highlighted in Figure 12. In fact, the average Zn–N(His) and Zn–S(Cys) bond lengths and the distances to the carboxylate oxygens (Zn–O1 and Zn–O2) are shown to be highly dependent on the proportion of His/Cys ligands. The higher the number of Cys residues in relation to the number of His residues, the longer the corresponding Zn bond lengths. This trend is in agreement with the fact that thiolate ligands are strong σ and π donors, whereas histidine is a weaker electron donor. Cysteines are thus able to stabilize to a greater extent the positive charge at the Zn centre, weakening the Zn-ligand bonds and increasing the corresponding bond lengths.

The Zn-ligand bond lengths calculated from our models (Fig. 12) are in good agreement with the average values determined from the statistical analysis performed (Table IV). In fact, the Zn–N(His), Zn–S(Cys), and Zn–O1(Carboxylate) average bond lengths determined from the QM study are all well within one standard deviation of the statistically derived mean values for each type of residue, and are in most cases clustered around the mean value. The Zn–O2 distance is a small exception, with one model (the Cys-Cys-Cys-Carboxylate) exhibiting a value that is outside this confidence interval. The specific nature of Zn–O2 interactions (frequently involving a dangling oxygen, as in the case of the Cys-Cys-Cys-Carboxylate model) can sometimes result in QM-derived bond lengths significantly longer than the experimental values obtained from the full enzymatic systems, particularly when the corresponding carboxylate groups can unequivocally be assigned as monodentate. This observation can be explained on the basis of the second-sphere effect, which in the enzyme can stabilize the position of the dangling oxygen through side-chain interactions and hydrogen bonds. In models encompassing only the first coordination sphere, this distance can normally be determined with relatively good accuracy for bidentate carboxylate groups, carboxylate ligands with a partially bidentate character, and moderately monodentate ligands (X-ray Zn–O2 distance below 3.2 Å). For distances beyond this value, the application of QM calculations to the type of models used in this study typically results in increased Zn–O2 distances, and ultimately in the conformational change of the carboxylate group towards anti coordination (Fig. 4) [25].

Table III. Interaction Distances in the Zinc Coordination Sphere Dependent on X-ray Crystallography Resolution

                                                                Zn–O (Asp and Glu)
Resolution (Å)   Zn–N (His)           Zn–S (Cys)           Z1 (0–2.5 Å)         Z2 (2.5–3.5 Å)       Z3 (>3.5 Å)
0.5–1.0          2.02 ± 0.04 (11)     0                    2.11 ± 0.15 (30)     3.13 ± 0.36 (11)     3.99 (1)
1.0–1.5          2.08 ± 0.09 (180)    2.33 ± 0.05 (90)     2.10 ± 0.16 (280)    3.00 ± 0.38 (114)    4.35 ± 0.28 (12)
1.5–2.0          2.10 ± 0.13 (1967)   2.32 ± 0.09 (732)    2.10 ± 0.16 (1527)   2.98 ± 0.36 (786)    4.17 ± 0.14 (65)
2.0–2.5          2.13 ± 0.16 (1470)   2.32 ± 0.13 (1094)   2.11 ± 0.18 (1396)   3.00 ± 0.38 (753)    4.17 ± 0.19 (140)

The values represented are the average distances (in Å) plus or minus the standard deviation (in Å); the number of bonds measured for that resolution/atom combination is given in brackets.

Figure 10. Percentage of occurrence of distances between Zn and N atoms by resolution.

Figure 11. Energy-minimized structures for the Zn coordination spheres His-His-His-Carboxylate, His-His-Cys-Carboxylate, His-Cys-Cys-Carboxylate, and Cys-Cys-Cys-Carboxylate. Most relevant Zn-ligand bond lengths indicated in Å. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Another interesting aspect to take into consideration is the change in the Zn-ligand distances upon alteration of the residues at the first coordination sphere. According to our calculations, a simple variation of the proportion of Cys/His ligand residues at the Zn centre can produce a change of up to 0.14 Å in the average Zn-His bond length, and of up to 0.12 Å in the average Zn–Cys bond length (Fig. 12). For the Zn–O1 distance the change is less significant. In fact, changing the first coordination sphere from three Cys and one carboxylate ligand to three His and one carboxylate ligand produces only a 0.10-Å change in the Zn–O1 bond length. However, the corresponding Zn–O2 distances change by up to 0.91 Å. These aspects suggest that carboxylate ligands can more easily accommodate changes at the metal coordination sphere through a balance between the Zn–O1 and Zn–O2 interactions. To a greater extent, this property of carboxylate ligands has been shown to be involved in the stabilization of more dramatic changes at the metal coordination sphere, such as the ones arising from ligand entrance or exit processes, through monodentate to bidentate interconversion (carboxylate shift) [16,18,25].

In this study we have analyzed simple QM tetracoordinated models, with one single carboxylate ligand (Asp or Glu) and three amino acid residues that are either His or Cys. However, when moving from these simple models to real Zn biological systems, such as enzymes and Zn metalloproteins, the number of variables acting upon the metal coordination sphere is much larger. The second coordination sphere, the enzyme backbone, the electrostatic environment of the enzyme, and the presence of solvent molecules, to name just a few examples, are all factors that diversify the specific characteristics surrounding each Zn centre and that ultimately determine the adopted geometry. What is curious to observe from our statistical analysis (Figs. 7 and 8) is that, despite all these different aspects acting upon the metal sphere, the variability of the Zn-N and Zn-S bond-lengths in all the X-ray crystallographic structures tested is remarkably small (standard deviation of 0.14 and 0.11 Å for Zn-His and Zn-Cys bonds, respectively), at the level of dispersion that one would expect simply from the existence of different combinations of ligands at the first coordination sphere. The first coordination sphere effect is hence paramount.
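The kind of summary statistic discussed here (average distance, standard deviation, and number of bonds) can be sketched in a few lines of Python. This is an illustration only: the input distances are invented, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(distances):
    """Return (mean, sample standard deviation, count) for bond lengths in angstrom."""
    return mean(distances), stdev(distances), len(distances)


def format_entry(distances):
    """Format a summary as 'average ± standard deviation (number of bonds)'."""
    avg, sd, n = summarize(distances)
    return f"{avg:.2f} ± {sd:.2f} ({n})"


# Hypothetical Zn-N(His) distances in angstrom, invented for illustration.
example = [2.05, 2.10, 2.15, 2.20, 2.00]
print(format_entry(example))
```

Applying the same function separately to each ligand type and resolution range would give entries in the "average ± standard deviation (count)" layout used for the tabulated distances.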

Carefully evaluating Zn-carboxylate interactions is a thorny task,18,25 mainly because of the different possibilities of coordination of this type of ligand that can emerge as a consequence of the carboxylate-shift mechanism. These aspects have been extensively analyzed quantum mechanically elsewhere.25 To further analyze Zn-His and Zn-Cys coordination, for each of the four models initially prepared, a scan along a Zn-His and a Zn-Cys distance was performed (where the corresponding ligand is present), resulting in six full scans, as illustrated in Table IV.

Table IV. Combinations Tested in the QM Scans Performed

No.  Combination               Variation (scan of Zn-X bond)
1    His-His-His-Carboxylate   Zn-N(His)
2    His-His-Cys-Carboxylate   Zn-N(His)
3    His-Cys-Cys-Carboxylate   Zn-N(His)
4    Cys-His-His-Carboxylate   Zn-S(Cys)
5    Cys-Cys-His-Carboxylate   Zn-S(Cys)
6    Cys-Cys-Cys-Carboxylate   Zn-S(Cys)

Figure 12. Average bond-lengths obtained in the energy minimized structures for the four combinations of ligands studied.

Figure 13 represents the approximate values for the Gibbs activation and reaction energies calculated from the set of six scans performed. The results show that for both His and Cys displacement, the activation and reaction energies tend to decrease as the proportion of Cys in the two variable amino acid positions increases. Hence, for the His displacement studies, the Gibbs activation and reaction energies are the smallest for the case of His-Cys-Cys-Carboxylate, intermediate for the system His-His-Cys-Carboxylate, and maximal for the combination His-His-His-Carboxylate. The same trend is observed for the Cys-X-X-Carboxylate systems, but in these cases the values are typically higher than in the corresponding His-X-X-Carboxylate systems, which is in agreement with the fact that Cys displacement leads to the formation of a noncoordinated negatively charged thiolate, whereas His displacement yields an unbound neutral amino acid residue. Hence, in both types of displacement processes studied, the relative values determined for the activation and reaction energies are in agreement with the proportion of His/Cys ligands, and with the specific nature of these amino acid residues, as discussed earlier.
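One common way to extract approximate activation and reaction energies from a relaxed scan along a bond distance is sketched below in Python. It is an assumption, not a description of the exact procedure used for Figure 13: the barrier is taken as the highest point of the profile relative to the first (reactant) point, and the reaction energy as the last point relative to the first. The function name and the energy profile are hypothetical, and no thermal or entropic corrections are included.

```python
def barrier_and_reaction_energy(energies):
    """Estimate (activation, reaction) energies from a scan profile.

    energies: energies along the scan in kcal/mol, ordered from the
    reactant minimum (first point) to the displaced ligand (last point).
    """
    if len(energies) < 2:
        raise ValueError("a scan needs at least two points")
    reactant = energies[0]
    activation = max(energies) - reactant  # highest point relative to reactant
    reaction = energies[-1] - reactant     # end point relative to reactant
    return activation, reaction


# Hypothetical scan profile (kcal/mol), invented for illustration.
profile = [0.0, 3.0, 9.5, 14.0, 12.0, 11.0]
activation, reaction = barrier_and_reaction_energy(profile)
```

With this convention, a positive reaction energy corresponds to an endothermic ligand exit.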

A notable aspect of the set of QM calculations performed is the high activation barriers and the endothermic nature of almost all processes (the exception is His exit in the His-Cys-Cys-Carboxylate system). The flexible nature characteristic of Zn coordination spheres normally yields much smaller activation barriers for ligand exit (typically below 10 kcal/mol), particularly when a carboxylate group is present in the first coordination sphere. This can be seen in a variety of studies involving the same general type of Zn models and the same methodology (B3LYP/SDD).16,18,25,30,31 The present set of results for His and Cys coordination demonstrates that Zn coordination by these ligands is significantly stronger than coordination by small molecule ligands, such as the ones typically involved in catalysis (e.g. water25). His and Cys are thus the main residues responsible for the structural stabilization of the metal ion, whereas Asp and Glu residues can play an important role in controlling the chemical reactivity of the Zn centre.

CONCLUSIONS

The set of results presented in this study has enabled a fresh analysis of the PDB content in terms of Zn-ligand interactions, providing valuable guidelines for the study of biological Zn systems. The study has shown that the most common amino acid residues present at the metal coordination sphere in Zn metalloenzymes are by far His (48%), Cys (26%), Asp (15%), and Glu (7%). The other amino acids have only a residual contribution in direct Zn interactions (less than 4%). The average Zn-ligand interaction distance is 2.11 Å for His, 2.09 Å for Asp and Glu, and 2.32 Å for Cys. For higher resolution structures the average distance is visibly shorter for Zn-His interactions, but remains roughly the same in the case of Zn-Cys, Zn-Asp, and Zn-Glu interactions. The study also reveals an important asymmetry in the proportion of the various residues for the different classes of Zn enzymes. Water, an almost universal hallmark of catalytic Zn binding sites, coordinates the metal atom with an average bond-length of 2.22 Å.
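The average distances summarized above lend themselves to a simple lookup when checking a modelled Zn site against the PDB-wide values. The Python fragment below is a minimal sketch under stated assumptions: the dictionary keys (three-letter residue codes, with HOH for water) and the function name are choices made here for illustration, while the distances themselves are the averages quoted in the text.

```python
# Average Zn-ligand distances (angstrom) quoted in the conclusions.
# Keys are three-letter residue codes; "HOH" stands for water.
PDB_AVERAGE_ZN_DISTANCE = {
    "HIS": 2.11,
    "ASP": 2.09,
    "GLU": 2.09,
    "CYS": 2.32,
    "HOH": 2.22,
}


def deviation_from_average(residue, distance):
    """Return the signed difference (angstrom) between a Zn-ligand distance
    and the quoted average for that ligand type."""
    return distance - PDB_AVERAGE_ZN_DISTANCE[residue.upper()]


# Hypothetical modelled Zn-S(Cys) distance, invented for illustration.
delta = deviation_from_average("Cys", 2.40)
```

A positive value means the modelled bond is longer than the quoted average for that ligand type.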

The use of quantum mechanical calculations on simple models of the metal coordination sphere yielded additional insight into the structure of Zn coordination spheres in biological systems. In particular, our calculations have shown that the alterations caused in the Zn-ligand bond-lengths in model systems by simply changing the proportion of His/Cys residues at the metal coordination sphere are basically at the same level as the dispersion observed for the total of 10,776 Zn-ligand interactions in the 994 X-ray structures of Zn metalloproteins in the PDB with a resolution better than 2.5 Å. Aspects like the second coordination sphere, the enzyme backbone, the electrostatic environment of the enzyme, or the presence of solvent molecules all play an almost trifling role in Zn-ligand bond-lengths, in comparison with the effect caused by the combination of residues at the first coordination sphere.

Altogether, these results provide useful data for the enhancement of the atomic models normally applied to the theoretical and computational study of zinc enzymes at the quantum mechanical level, and for the development of molecular mechanical parameters for the treatment of zinc coordination spheres with molecular mechanics or molecular dynamics in studies with the full enzyme.

Figure 13. Estimated values for the Gibbs activation and reaction energies for ligand exit.

REFERENCES

1. Vallee BL, Auld DS. Active-site zinc ligands and activated H2O of zinc enzymes. Proc Natl Acad Sci USA 1990;87:220–224.

2. Vallee BL, Auld DS. Zinc coordination, function, and structure of zinc enzymes and other proteins. Biochemistry 1990;29:5647–5659.

3. Lipscomb WN, Sträter N. Recent advances in zinc enzymology. Chem Rev 1996;96:2375–2434.

4. Andreini C, Banci L, Bertini I, Rosato A. Counting the zinc-proteins encoded in the human genome. J Proteome Res 2006;5:196–201.

5. Coleman JE. Zinc enzymes. Curr Opin Chem Biol 1998;2:222–234.

6. Hahn S. Structure and mechanism of the RNA polymerase II transcription machinery. Nat Struct Mol Biol 2004;11:394–403.

7. Bushnell DA, Westover KD, Davis R, Kornberg RD. Structural basis of transcription: an RNA polymerase II-TFIIB cocrystal at 4.5 Angstroms. Science 2004;303:983–988.


8. Donaldson IM, Friesen JD. Zinc stoichiometry of yeast RNA polymerase II and characterization of mutations in the zinc-binding domain of the largest subunit. J Biol Chem 2000;275:13780–13788.

9. Gnatt AL, Fu J, Bushnell DA, Kornberg RD. Structural basis of transcription: an RNA polymerase II elongation complex at 3.3 Å resolution. Science 2001;292:1876–1882.

10. Cramer P, Bushnell DA, Kornberg RD. Structural basis of transcription: RNA polymerase II at 2.8 Angstrom resolution. Science 2001;292:1863–1876.

11. Brinckerhoff CE, Matrisian LM. Matrix metalloproteinases: a tail of a frog that became a prince. Nat Rev Mol Cell Biol 2002;3:207–214.

12. Egeblad M, Werb Z. New functions for the matrix metalloproteinases in cancer progression. Nat Rev Cancer 2002;2:161–174.

13. Coussens LM, Fingleton B, Matrisian LM. Matrix metalloproteinase inhibitors and cancer: trials and tribulations. Science 2002;295:2387–2392.

14. Long SB, Casey PJ, Beese LS. Reaction path of protein farnesyltransferase at atomic resolution. Nature 2002;419:645–650.

15. Park HW, Boduluri SR, Moomaw JF, Casey PJ, Beese LS. Crystal structure of protein farnesyltransferase at 2.25 Angstrom resolution. Science 1997;275:1800–1804.

16. Sousa SF, Fernandes PA, Ramos MJ. Farnesyltransferase: theoretical studies on peptide substrate entrance - thiol or thiolate coordination? J Mol Struct (THEOCHEM) 2005;729:125–129.

17. Sousa SF, Fernandes PA, Ramos MJ. Unraveling the mechanism of the farnesyltransferase enzyme. J Biol Inorg Chem 2005;10:3–10.

18. Sousa SF, Fernandes PA, Ramos MJ. Farnesyltransferase - new insights into the zinc-coordination sphere paradigm: evidence for a carboxylate-shift mechanism. Biophys J 2005;88:483–494.

19. Tobin DA, Pickett JS, Hartman HL, Fierke CA, Penner-Hahn JE. Structural characterization of the zinc site in protein farnesyltransferase. J Am Chem Soc 2003;125:9962–9969.

20. Cricco JA, Vila AJ. Class B β-lactamases: the importance of being metallic. Curr Pharm Des 1999;5:915–927.

21. Heinz U, Adolph HW. Metallo-β-lactamases: two binding sites for one catalytic metal ion? Cell Mol Life Sci 2004;61:2827–2839.

22. McCall KA, Huang CC, Fierke CA. Function and mechanism of zinc metalloenzymes. J Nutr 2000;130:1437–1446.

23. Alberts IL, Nadassy K, Wodak SJ. Analysis of zinc binding sites in protein crystal structures. Protein Sci 1998;7:1700–1716.

24. Bernstein FC, Koetzle TF, Williams GJB, Meyer EF, Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M. Protein data bank - computer-based archival file for macromolecular structures. J Mol Biol 1977;112:535–542.

25. Sousa SF, Fernandes PA, Ramos MJ. The carboxylate-shift in zinc enzymes: a computational study. J Am Chem Soc 2007;129:1378–1385.

26. Ferraroni M, Rypniewski W, Wilson KS, Viezzoli MS, Banci L, Bertini I, Mangani S. The crystal structure of the monomeric human SOD mutant F50E/G51E/E133Q at atomic resolution. The enzyme mechanism revisited. J Mol Biol 1999;288:413–426.

27. Zhou Z, Song X, Li Y, Gong W. Unique structural characteristics of peptide deformylase from pathogenic bacterium Leptospira interrogans. J Mol Biol 2004;339:207–215.

28. Mitsuhashi S, Mizushima T, Yamashita E, Yamamoto M, Kumasaka T, Moriyama H, Ueki T, Miyachi S, Tsukihara T. X-ray structure of β-carbonic anhydrase from the red alga, Porphyridium purpureum, reveals a novel catalytic site for CO(2) hydration. J Biol Chem 2000;275:5521–5526.

29. Bae E, Phillips GN, Jr. Structures and analysis of highly homologous psychrophilic, mesophilic, and thermophilic adenylate kinases. J Biol Chem 2004;279:28202–28208.

30. Sousa SF, Fernandes PA, Ramos MJ. Theoretical studies on farnesyltransferase: evidence for thioether product coordination to the active-site zinc sphere. J Comput Chem 2007;28:1160–1168.

31. Sousa SF, Fernandes PA, Ramos MJ. Theoretical studies on farnesyltransferase: the distances paradox explained. Proteins 2007;66:205–218.

32. Siegbahn PEM. Theoretical study of the substrate mechanism of ribonucleotide reductase. J Am Chem Soc 1998;120:8417–8429.

33. Melo A, Ramos MJ, Floriano WB, Gomes JANF, Leao JFR, Magalhaes AL, Maigret B, Nascimento MC, Reuter N. Theoretical study of arginine–carboxylate interactions. J Mol Struct (THEOCHEM) 1999;463:81–90.

34. Ryde U. Carboxylate binding modes in zinc proteins: a theoretical study. Biophys J 1999;77:2777–2787.

35. Fernandes PA, Ramos MJ. Theoretical studies on the mechanism of inhibition of ribonucleotide reductase by (E)-2′-fluoromethylene-2′-deoxycitidine-5′-diphosphate. J Am Chem Soc 2003;125:6311–6322.

36. Fernandes PA, Ramos MJ. Theoretical studies on the mode of inhibition of ribonucleotide reductase by 2′-substituted substrate analogues. Chem Eur J 2003;9:5916–5925.

37. Lee C, Yang W, Parr RG. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys Rev B 1988;37:785–789.

38. Becke AD. Density-functional thermochemistry. III. The role of exact exchange. J Chem Phys 1993;98:5648–5652.

39. Ziegler T. Approximate density functional theory as a practical tool in molecular energetics and dynamics. Chem Rev 1991;91:651–667.

40. Bauschlicher CW. A comparison of the accuracy of different functionals. Chem Phys Lett 1995;246:40–44.

41. Holthausen MC, Mohr M, Koch W. The performance of density functional/Hartree-Fock hybrid methods: the bonding in cationic first-row transition metal methylene complexes. Chem Phys Lett 1995;240:245–252.

42. Ricca A, Bauschlicher CW. A comparison of density functional theory with ab initio approaches for systems involving first transition row metals. Theor Chim Acta 1995;92:123–131.

43. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Montgomery JA, Vreven T, Kudin KN, Burant JC, Millam JM, Iyengar SS, Tomasi J, Barone V, Mennucci B, Cossi M, Scalmani G, Rega N, Petersson GA, Nakatsuji H, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Klene M, Li X, Knox JE, Hratchin HP, Cross JB, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Ayala PY, Morokuma K, Voth GA, Salvador P, Dannenberg JJ, Zakrzewski VG, Dapprich S, Daniels AD, Strain MC, Farkas O, Malik DK, Rabuck AD, Raghavachari K, Foresman JB, Ortiz JV, Cui Q, Baboul AG, Clifford S, Cioslowski J, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Martin RL, Fox DJ, Keith T, Al-Lahan A, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Gonzalez C, Pople JA. Gaussian 03, Revision C.02. Wallingford, CT: Gaussian; 2004.

44. Dolg M, Wedig U, Stoll H, Preuss H. Energy-adjusted ab initio pseudopotentials for the first row transition elements. J Chem Phys 1987;86:866–872.

45. Andrae D, Haussermann U, Dolg M, Stoll H, Preuss H. Energy-adjusted ab initio pseudopotentials for the second and third row transition elements. Theor Chim Acta 1990;77:123–141.

46. Frenking G, Antes I, Bohme M, Dapprich S, Ehlers AW, Jonas V, Neubaus A, Otto M, Stegmann R, Veldkamp A, Vyboishchikov SF. Pseudopotential calculations of transition metal compounds: scope and limitations. Rev Comput Chem 1996;8:63–144.

47. Bock CW, Katz AK, Glusker JP. Hydration of zinc ions: a comparison with magnesium and beryllium ions. J Am Chem Soc 1995;117:3754–3765.
