Structure of N-acetyl-β-D-glucosaminidase (GcnA) from the Endocarditis Pathogen Streptococcus gordonii and its Complex with the Mechanism-based Inhibitor NAG-thiazoline

David B. Langley1*†, Derek W.S. Harty2†, Nicholas A. Jacques2, Neil Hunter2, J. Mitchell Guss1 and Charles A. Collyer1

1School of Molecular and Microbial Biosciences, University of Sydney, Sydney, Australia
2Institute of Dental Research, Westmead Millennium Institute, Westmead, Australia

Received 20 July 2007; accepted 11 September 2007
Available online 16 September 2007

doi:10.1016/j.jmb.2007.09.028 J. Mol. Biol. (2008) 377, 104–116

The crystal structure of GcnA, an N-acetyl-β-D-glucosaminidase from Streptococcus gordonii, was solved by multiple wavelength anomalous dispersion phasing using crystals of selenomethionine-substituted protein. GcnA is a homodimer with subunits each comprised of three domains. The structure of the C-terminal α-helical domain has not been observed previously and forms a large dimerisation interface. The fold of the N-terminal domain is observed in all structurally related glycosidases although its function is unknown. The central domain has a canonical (β/α)8 TIM-barrel fold which harbours the active site. The primary sequence and structure of this central domain identifies the enzyme as a family 20 glycosidase. Key residues implicated in catalysis have different conformations in two different crystal forms, which probably represent active and inactive conformations of the enzyme. The catalytic mechanism for this class of glycoside hydrolase, where the substrate rather than the enzyme provides the cleavage-inducing nucleophile, has been confirmed by the structure of GcnA complexed with a putative reaction intermediate analogue, N-acetyl-β-D-glucosamine-thiazoline. The catalytic mechanism is discussed in light of these and other family 20 structures.

Crown Copyright © 2007 Published by Elsevier Ltd. All rights reserved.

Edited by R. Huber

Keywords: GcnA; glucosaminidase; family 20 glycoside hydrolase; substrate-assisted catalysis; Streptococcus gordonii; infective endocarditis

*Corresponding author. E-mail address: [email protected].
† D.B.L. and D.W.S.H. contributed equally to this work.
Abbreviations used: GcnA, N-acetyl glucosidase from Streptococcus gordonii; MAD, multiple wavelength anomalous dispersion; NAG, N-acetyl-β-D-glucosamine; TIM, triose-phosphate isomerase; NGT, NAG-thiazoline; PDB, Protein Data Bank; PEG, polyethylene glycol; Se-Met, selenomethionine.

Introduction

Streptococcus gordonii is a primary coloniser of the oral cavity and contributes to the maintenance of a healthy oral flora.1 However, should it gain access to the bloodstream it can colonise the heart valve in susceptible individuals resulting in infective endocarditis, an often fatal disease.2 The endocardial vegetation comprises a complex mix of activated platelets and serum components enmeshed in a fibrin matrix. Such an environment, whilst hiding colonising bacteria from the immune system, may also shield them from sugars abundant in the circulatory system. Survival in such a niche is thought to be mediated by a series of glycoside hydrolases capable of releasing dietary carbohydrates from host glycoproteins. Indeed, several S. gordonii gene clusters containing glycoside hydrolases have been identified as being up-regulated in the context of an animal model of the disease.3 One of these gene clusters, the gom regulon, contains 15 open reading frames, five of which sequence homology identifies as sugar degrading enzymes: a fucosidase; two mannosidases; a β-glucosidase;
and an N-acetyl-β-D-glucosaminidase (GcnA, also referred to as BhsA3). GcnA has been cloned, sequenced, overexpressed, biochemically characterized,4 and crystallized.5 The overexpressed protein is a homodimer with exo-glucosidase activity. The enzyme cleaves N-acetyl-β-D-glucosamine (NAG) and N-acetyl-β-D-galactosamine residues from 4-methylumbelliferylated (4MU) substrates, as well as cleaving NAG from chito-oligosaccharides (i.e. NAG polymers). In contrast, sulphated forms of the substrate are not cleaved and act instead as mild competitive inhibitors. Additionally, the enzyme is known to be poisoned by several first-row transition metals as well as by mercury.

Based on its primary amino acid sequence, the 627 amino acid GcnA is classified as a family 20 glycosidase. At the time of writing there were 108 different families of glycoside hydrolases which, when structural information is considered, can be more broadly classified into 14 clans‡.6,7 Four of these clans, including clan K to which family 20 enzymes belong, have a (β/α)8 triose-phosphate isomerase (TIM)-barrel fold. Structures of family 20 glycosidases already known include a chitobiase from Serratia marcescens (SmCHB),8 as well as N-acetyl-glucosaminidases from Actinobacillus actinomycetemcomitans (AaDspB),9 Streptomyces plicatus (SpHEX),10 and Homo sapiens (HsHexB and HsHexA).11,12 The human genes are clinically important as mutations in either hexA or hexB prevent the normal degradation of GM2-gangliosides resulting in Tay-Sachs and Sandhoff diseases, respectively, both of which are devastating neurodegenerative disorders usually resulting in early childhood death.

‡ See Carbohydrate-Active enzymes website; http://afmb.cnrs-mrs.fr/CAZY

Mechanistically, most glycoside hydrolases cleave sugars by employing a pair of acidic active site residues. In the simplest case, one of these residues acts as a general base, abstracting a proton from a water molecule to facilitate nucleophilic attack on the anomeric carbon (Figure 1(a)). The second residue donates a proton to the newly formed hydroxyl group of the product. In this case the stereochemistry of the released sugar is inverted. In a second mechanism an acidic residue, acting as a base, itself acts as the initial nucleophile, generating a reaction intermediate in which the enzyme is covalently attached to the terminal sugar (Figure 1(b) and (c)). In the second half of the reaction cycle the other residue acts as a general base, activating a water molecule to facilitate nucleophilic attack on the anomeric carbon, releasing the terminal sugar from the enzyme (Figure 1(c)). In this case the stereochemistry of the released product sugar is retained. The family 20 glycosidases are thought to act via a third mechanism, in which the primary nucleophile is not provided by solvent or the enzyme, but by the substrate itself.8 In the case of a substrate such as a polymer of NAG, the carbonyl oxygen of the N-acetyl moiety acts as the nucleophile and the enzyme-stabilized intermediate consists of a bicyclic oxazolinium ion (Figure 1(d) and (e)). Hydrolysis of this intermediate to a product where the stereochemistry is retained once again involves nucleophilic attack from an activated water molecule. This “substrate-assisted” catalytic mechanism has been substantiated in part by the observation that mimics of the bicyclic intermediate, such as NAG-thiazoline (NGT, Figure 1(f)), are potent inhibitors of family 20 enzymes.10–12

In order to further characterize the gom-operon derived N-acetyl-β-D-glucosaminidase activity from S. gordonii, we have solved the structure of GcnA in two different forms, corresponding to what appear to be active and inactive conformations of the enzyme, raising the possibility of dynamic substrate capture being a feature of catalysis. In addition, the structure of GcnA crystallized in the presence of NGT places this putative reaction intermediate mimic in the active site, confirming the classification of GcnA as a family 20 enzyme, and allowing the catalytic mechanism to be scrutinized. The role of a novel dimerisation domain for the function of GcnA has been investigated by mutagenesis. The mechanism of GcnA poisoning by mercury has been investigated by the structure analysis of a mercury-derivatised crystal. The catalytic mechanism is discussed in light of the ensemble of structures presented here, and other structurally characterized glycoside hydrolases.

Figure 1. Glycoside hydrolase reaction schemes. (a) A locally generated hydroxide ion acts as the nucleophile. (b) and (c) An enzyme-derived acid acts as the nucleophile yielding an adducted intermediate. (d) and (e) Substrate-assisted catalysis where the acetamido oxygen of the terminal NAG acts as the nucleophile yielding a bicyclic oxazolinium ion intermediate. (f) NAG-thiazoline (NGT) is a relatively stable analogue of the predicted oxazolinium ion.

Results

Domain architecture

The structure of GcnA can be partitioned into three domains (Figure 2(a)). The central domain is a TIM-barrel and harbours the active site.13 The structure of GcnA has been refined in two crystal forms (P2₁2₁2 = form I, and I222 = form II). In both cases the enzyme molecules are packed within the crystal lattice as dimers (Figure 2(b)). Whilst the form I crystal contains a dimer in the asymmetric unit, the form II crystal contains a monomer in the asymmetric unit and the dimer is created by rotation about a crystallographic 2-fold axis (i.e. both monomers of the dimer are identical). The global folds of the five structures presented are the same, although some minor yet significant conformational differences will be discussed. Electron density was consistently absent for the N-terminal Met residue, which is consistent with mass spectroscopy (data not shown) and N-terminal sequencing data4 indicating its removal. Otherwise, the ensemble of structures allows the main-chain for the remaining 626 amino acids to be traced, although short segments of certain structures were not resolved (Table 1) and will be discussed explicitly. Seven cysteine residues are present in each monomer of the dimer, but no disulphide bonds are observed. The only metal ions located in any of the structures are in a form II crystal poisoned with a mercury salt prior to data collection (discussed later).

The N-terminal domain (domain I, residues 2–82) has an α+β fold comprising five strands of (mostly) parallel β-sheet which sandwich two approximately parallel α-helices against the outer helical surface of the TIM-barrel (domain II), and against a helix which connects domains II and III (Figure 2(a), blue ribbon). These N-terminal domains are at the extreme outside ends of the dimer (Figure 2(b)) and are not directly involved in the interface. In the form I crystals, residues 33 to 36 of both chains are poorly resolved and in some cases have not been modelled (Table 1). The average B-factors for this domain are larger than those of domains II and III (Table 1).

Domain II (residues 83–400) has a classic (β/α)8 TIM barrel fold with the active site located at the C-terminal ends of the β-strands (Figure 2(a), green and yellow ribbon). Six of the loops at the active-site end of the barrel interact with the partner monomer. Well resolved α-helices adorn all eight positions defining the outer edge of the barrel, which is in contrast to the other reported family 20 glycosidase structures where helices five and seven are replaced by sheet-like segments. Interestingly, the two GcnA crystal forms refined display different conformations for part of the TIM-barrel fold, which have implications with regards to the enzyme mechanism (discussed below).

The C-terminal domain (domain III, residues 401–627) is predominantly α-helical and forms a significant part of the dimer interface (Figure 2(a), red ribbon). Within this domain, five prominent α-helical segments (one segment being broken by a loop region of residues 428–441) are stacked approximately side-by-side in an antiparallel arrangement, the ends of which rest against the TIM-barrel of the neighbouring molecule (Figure 2(b)). In the context of the dimer, the helix bundle from one monomer stacks against the same feature of the other subunit. The peptide chain emerging from the last of the α-helices in each molecule winds back into the structure and makes extensive contact with the dimer partner such that the C-terminal residue (T627) projects into the extended volume of the substrate-binding pocket of the partner monomer (Figure 3(a), blue surface). Although each active-site pocket is in part lined by a contribution from the dimer partner, no evidence of cooperativity was observed in assays using low concentrations of a synthetic substrate (data not shown).

Figure 2. GcnA domain architecture. (a) GcnA monomer with the N-terminal domain (blue), the central TIM-barrel domain (green) and its β-sheets (yellow), and the C-terminal domain (red). The right-hand panel looks down the approximately 8-fold rotational axis of the TIM-barrel. (b) GcnA dimer with one subunit uniformly coloured (tan) and the other coloured as per domain (as in (a)). The right-hand panel looks down the 2-fold dimerisation axis. Immediately straddling this axis are residues G564 and G565 (coloured black in each subunit). (c) The structurally homologous CpNagJ dimer, orientated as for (b). Domains I, II and III for one subunit are coloured cyan, green and orange, respectively. The other subunit is coloured pink.

Dimer interface and active site location

Both gel filtration chromatography4 and the packing of molecules within the crystals show GcnA to be a dimer. The largely hydrophobic dimer interface buries a surface area of ∼3500 Å² per monomer. Only ∼10% of this buried surface is the result of contacts directly between the TIM-barrel domains. Hence, dimerisation is largely the result of domain III interactions. Approximately equivalent surface area is buried as a result of direct interactions between domains III in the two subunits and trans interactions between each dimerisation domain and the TIM-barrel domain in the other subunit. Disruption of the dimer was easily achieved by mutating G564 and G565 to Glu and Asp, respectively. G564 and G565 are positioned on the domain III α-helix, which packs against the same helix from the other subunit across the 2-fold axis of the dimer (Figure 2(b), black ribbon). When overexpressed and purified, the double mutant protein migrates as a monomer during gel filtration chromatography (Supplementary Data) and exhibits no enzyme activity (data not shown), implying that catalysis is contingent upon dimer formation.

In the context of the dimer (overall dimensions ∼125 Å × 70 Å × 60 Å), the active sites are positioned ∼35 Å apart at the base of solvent filled cavities (∼15 Å in diameter at the surface, and ∼15 Å deep) (Figure 3(a)). At the base of each cavity is a small hydrophobic pocket lined by the side-chains of four tryptophan residues (W267, W307, W340 and W374). Directly adjacent to this pocket are the conserved acids D223 and E224 implicated in catalysis.4 Although each active site is located at the base of its own cavity, a solvent filled tunnel (∼12 Å by 7 Å wide and 5 Å long) connects the two cavities. In spite of the presence of this tunnel, the pit-like nature of the substrate binding pocket is consistent with enzyme kinetics data4 showing that GcnA is an exo-glycosidase.

Table 1. Data collection and refinement statistics

| | GcnA-II-Se | GcnA-II | GcnA-II-Hg | GcnA-I | GcnA-I-NGT |
|---|---|---|---|---|---|
| A. Data collection | | | | | |
| X-ray source | APS 23ID-D | APS 23ID-D | Rotating anode | MAX-lab I711 | Rotating anode |
| Wavelength (Å) | 0.97928 | 0.97928 | 1.5418 | 1.0880 | 1.5418 |
| Space group | I222 | I222 | I222 | P2₁2₁2 | P2₁2₁2 |
| Unit-cell dimensions a, b, c (Å) | 123.9, 124.5, 148.5 | 123.6, 124.5, 147.8 | 123.0, 124.7, 147.9 | 110.0, 112.5, 104.0 | 107.6, 112.3, 103.1 |
| Molecules/asu | 1 | 1 | 1 | 2 | 2 |
| Resolution range (Å) | 1.85–39.97 | 1.40–29.4 | 2.04–14.95 | 1.56–34.88 | 1.61–15.0 |
| No. of unique reflections | 97,524 | 221,868 | 71,742 | 170,106 | 161,033 |
| Completeness (%) | 99.6 (95.6)ᵃ | 99.9 (99.4) | 99.3 (94.3) | 96.9 (77.6) | 99.3 (94.4) |
| Redundancy | 7.4 (6.7) | 9.8 (7.3) | 8.1 (7.1) | 3.6 (4.3) | 9.1 (8.4) |
| Average I/sig(I) | 13.4 (3.24) | 17.8 (2.53) | 18.9 (3.74) | 11.7 (4.36) | 20.9 (3.51) |
| Rmergeᵇ | 0.10 (0.65) | 0.090 (0.50) | 0.064 (0.35) | 0.093 (0.43) | 0.064 (0.50) |
| Wilson B value (Å²) | 26.7 | 18.5 | 26.6 | 19.9 | 18.2 |
| B. Refinement | | | | | |
| No. of non-H atoms | 5245 | 5561 | 5102 | 10,605 | 10,974 |
| Residues not included in the model | 276–287, 627 | 276–287 | 273–287, 311–329 | B33–B36, B272–B292 | A33–A35, B33–B35 |
| Water molecules | 217 | 463 | 215 | 605 | 777 |
| Other compounds | Sulfate (7) | Sulfate (8), glycerol (1) | Sulfate (7), glycerol (1) | Acetate (2) | NGT (2) |
| r.m.s.d. bond lengths (Å) | 0.010 | 0.011 | 0.011 | 0.011 | 0.011 |
| r.m.s.d. bond angles (°) | 1.22 | 1.30 | 1.25 | 1.27 | 1.33 |
| r.m.s.d. peptide planarity (°) | 7.30 | 5.79 | 5.87 | 5.80 | 5.83 |
| Ramachandran residues inᶜ: favoured regions (%) | 97.7 | 98.0 | 96.8 | 97.5 | 97.3 |
| Ramachandran residues inᶜ: outlier regions (%) | 0.5 | 0.3 | 0.2 | 0.2 | 0.2 |
| ESU (Å), based on max likelihoodᵈ | 0.084 | 0.033 | 0.097 | 0.074 | 0.055 |
| Mean B overall (Å²) | 30.9 | 20.7 | 40.6 | 19.1 | 20.0 |
| Mean B domain I (res 1–82) (Å²) | 38.0 | 28.5 | 51.0 | 24.8 | 25.3 |
| Mean B domain II (res 82–400) (Å²) | 33.2 | 22.2 | 43.4 | 19.3 | 20.6 |
| Mean B domain III (res 401–627) (Å²) | 32.3 | 22.0 | 42.1 | 20.8 | 22.1 |
| Rᵉ | 0.215 | 0.186 | 0.199 | 0.210 | 0.167 |
| Rfreeᶠ | 0.235 | 0.199 | 0.237 | 0.249 | 0.202 |
| PDB code | 2EPK | 2EPL | 2EPM | 2EPO | 2EPN |

a Values in parentheses are for the highest resolution shell.
b Rmerge = ∑|Ih – ⟨Ih⟩| / ∑⟨Ih⟩.
c Calculated using the MOLPROBITY server.29
d Diffraction-component precision index output by REFMAC5.28
e R values = ∑|Fobs – Fcalc| / ∑Fobs.
f 5% of the reflections were reserved for the calculation of Rfree.
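The agreement statistics defined in the Table 1 footnotes can be illustrated with a short sketch. The Python below is not the authors' data-processing or refinement pipeline; it only evaluates the footnote expressions for Rmerge and R on whatever lists of intensities or amplitudes it is given. The function names are hypothetical, no scale factor between Fobs and Fcalc is applied, and the regular stride used to reserve 5% of reflections is an assumption made to keep the example deterministic (refinement programs normally select the test set at random).

```python
from statistics import mean


def r_merge(observations):
    """Rmerge = sum |I_h - <I_h>| / sum <I_h>, summed over all measurements.

    `observations` maps a reflection index h to the list of its repeated
    intensity measurements; <I_h> is the mean of that list.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        avg = mean(intensities)
        numerator += sum(abs(i - avg) for i in intensities)
        denominator += avg * len(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum |Fobs - Fcalc| / sum Fobs (no scale factor applied)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def split_free_set(n_reflections, every=20):
    """Reserve every 20th reflection (5%) as a test set for Rfree.

    Returns (work_indices, free_indices). A fixed stride is an assumption
    of this sketch, not the procedure used in the paper.
    """
    free = [i for i in range(n_reflections) if i % every == 0]
    work = [i for i in range(n_reflections) if i % every != 0]
    return work, free
```

Rfree is then the same R expression evaluated only over the reserved reflections, which are excluded from refinement, while the working R uses the remaining 95%.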

Two crystal forms and their active site conformations

The primary structural differences between the two GcnA crystal forms involve active site residues in the TIM-barrel domain. Superposition of the structures reveals positional differences for the C-terminal ends of strands β4 and β5, the N-terminal end of helix α5, and the βα-loop connecting strand β5 and helix α5. Strand β4 is of particular note as it carries D223 and E224, believed to be critical to the catalytic mechanism. In both subunits of the form I crystals, the carboxylate groups of these residues are orientated towards the tryptophan-lined pocket (Figure 3(b), tan molecule). However, in the form II crystals (Figure 3(b), pink molecule) the alternate conformation of the β4 strand positions E224 away from the tryptophan-lined pocket such that the side-chain of E224 is displaced ∼12 Å (Figure 3(b), broken line). In this alternate conformation, the side-chain of D223 is more subtly shifted (CG moves ∼2.5 Å), as are the side-chains of other residues (W267, W307 and H170) that partly line the substrate-binding pocket. As a result, the pocket is more open and solvent exposed. In concert with these positional differences in the β4 strand, a small section of helix at the end of the β5 strand also shifts dramatically (Figure 3(b), H). Downstream of this small helix, the remaining residues, which constitute the βα-loop connecting strand β5 and helix α5, are unresolved in form II crystals (Figure 3(b), βα5). In the native form I crystal this loop is resolved in the A-subunit but disordered in the B-subunit of the dimer, whereas in the complex with NGT, a reaction intermediate mimic bound at the active site (discussed below), this loop is resolved in both subunits.
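The shifts quoted in this section (∼12 Å for the E224 side-chain, ∼2.5 Å for D223 CG) are distances between equivalent atoms once the two crystal forms have been superposed. As a minimal sketch of that measurement, assuming the superposition has already been carried out by other means, the Python below computes per-atom displacements between two models. The function name, the atom labels and the coordinates are invented for illustration and are not taken from the deposited entries.

```python
import math


def atom_shifts(model_a, model_b):
    """Distance in Å between atoms that share a label in both models.

    Each model maps an atom label to (x, y, z) coordinates, and both
    models are assumed to be in the same superposed frame already.
    """
    shared = sorted(set(model_a) & set(model_b))
    return {name: math.dist(model_a[name], model_b[name]) for name in shared}


# Invented coordinates for illustration only (not from 2EPO or 2EPL).
form_i = {"D223:CG": (0.0, 0.0, 0.0), "E224:CD": (1.0, 2.0, 2.0)}
form_ii = {"D223:CG": (1.5, 2.0, 0.0), "E224:CD": (5.0, 6.0, 9.0)}

shifts = atom_shifts(form_i, form_ii)
```

Only atoms present in both models are compared, so residues that are unresolved in one crystal form (such as the βα-loop in form II) simply drop out of the result.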

Active site and catalytic mechanism

The prediction that GcnA might utilize a substrate-assisted catalytic mechanism was confirmed by the finding that the reaction intermediate mimic NGT (Figure 1(f)) is a competitive inhibitor of the enzyme (Ki = 60 (±17) nM; Supplementary Data). In order to better define the enzyme's active site, form I GcnA crystals were grown in the presence of NGT. Unbiased electron density (Fobs–Fcalc) for the NGT molecule in both active sites of the dimer unequivocally reveals the position of all non-hydrogen atoms (Figure 3(c)). The positions of residues which line the substrate-binding pocket are similar in form I apo crystals and in NGT co-crystals (Figure 3(c), light and dark sticks, respectively). Nine hydrogen bonds, three of which involve well-ordered water molecules, contribute to the binding of NGT to the tryptophan-lined pocket. No solvent molecules are resolved on the side of the NGT molecule facing the tryptophan pocket, consistent with the notion that solvolysis is initiated from the solvent side of the NGT plane, resulting in the retaining stereochemistry expected for the released product. An additional well-resolved water molecule is hydrogen bonded by E224, and is positioned on the solvent side of NGT, ∼3.9 Å above the anomeric carbon (Figure 3(c), black sphere). Given its proximity, it is plausible that this water represents the nucleophile-in-waiting, which would ordinarily become deprotonated by E224 and attack the anomeric carbon of the bona fide oxazolinium intermediate.

Interestingly, in the absence of NGT, each active site of the form I apo structure contains a planar tri-lobed electron density feature adjacent to D223 that was modelled as acetate (Figure 3(d); the well-solution for this crystal contained 0.2 M ammonium acetate). Acetate is a weak competitive inhibitor of GcnA (Ki = 0.5 (±0.1) M; data not shown). The electron density of this feature superposes with that of the thioacetamidate moiety of NGT (compare Figure 3(c) and (d)). In both cases a methyl group is nestled against the non-polar face of W267. Whilst D223 makes a hydrogen bond with one of the acetate oxygen atoms, similar to the one made with the nitrogen of NGT, Y309 reaches in to form a hydrogen bond with the other acetate oxygen. The steric bulk of the sulphur atom of NGT is probably responsible for the relatively large movement of Y309 on inhibitor binding (Figure 3(c), compare the light and dark sticks).

Another consequence of co-crystallising GcnA with NGT in the form I crystal was the above mentioned ordering of the βα-loop connecting strand β5 and helix α5 in both subunits of the dimer. In the A-subunit this loop is intimately involved in crystal contacts with a neighbouring molecule, and is clearly resolved with and without NGT. In the absence of such crystal contacts, these residues of the B-subunits are ordinarily disordered. NGT-binding would seem to stabilise the loop such that the electron density for the B-chain can be traced. Incidentally, the ordering of this loop correlates with the different morphology and better diffracting properties of the form I crystals grown in the presence of NGT. Without NGT, the form I crystals exist as “key”-shaped projections.5 In the presence of NGT the edges of the now sword-like crystals are smooth and sharp-faced, yet belong to the same space group.

In the case of the form II crystals, which were grown in the absence of acetate (or NGT), at the end of refinement several electron density features remained at the active site pocket which we were not able to interpret and they have not been modelled (Figure 3(e), blue mesh), in spite of the high resolution of one of these datasets.

In order to investigate the mechanism by which transition and heavy metals have been noted to poison GcnA,4 form II crystals were progressively soaked with ethylmercurithiosalicylic acid prior to data collection. Of the seven cysteine residues present in GcnA, five are derivatized by mercury, two quite extensively (C164 and C361, with Hg²⁺ occupancies of 0.9 and 1.0, respectively). The binding of Hg to C164 (Figure 3(e), green mesh and metallic sphere), which is positioned in the layer directly below the surface of the substrate-binding pocket, has a noticeable effect on the shape of this pocket. In order to accommodate the steric bulk of the Hg atom, W267 is pushed into the pocket (the CZ3 atom moves ∼2 Å), the main-chain of W307 shifts, and the side-chain of W307 is no longer resolved in electron density.

Figure 3. Stereo images of GcnA active site and ligand binding. (a) Transparent surface representation of one substrate entry cavity with an NGT molecule (sticks) bound at the active site pocket. A second NGT molecule is positioned in the other active site pocket (behind the silver transparent surface). Surface contributed by E224 is coloured yellow whilst that contributed by the extreme C terminus of the dimer partner (T627) is coloured blue. The tunnel linking the active sites is indicated by the arrow. (b) Crystal forms I (A chain, tan) and II (pink) overlaid. The tryptophan residues which line the substrate-binding pocket, H170, and the conserved acids are shown as sticks. The terminal atoms of E224 are positioned ∼12 Å apart in the two crystal forms (broken line). Much of the βα-loop (βα5, containing helix H) connecting strand β5 and helix α5 (H5) is not resolved in the form II structure. Helices α4 and α6 are also indicated (H4 and H6, respectively). (c) Residues of the form I crystal which define the pocket which binds NGT (carbon atoms as green sticks) overlaid with those which bind acetate (pale green sticks). The water positioned above the anomeric carbon of NGT is shown as a black sphere. Other water molecules are shown as red spheres. The mesh represents electron density (Fobs–Fcalc) prior to the inclusion of NGT in the model. Hydrogen bonds are shown as broken lines. (d) The active site in the context of bound acetate (same perspective as for (c)). The mesh represents electron density (Fobs–Fcalc) prior to the inclusion of acetate in the model. (e) The active site pocket of the form II crystal before (pink sticks) and after (green sticks) the adduction of C164 with a mercury atom (metallic sphere and green mesh). The repositioned side-chain of W307 (pale green sticks) is not visible in the electron density map and its position is inferred by that of the polypeptide backbone. The blue mesh represents unmodelled residual density observed in the form II crystal active sites regardless of the presence of mercury.

Active site comparison with other glycoside hydrolases

The active-sites of the form I GcnA structures, which adopt a more “active” conformation, are closely similar to those of other family 20 glycoside hydrolases, many of which have also been co-crystallized with NGT (Figure 4(a)). The identity and position of most of the residues that line the substrate-binding pocket are retained, or conservatively replaced (note the substitution of a His in place of Y122 in the other structures). Although the hydroxyl group of Y309 is similarly positioned to those in the homologous structures, in GcnA the Tyr residue projects from the C terminus of the β6 strand rather than from the C terminus of the β7 strand. The most significant difference concerns the substitution of an Asp (SpHEX and HsHexB), or Asn (HsHexA) in place of W340 in GcnA. The charged/polar residues in the related enzymes form hydrogen bonds with the O6 oxygen of the NGT, which consequently is oriented differently to the conformation observed in the GcnA–NGT structure (Figure 4(a)). In GcnA, the hydrophobic nature of W340 results in the O6 hydroxyl of NGT being orientated towards alternative hydrogen-bond partners (Figure 3(c)). Otherwise, the NGT molecules in the complexes are essentially identically positioned in the active sites. The GcnA structure can also be superposed with other family 20 enzymes which are bound by alternative ligands (Figure 4(b)). The acetate ion in the form I crystal structure of GcnA superposes perfectly on the same ion in the AaDspB structure (Figure 4(b), blue sticks). Incidentally, W340 of GcnA is substituted by Val in this enzyme. Such subtle remodelling of the active site might reflect the fact that AaDspB cleaves β(1,6)-linkages, as opposed to the β(1,4)-linkages cleaved by GcnA. Superposing GcnA and the structure of SmCHB complexed with the disaccharide chitobiose (di-NAG) allows the two-residue substrate to be positioned approximately in the active site of GcnA (Figure 4(b), pink sticks). This comparison allows one to visualise how substrate might approach the active-site pocket of GcnA prior to the cyclisation event which produces the oxazolinium intermediate.

Figure 4. Stereo images of GcnA–NGT superposed with other family 20 glycoside hydrolase structures. (a) Superposition of all structures bound by NGT. Sticks coloured green, orange, dark blue and red correspond to structures from GcnA, HsHexA, HsHexB and SpHEX, respectively. (b) Superposition of GcnA (green sticks) with structures from SmCHB (with di-NAG ligand, pink sticks), and AaDspB (with acetate ligand, blue sticks).

Discussion

GcnA is one of several sugar degrading enzymes that are up-regulated by S. gordonii in response to conditions akin to those expected of endocardial infection. The structure reveals the enzyme to be a homodimer, with each subunit comprised of three domains, and confirms the enzyme to be a family 20 glycosidase.

The N-terminal domains have an α+β fold and are positioned at the extreme edges of the dimer (Figure 2). Similar N-terminal domains have been observed in all family 20 glycosidases structurally characterized although the number of β-sheet strands varies from five (GcnA), to six in HsHexA and HsHexB, and seven in SpHEX and SmCHB, and the arrangement of strands in these structural homologues is always antiparallel, as opposed to the largely parallel arrangement in GcnA. Such domains are not unique to family 20 enzymes but are widely conserved features of other glycosidase families. Although the function of these domains is yet to be elucidated, in some cases they have a distinct positive surface potential and are suggested to be involved in membrane association.14 Without a similar abundance of positive charge on these domains in GcnA (data not shown), and as they are located at extreme ends of the dimer, such a role in this case seems unlikely.

The second domain of each monomer (domain II), with its classic (β/α)8 TIM-barrel fold, is also a feature of family 20 enzymes. As expected, the active site is formed by the C-terminal ends of the β-strands, which form the barrel. The location of the active site is confirmed by the binding of NGT, a putative reaction intermediate mimic, in this region (discussed later).

The C-terminal domain (domain III) is comprised of a bundle of α-helices and is involved in dimerisation. Although no close structural homologues of this fold exist in the Protein Data Bank (most family 20 enzymes lack such a C-terminal domain), the hydrolase containing a domain best resembling it in both structure (also with five prominent helices) and function is a similarly sized glucosaminidase from Clostridium perfringens (CpNagJ).15 Like GcnA, the helical dimerisation bundles in CpNagJ pack against each other in a side-by-side arrangement (compare Figure 2(b) and (c)). However, unlike GcnA, the dimerisation surface in CpNagJ is wholly attributable to domain III interactions, and the TIM-barrel domains do not directly contact each other. Incidentally, the structural homology between GcnA and CpNagJ also extends to domains I and II, which superpose as an intact ensemble. Additionally, as with GcnA, the extreme C terminus of CpNagJ (I624) projects from the dimerisation domain into the substrate-binding pocket of the other subunit. Thus GcnA and CpNagJ are structurally related, in spite of sharing poor identity at the amino acid level (∼7%), and despite CpNagJ being classified as a family 84 glycosidase.

The active sites of GcnA lie at the base of cavities predominantly formed by the surface of one subunit of each dimer (Figure 3(a)). A solvent channel links the two active-site cavities. It is conceivable that the terminal sugar of a carbohydrate polymer substrate might reach an active-site pocket after threading through this internal passage (i.e. entering one active site from the other active-site cavity), but we consider this unlikely due to the convoluted nature of the path required. The active-site pocket into which the terminal carbohydrate becomes bound is lined by four tryptophan residues and is adjacent to the two conserved acids (D223 and E224). Disruption of the dimer interface destroys enzyme activity, consistent with the C terminus of each monomer helping line the active-site pocket of the dimer partner.

The structure of GcnA has been refined in two different crystal forms (II and I), with one and two subunits in the asymmetric unit, respectively. The main difference between the two crystal forms involves the conformation of the β-strand containing the catalytic acids (D223 and E224) and the β-strand and βα-loop directly adjacent to these acids (Figure 3(b)). In both subunits of the form I structure, the catalytic acids are both positioned above the tryptophan-lined pocket, as if poised for catalysis. We therefore propose that the structure of this crystal form represents the active conformation of GcnA. In the form II structure, conformational changes relocate the side-chain of E224 away from the tryptophan-lined pocket, which appears to be larger and more exposed to solvent. Hence, we ascribe this crystal form as representing an inactive or resting conformation of the enzyme. In part, the active-site pocket appears more solvent exposed in this “inactive” crystal form as the βα-loop ordinarily adjacent to the catalytically poised acids is not observed. This loop is resolved in the A-chain molecule of the “active” structure, where it makes crystal contacts with a neighbouring molecule within the crystal, although it is not able to be traced in the B-chain molecule which lacks analogous contacts. Interestingly, the binding of the reaction intermediate mimic NGT (Figures 1(f) and 3(c)) orders this B-chain loop in the active structure.

Ascribing one form of the enzyme structure as “inactive” perhaps incorrectly implies that in solution this conformation is incapable of reverting to the active conformation, an event which might even be accompanied by the association of substrate with the active-site pocket. Observation of a substrate mimic, such as an acetate ion (Figure 3(d)), in the substrate-binding pocket of the form I crystals might in itself be evidence that dynamic substrate capture is a feature of this (and other) glycoside hydrolases. Given that the NGT–GcnA complex appears to be more ordered than the acetate–enzyme complex, or apo enzyme, it can be argued that residues in the vicinity of the active-site pocket are better able to close around and collectively embrace the larger NGT ligand, hence becoming ordered, than is the case with the smaller acetate ligand or, by extension, in the absence of ligand. Such flexibility in the loops of a TIM-barrel enzyme is not a unique observation. In fact, numerous TIM-barrel enzymes exhibit some sort of loop flexibility which is sometimes intimately coupled with catalysis.16–18 A similarly disordered region to that noted for GcnA (in this case the polypeptide chain immediately following the two catalytic acids) is also observed in the AaDspB structure.9 Coincidentally, AaDspB also contains an acetate ion in the active-site pocket in the same position as that observed in active GcnA.

The structure of GcnA complexed with NGT reveals very little difference in the positions of residues that line the substrate-binding pocket compared with the binding of an acetate ion (Figure 3(c)). The most significant difference is the position of Y309, which in the presence of the acetate ion forms a hydrogen bond with one of the oxygen atoms of the ligand, and in the presence of NGT retracts such that the bulky sulphur atom can be accommodated. It is possible that in the context of the bona fide oxazolinium intermediate, where the sulphur of NGT is replaced by oxygen, the tyrosyl hydroxyl of Y309 acts as a hydrogen bond donor, withdrawing electrons from the anomeric carbon making it more susceptible to nucleophilic attack (Figure 1(e)). Well-resolved water molecules are observed on the solvent side of the NGT molecule. One in particular is hydrogen-bonded by E224 and sits ∼3.9 Å above the anomeric carbon mimic. The positions of active-site waters have been highlighted in other family 20 structures, especially those also co-crystallized with NGT. A similarly positioned water sits ∼5.2 Å above the same NGT carbon in the HsHexB–NGT structure,11 and the hydroxyl group of an ordered glycerol molecule sits ∼3.5 Å above the same NGT carbon in the SpHEX–NGT complex.10 The NGT molecule is located in essentially the same relative position in these related structures although the O6 oxygen atom is orientated towards hydrogen-bonding partners which substitute for one of the tryptophan residues lining the hydrophobic pocket (Figure 4(a)). An enzyme kinetics study found that GcnA was unable to cleave a substrate where the O6 oxygen is sulphated.4 Exactly why such a substrate is resistant to cleavage is unclear although after resident water molecules are removed sufficient space might still not be available in the active site to accommodate a sulphate group attached to the O6 oxygen. An appreciation of what is required for the cleavage of O6-sulphated substrates is gained by inspection of the HsHexA structure, for which cleavage of sulphated as well as negatively charged sialylated substrates is observed. Not only is the substrate cleft considerably more open to bulk solvent than is the case in GcnA (lacking surface contributed by the C terminus of the neighbouring subunit), HsHexA also has a critically positioned Arg, absent in GcnA, which is believed to stabilise the negative charge on the O6-sulphate. Substitution of this residue increased the Km value of this enzyme for a sulphated reporter molecule by an order of magnitude.11

Superposing GcnA–NGT with the SmCHB structure, which contains a di-NAG substrate in the active-site pocket, gives a tantalising view of how the substrate might become positioned in the active-site pocket just before cleavage to the bicyclic reaction intermediate (Figure 4(b)). During the binding event, the methyl group of the terminal N-acetyl moiety is directed into the side of the hydrophobic pocket defined by the aromatic face of W267. As well as being observed in GcnA, the methyl group of an acetate ion has also been observed in a similar position in AaDspB, as well as in a family 84 glycoside hydrolase from Bacteroides thetaiotaomicron (BtGH84).19 Perturbation of this methyl-binding position can easily be achieved in the case of GcnA by derivatising a nearby cysteine residue with mercury such that W267 and the binding pocket it defines become distorted, thus eliminating all enzyme activity (a truly inactive form of the enzyme; Figure 3(e)). In light of the GcnA–NGT and other NGT-bound structures, the role of the first of the conserved acids (D223) appears to be restricted to stabilising the oxazolinium intermediate (Figure 1(d) and (e), “bottom” acid), although it has also been suggested that, along with Y309,11 D223 helps polarise the acetamido group of the incoming substrate. In contrast, E224 appears to play the role of a general acid and base, either donating or accepting a proton in the classically portrayed reaction pathway (Figure 1(d) and (e), “top” acid).

Incidentally, in spite of family 84 glycoside hydrolases being more distantly related at the amino acid level, representatives such as CpNagJ and BtGH84 are also predicted to utilise the same substrate-assisted catalytic mechanism as the family 20 enzymes. However, family 84 enzymes employ a sequential pair of conserved Asp residues, rather than a sequential Asp-Glu pair, and employ a different arrangement of residues to line their active-site pockets. Coincidentally, like CpNagJ, BtGH84 also has a helical dimerisation domain (domain III), which more closely resembles domain III of GcnA than those of the other family 20 glycoside hydrolases characterized structurally thus far. Hence, such domain ensemble and catalytic mechanism similarities somewhat blur the line that currently distinguishes these two classes of glycoside hydrolase.

Materials and Methods

Protein purification

GcnA was expressed and purified as described,4 except that ion-exchange was performed on a HiLoad 16/10 Q column (GE Healthcare). For the expression of the Se-Met substituted GcnA, plasmid pHAR101 was transformed into Escherichia coli B834 (DE3) (a methionine auxotroph). The Se-Met substituted GcnA was expressed using 1 l of Overnight Express auto induction medium (Novagen, Merck). Cells were harvested by centrifugation, washed once in 50 mM Tris (pH 7.4) and lysed using Bugbuster protein extraction agent (Novagen, Merck). Extracts were clarified by centrifugation (48,000g for 30 min at 4 °C), purified as described previously and concentrated to ∼10 mg/ml using a 30 kDa centrifugal concentrator (Millipore). Purity was examined by SDS–PAGE. Successful Se-Met substitution was confirmed by mass spectrometry on a tryptic digest.

Enzyme kinetics

The kinetics of GcnA were determined using 4-nitrophenyl N-acetyl-β-D-glucosaminide (pNp-NAG; Sigma) as substrate with reference to a standard curve of 4-nitrophenol. Standard reactions were performed in 1 ml of assay buffer (50 mM piperazine-1,4-bis(2-ethanesulfonic acid), 10 mM EDTA (pH 6.6)) as follows. GcnA (3.43 pmol) in 50 μl of assay buffer was combined with an equal volume of assay buffer with or without added inhibitor (NGT, 22.5 to 90 nM) in a 1.5 ml disposable plastic cuvette. After incubation at 37 °C for 5 min, reactions were initiated by the addition of 900 μl of pre-warmed buffer containing substrate (29.2 μM to 292 μM). Reactions were monitored every 15 s at 400 nm on a Beckman DU640 spectrophotometer fitted with a Peltier temperature controller. The initial reaction velocity was determined by linear regression analysis (Sigma-Plot). Reactions were performed in duplicate. The Km and Ki were determined from the respective Lineweaver–Burk plots (Sigma-Plot) calculated from the regression lines. For this synthetic substrate the observed values for GcnA were as follows: Km=239(±12) μM and kcat=323(±11) s−1. Analysis of cooperativity was performed as above at substrate concentrations between 219 nM and 292 μM with all assays performed in triplicate.
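The Lineweaver–Burk analysis described above can be illustrated with a minimal Python sketch. It is not the Sigma-Plot procedure used in this work. The intermediate substrate concentrations and the noise-free velocities are hypothetical, generated from the reported Km and kcat purely to show that the double-reciprocal regression recovers them. The enzyme concentration assumes 3.43 pmol of GcnA in the final 1 ml reaction volume.

```python
# Lineweaver-Burk sketch: 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax
# All data below are SYNTHETIC (hypothetical), generated from the reported
# Km and kcat; they are not measurements from the paper.

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def lineweaver_burk(substrate_um, velocity_um_per_s):
    """Return (Km in uM, Vmax in uM/s) from a double-reciprocal regression."""
    inv_s = [1.0 / s for s in substrate_um]
    inv_v = [1.0 / v for v in velocity_um_per_s]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept
    return slope * vmax, vmax

ENZYME_UM = 3.43e-3   # 3.43 pmol in 1 ml = 3.43 nM (assumed final volume)
KM_REPORTED = 239.0   # uM, from the text
KCAT_REPORTED = 323.0 # s^-1, from the text

# Hypothetical substrate series spanning the stated 29.2-292 uM range.
substrate = [29.2, 58.4, 116.8, 175.2, 233.6, 292.0]
vmax_true = KCAT_REPORTED * ENZYME_UM
velocity = [vmax_true * s / (KM_REPORTED + s) for s in substrate]

km_fit, vmax_fit = lineweaver_burk(substrate, velocity)
kcat_fit = vmax_fit / ENZYME_UM
print(f"Km = {km_fit:.1f} uM, kcat = {kcat_fit:.1f} s^-1")
```

With real, noisy data the double-reciprocal transform weights low-substrate points heavily, so a direct non-linear fit of the Michaelis–Menten equation is often preferred as a cross-check.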

Mutagenesis and analysis

Site-directed ligase-independent mutagenesis20 was performed using Phusion DNA polymerase (Finnzymes) with primers outlined in Supplementary Data. Template DNA for mutagenesis was the plasmid pHAR101.4 Mutations were confirmed by DNA sequencing (Westmead Millennium Institute, DNA sequencing facility). Expression and extraction of native and mutant GcnA (G564E, G565D) were performed as described.4 Crude protein extracts were separated by gel-filtration chromatography (S-200 column; GE Healthcare) calibrated against protein standards (of masses 670, 158, 44, 17 and 1.35 kDa, Bio-Rad). Fractions covering the expected molecular masses for the GcnA dimer and monomer were individually pooled for both the mutant and wild-type extracts. Fractions were analysed by SDS–PAGE and transferred by Western blot to a nitrocellulose membrane. GcnA was detected with a polyclonal primary antibody raised in rabbits against formalin-killed whole cells of S. gordonii and an alkaline phosphatase conjugated goat-anti-rabbit secondary antibody (Dako). Prior to use the primary antibody was first partially purified by overnight absorption (4 °C) to the E. coli host strain used for GcnA overexpression.
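A gel-filtration calibration of the kind mentioned above is commonly treated as a straight line of log10(mass) against elution volume. The sketch below assumes that model. The standard masses are those quoted in the text, but the elution volumes are hypothetical placeholders, since none are reported here. The dimer mass is taken as twice the 72 kDa subunit mass.

```python
import math

# Standard masses (kDa) are from the text; elution volumes (ml) are
# HYPOTHETICAL placeholders, not values reported in the paper.
STANDARD_MASSES_KDA = [670.0, 158.0, 44.0, 17.0, 1.35]
ELUTION_VOLUMES_ML = [48.0, 60.0, 71.0, 79.0, 101.0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(masses_kda, volumes_ml):
    """Fit log10(mass) as a linear function of elution volume."""
    return fit_line(volumes_ml, [math.log10(m) for m in masses_kda])

def mass_from_volume(volume_ml, slope, intercept):
    """Apparent mass (kDa) of a species eluting at volume_ml."""
    return 10 ** (slope * volume_ml + intercept)

def volume_from_mass(mass_kda, slope, intercept):
    """Expected elution volume (ml) for a species of mass_kda."""
    return (math.log10(mass_kda) - intercept) / slope

slope, intercept = calibrate(STANDARD_MASSES_KDA, ELUTION_VOLUMES_ML)
monomer_ve = volume_from_mass(72.0, slope, intercept)   # 72 kDa subunit
dimer_ve = volume_from_mass(144.0, slope, intercept)    # assumed 2 x 72 kDa
print(f"monomer ~{monomer_ve:.1f} ml, dimer ~{dimer_ve:.1f} ml")
```

The predicted volumes indicate which fractions would be expected to contain dimer or monomer. Real columns deviate from this log-linear model near the void and included volumes.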

Crystallization and cryo-protection

Two crystal forms of GcnA were obtained in crystallization experiments. One of these (form I) has an unusual “key”-shaped morphology and belongs to space group P21212. The other (form II) has a more typical orthorhombic shape and is in space group I222. Of the structures presented in this work (Table 1), three belong to form II, and include a crystal prepared from Se-Met substituted protein (GcnA-II-Se), a crystal derivatised with mercury (GcnA-II-Hg) and a crystal derived from wild-type protein (GcnA-II). The two form I crystals were both prepared from wild-type protein although one was co-crystallized in the presence of NGT (GcnA-I and GcnA-I-NGT, respectively). Most crystals were grown using the hanging-drop vapour diffusion method where 2 μl of protein (∼10 mg/ml) and an equal volume of well solution were combined. In the case of the GcnA-I-NGT crystal containing the complex of GcnA and NAG-thiazoline, 200 nl of both protein and well solutions were combined. The well solution for the GcnA-II-Se crystal consisted of 100 mM Hepes (pH 7.0), 2 M ammonium sulphate, and 0.5% (v/v) polyethylene glycol (PEG) 400. The well solution for the GcnA-II-Hg crystal consisted of 100 mM Tris (pH 8.3), 1.9 M ammonium sulphate, and 0.5% PEG 400. The well solution for the GcnA-II crystal consisted of 100 mM Tris (pH 8.1), 2 M ammonium sulphate, and 2% PEG 400. The well solution for the GcnA-I crystal comprised 200 mM ammonium acetate, 100 mM sodium citrate (pH 6.0), and 24% PEG 4000. The well solution for the GcnA-I-NGT crystal comprised 200 mM di-ammonium tartrate, 20% PEG 3350, and ∼4.5 mM NGT (equating to a ∼30-fold molar excess over enzyme active sites).

Cryo-protection for most of the crystals involved first increasing the volume of the drop with a small volume (2–8 μl) of well solution, before swimming the crystal in ∼30 μl of well solution doped with glycerol to a final concentration of 15% (v/v) for ∼1 min before mounting in a cold (100 K) nitrogen stream. In the case of the GcnA-II-Hg crystal, once bolstered with well solution a few grains of ethylmercurithiosalicylic acid were added to the drop and the solution incubated for 24 h at room temperature prior to cryo-protection and snap freezing. In the case of the GcnA-I-NGT crystal, an excess of NGT was also present in the cryo-protection solution which employed 15% (v/v) 2-methyl-2,4-pentanediol rather than glycerol.
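The ∼30-fold molar excess quoted for the GcnA-I-NGT well solution can be checked with simple arithmetic. The sketch below assumes a protein stock of 10 mg/ml, a subunit mass of 72 kDa with one active site per subunit, and equal-volume mixing of protein and well solution. Equal-volume mixing dilutes both components two-fold, so the ratio is unchanged.

```python
def molar_conc_mm(mass_conc_mg_per_ml, molar_mass_da):
    """Convert mg/ml (equivalently g/l) to mM for a given molar mass in Da."""
    return mass_conc_mg_per_ml / molar_mass_da * 1000.0

PROTEIN_MG_PER_ML = 10.0   # approximate stock concentration from the text
SUBUNIT_MASS_DA = 72000.0  # 72 kDa subunit, one active site each (assumed)
NGT_WELL_MM = 4.5          # approximate NGT concentration in the well solution

active_sites_mm = molar_conc_mm(PROTEIN_MG_PER_ML, SUBUNIT_MASS_DA)

# Equal volumes of protein and well solution halve both concentrations,
# so the molar ratio in the drop equals the ratio of the stock solutions.
molar_excess = (NGT_WELL_MM / 2.0) / (active_sites_mm / 2.0)

print(f"active sites: {active_sites_mm:.3f} mM, NGT excess: {molar_excess:.0f}-fold")
```

Under these assumptions the ratio comes out a little above 30, in line with the approximate figure given in the text.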

X-ray data collection and processing

X-ray diffraction data on the GcnA-II-Se and GcnA-II crystals were collected at the Advanced Photon Source beamline 23ID-D (Argonne, USA) using a marMOSAIC CCD detector. In the case of the GcnA-II-Se crystal, data were collected at three wavelengths corresponding to the peak (λ=0.97928 Å), the inflection (λ=0.97945 Å) and a high energy remote (λ=0.94929 Å) of a Se Kα absorption profile. Data for the GcnA-II crystal were recorded at λ=0.97928 Å. Data for the GcnA-I crystal were recorded at MAX-lab beamline I711 (λ=1.0880 Å; Lund, Sweden) using a marMOSAIC CCD detector. Data for the GcnA-II-Hg and GcnA-I-NGT crystals were recorded in-house on a mar345 image-plate detector and Rigaku RU-200 rotating anode generator with Osmic mirror optics (Auburn Hills, MI, USA) and copper target (λ=1.5418 Å). Data collected at Argonne were indexed and scaled with HKL2000.21 Data collected at Lund were processed with mosflm,22 and data collected in-house were processed and scaled with DENZO and SCALEPACK from the HKL crystallography suite.21
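It can help to read the wavelengths listed above as photon energies, using E = hc/λ. The short sketch below does this conversion with CODATA constants. The dictionary keys are illustrative labels, not names used in the paper. The three synchrotron MAD wavelengths fall close to the selenium K absorption edge at roughly 12.66 keV.

```python
# Convert X-ray wavelengths (Angstrom) to photon energies (keV): E = h*c/lambda
PLANCK_J_S = 6.62607015e-34         # Planck constant (exact, SI 2019)
LIGHT_SPEED_M_S = 299792458.0       # speed of light (exact)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # J per eV (exact)

def wavelength_to_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstrom."""
    wavelength_m = wavelength_angstrom * 1e-10
    energy_joule = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return energy_joule / ELEMENTARY_CHARGE_C / 1000.0

# Wavelengths quoted in the text; the labels are illustrative only.
WAVELENGTHS_ANGSTROM = {
    "peak": 0.97928,
    "inflection": 0.97945,
    "remote": 0.94929,
    "max_lab": 1.0880,
    "copper": 1.5418,
}

energies_kev = {name: wavelength_to_kev(w) for name, w in WAVELENGTHS_ANGSTROM.items()}
for name, energy in energies_kev.items():
    print(f"{name:>10s}: {energy:.3f} keV")
```

The peak and inflection settings differ by only a few electronvolts, which is why a stable, finely tunable synchrotron source is needed for this kind of experiment.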

Crystal structure solution and refinement

The structure of GcnA was solved by the multiple wavelength anomalous dispersion (MAD) method as described below using the data recorded at Argonne (Supplementary Data, Table 2). The program SOLVE23 was used to search for anomalous difference peaks corresponding to the Se atoms. Although the nominal resolutions for the three datasets recorded on the GcnA-II-Se crystal were 1.85 Å (peak), 1.95 Å (inflection) and 2.10 Å (remote), data extending to only 2.5 Å were employed. A Matthews' coefficient24 of ∼4.0 suggested that there was only one 72 kDa molecule (627 amino acid residues) in the asymmetric unit (corresponding to ∼69% solvent). The top solution found by SOLVE23 was in space group I222 and had 12 peaks corresponding to Se positions (of a maximum of 15 Met residues predicted from the primary amino acid sequence), nine with height >10 σ. The phase estimates based on the positions of these peaks were input into the program RESOLVE,25,26 which performed solvent flattening and traced ∼65% of the main-chain structure. The rest of the structure was progressively built manually using the program Coot27 between cycles of restrained B-factor refinement using the program REFMAC5.28 Refinement was performed using “peak” wavelength data (collected first) as progressive degradation was noted in the “inflection” and “remote” datasets. Towards the latter stages of building the resolution limit was extended to use all the data to 1.85 Å. The coordinates from the GcnA-II-Se structure were used as a starting model for molecular replacement and refinement of the other crystal structures. No non-crystallographic symmetry averaging of the subunits was employed during refinement of the form I structures (i.e. subunits were treated as totally independent molecules).
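The link between the Matthews' coefficient of ∼4.0 and the ∼69% solvent content quoted above follows from the usual approximation, solvent fraction ≈ 1 − 1.23/Vm, with Vm in Å³/Da. The sketch below applies that relation. The unit-cell volume is a hypothetical value back-calculated to give Vm = 4.0 for one 72 kDa molecule per asymmetric unit in I222 (eight asymmetric units per cell). It is not the measured cell volume.

```python
def matthews_coefficient(cell_volume_a3, asu_per_cell, copies_per_asu, molar_mass_da):
    """Vm in cubic Angstrom per Dalton: cell volume / total protein mass in the cell."""
    return cell_volume_a3 / (asu_per_cell * copies_per_asu * molar_mass_da)

def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density (1.23 A^3/Da)."""
    return 1.0 - 1.23 / vm

MOLAR_MASS_DA = 72000.0          # 72 kDa subunit, from the text
ASU_PER_CELL_I222 = 8            # general positions in space group I222
HYPOTHETICAL_CELL_A3 = 2304000.0 # back-calculated so that Vm = 4.0; NOT measured

vm_one = matthews_coefficient(HYPOTHETICAL_CELL_A3, ASU_PER_CELL_I222, 1, MOLAR_MASS_DA)
vm_two = matthews_coefficient(HYPOTHETICAL_CELL_A3, ASU_PER_CELL_I222, 2, MOLAR_MASS_DA)

print(f"1 copy:   Vm = {vm_one:.2f}, solvent = {solvent_fraction(vm_one):.1%}")
print(f"2 copies: Vm = {vm_two:.2f}, solvent = {solvent_fraction(vm_two):.1%}")
```

Doubling the number of copies halves Vm, so comparing the two rows shows how the coefficient is used to weigh alternative asymmetric-unit contents.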

Crystal structure analysis and Figures

Structure validation was performed using MOLPROBITY.29 Comparison of GcnA structural elements with other structures in the PDB was performed with Dali.30 Superposition of the structures was performed with Coot.27 The calculation of buried surface was performed with the aid of total surface areas output by AREAIMOL, part of the CCP4 suite for protein crystallography.31 Electrostatic surface potentials were calculated with MOLMOL.32 Figures were prepared with PyMol§ and ISIS-draw∥.
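One common way to turn total accessible surface areas into a buried interface area is to compare the isolated subunits with the assembled dimer. The sketch below shows that arithmetic only. It is an assumption about how such totals might be combined, not a record of the calculation done here, and the area values are hypothetical placeholders rather than AREAIMOL output.

```python
def buried_surface(area_subunit_a, area_subunit_b, area_dimer):
    """Return (total buried area, buried area per subunit) in square Angstrom.

    Total buried area is the accessible surface lost when two isolated
    subunits form the dimer; the per-subunit value is half of that.
    """
    total = area_subunit_a + area_subunit_b - area_dimer
    return total, total / 2.0

# HYPOTHETICAL accessible surface areas (square Angstrom), for illustration only.
AREA_A = 25000.0
AREA_B = 25000.0
AREA_DIMER = 44000.0

total_buried, per_subunit = buried_surface(AREA_A, AREA_B, AREA_DIMER)
print(f"total buried: {total_buried:.0f} A^2, per subunit: {per_subunit:.0f} A^2")
```

Whether a quoted interface area refers to the total or the per-subunit figure varies between papers, so it is worth stating which is meant.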

Acknowledgements

We thank Dr Stephen Graham for recording data at MAX-lab, Dr Stephen Harrop for indexing and scaling the high resolution I222 dataset, Dr Chu Kong Liew for performing electrostatic surface calculations, and Dr Stephen Withers for the gift of the NAG-thiazoline used in this study. We thank Dr Ben Crossett for performing mass spectrometry using the Australian Proteome Analysis Facility established under the Australian Government's Major National Facilities Program. We thank the National Institute of Genetic and Medical Science, and the National Cancer Institute of the National Institute of Health for access to beamline 23ID-D at the Advanced Photon Source. The research was funded in part by a grant (to N.A.J., D.W.S.H. and N.H.) from the Institute of Dental and Craniofacial Research, NIH, USA, grant no. R01 DE 013234.

Supplementary Data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jmb.2007.09.028

References

1. Nyvad, B. & Kilian, M. (1987). Microbiology of the early colonization of human enamel and root surfaces in vivo. Scand. J. Dent. Res. 95, 369–380.

2. Doern, G. V., Ferraro, M. J., Brueggemann, A. B. & Ruoff, K. L. (1996). Emergence of high rates of antimicrobial resistance among viridans group streptococci in the United States. Antimicrob. Agents Chemother. 40, 891–894.

3. Kilic, A. O., Tao, L., Zhang, Y., Lei, Y., Khammanivong, A. & Herzberg, M. C. (2004). Involvement of Streptococcus gordonii β-glucoside metabolism systems in adhesion, biofilm formation, and in vivo gene expression. J. Bacteriol. 186, 4246–4253.

4. Harty, D. W., Chen, Y., Simpson, C. L., Berg, T., Cook, S. L., Mayo, J. A. et al. (2004). Characterisation of a novel homodimeric N-acetyl-β-D-glucosaminidase from Streptococcus gordonii. Biochem. Biophys. Res. Commun. 319, 439–447.

5. Langley, D. B., Harty, D. W., Graham, S. C., Guss, J. M., Hunter, N. & Collyer, C. (2004). Crystallization of GcnA, an N-acetyl-β-D-glucosaminidase, from Streptococcus gordonii. Acta Crystallog. sect. D, 60, 1910–1911.

6. Henrissat, B. & Davies, G. (1997). Structural and sequence-based classification of glycoside hydrolases. Curr. Opin. Struct. Biol. 7, 637–644.

§ http://www.pymol.org
∥ http://www.mdli.com

7. Davies, G. J., Gloster, T. M. & Henrissat, B. (2005). Recent structural insights into the expanding world of carbohydrate-active enzymes. Curr. Opin. Struct. Biol. 15, 637–645.

8. Tews, I., Perrakis, A., Oppenheim, A., Dauter, Z., Wilson, K. S. & Vorgias, C. E. (1996). Bacterial chitobiase structure provides insight into catalytic mechanism and the basis of Tay-Sachs disease. Nature Struct. Biol. 3, 638–648.

9. Ramasubbu, N., Thomas, L. M., Ragunath, C. & Kaplan, J. B. (2005). Structural analysis of dispersin B, a biofilm-releasing glycoside hydrolase from the periodontopathogen Actinobacillus actinomycetemcomitans. J. Mol. Biol. 349, 475–486.

10. Mark, B. L., Vocadlo, D. J., Knapp, S., Triggs-Raine, B. L., Withers, S. G. & James, M. N. (2001). Crystallographic evidence for substrate-assisted catalysis in a bacterial β-hexosaminidase. J. Biol. Chem. 276, 10330–10337.

11. Mark, B. L., Mahuran, D. J., Cherney, M. M., Zhao, D., Knapp, S. & James, M. N. (2003). Crystal structure of human β-hexosaminidase B: understanding the molecular basis of Sandhoff and Tay-Sachs disease. J. Mol. Biol. 327, 1093–1109.

12. Lemieux, M. J., Mark, B. L., Cherney, M. M., Withers, S. G., Mahuran, D. J. & James, M. N. (2006). Crystallographic structure of human β-hexosaminidase A: interpretation of Tay-Sachs mutations and loss of GM2 ganglioside hydrolysis. J. Mol. Biol. 359, 913–929.

13. Wierenga, R. K. (2001). The TIM-barrel fold: a versatile framework for efficient enzymes. FEBS Letters, 492, 193–198.

14. Nurizzo, D., Nagy, T., Gilbert, H. J. & Davies, G. J. (2002). The structural basis for catalysis and specificity of the Pseudomonas cellulosa α-glucuronidase, GlcA67A. Structure, 10, 547–556.

15. Rao, F. V., Dorfmueller, H. C., Villa, F., Allwood, M., Eggleston, I. M. & van Aalten, D. M. (2006). Structural insights into the mechanism and inhibition of eukaryotic O-GlcNAc hydrolysis. EMBO J. 25, 1569–1578.

16. Parthasarathy, S., Ravindra, G., Balaram, H., Balaram, P. & Murthy, M. R. (2002). Structure of the Plasmodium falciparum triosephosphate isomerase-phosphoglycolate complex in two crystal forms: characterization of catalytic loop open and closed conformations in the ligand-bound state. Biochemistry, 41, 13178–13188.

17. Lee, M., Maher, M. J. & Guss, J. M. (2007). Structure of the T109S mutant of Escherichia coli dihydroorotase complexed with the inhibitor 5-fluoroorotate: catalytic activity is reflected by the crystal form. Acta Crystallog. sect. F, 63, 154–161.

18. Hur, S. & Bruice, T. C. (2002). Molecular dynamic study of orotidine-5′-monophosphate decarboxylase in ground state and in intermediate state: a role of the 203–218 loop dynamics. Proc. Natl Acad. Sci. USA, 99, 9668–9673.

19. Dennis, R. J., Taylor, E. J., Macauley, M. S., Stubbs, K. A., Turkenburg, J. P., Hart, S. J. et al. (2006). Structure and mechanism of a bacterial β-glucosaminidase having O-GlcNAcase activity. Nature Struct. Mol. Biol. 13, 365–371.

20. Chiu, J., March, P. E., Lee, R. & Tillett, D. (2004). Site-directed, ligase-independent mutagenesis (SLIM): a single-tube methodology approaching 100% efficiency in 4 h. Nucl. Acids Res. 32, e174.

21. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.


22. Leslie, A. G. W. (1992). Recent changes to the MOSFLM package for processing film and image plate data. Joint CCP4 + ESF-EAMCB Newsletter on Protein Crystallography, 26.

23. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, 55, 849–861.

24. Matthews, B. W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33, 491–497.

25. Terwilliger, T. C. (2003). SOLVE and RESOLVE: automated structure solution and density modification. Methods Enzymol. 374, 22–37.

26. Terwilliger, T. C. (2000). Maximum-likelihood density modification. Acta Crystallog. sect. D, 56, 965–972.

27. Emsley, P. & Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallog. sect. D, 60, 2126–2132.

28. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallog. sect. D, 53, 240–255.

29. Lovell, S. C., Davis, I. W., Arendall, W. B., de Bakker, P. I., Word, J. M., Prisant, M. G. et al. (2003). Structure validation by C alpha geometry: phi, psi and C beta deviation. Proteins: Struct. Funct. Genet. 50, 437–450.

30. Holm, L. & Sander, C. (1995). Dali: a network tool for protein structure comparison. Trends Biochem. Sci. 20, 478–480.

31. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.

32. Koradi, R., Billeter, M. & Wüthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55.