
The sequence, crystal structure determination and refinement of two crystal forms of lipase B from Candida antarctica

Jonas Uppenberg1, Mogens Trier Hansen2, Shamkant Patkar2 and T Alwyn Jones1*

1Department of Molecular Biology, Uppsala University, Biomedical Centre, Box 590, S-751 24 Uppsala, Sweden and 2Novo Nordisk, Novo Allé, DK-2880 Bagsvaerd, Denmark

Background: Lipases constitute a family of enzymes that hydrolyze triglycerides. They occur in many organisms and display a wide variety of substrate specificities. In recent years, much progress has been made towards explaining the mechanism of these enzymes and their ability to hydrolyze their substrates at an oil-water interface.

Results: We have determined the DNA and amino acid sequences for lipase B from the yeast Candida antarctica. The primary sequence has no significant homology to any other known lipase and deviates from the consensus sequence around the active site serine that is found in other lipases. We have determined the crystal structure of this enzyme using multiple isomorphous replacement methods for two crystal forms. Models for the orthorhombic and monoclinic crystal forms of the enzyme have been refined to 1.55 Å and 2.1 Å resolution, respectively. Lipase B is an α/β type protein that has many features in common with previously determined lipase structures and other related enzymes. In the monoclinic crystal form, lipid-like molecules, most likely β-octyl glucoside, can be seen close to the active site. The behaviour of these lipid molecules in the crystal structure has been studied at different pH values.

Conclusion: The structure of Candida antarctica lipase B shows that the enzyme has a Ser-His-Asp catalytic triad in its active site. The structure appears to be in an 'open' conformation with a rather restricted entrance to the active site. We believe that this accounts for the substrate specificity and high degree of stereospecificity of this lipase.

Structure 15 April 1994, 2:293-308

Key words: Candida antarctica, crystal structure, lipase, sequence, X-ray

Introduction

Lipases (EC 3.1.1.3) make up a diverse group of enzymes that have the ability to hydrolyze triglycerides at a lipid-water interface. Activity is dramatically increased upon binding to the lipid surface due to a conformational change of the enzyme. This mechanism of interfacial activation on triglyceride substrates distinguishes lipases from other esterases which primarily hydrolyze water-soluble esters. A large number of lipases have been characterized that display a wide variation in efficiency and substrate specificity [1]. Triglycerides may be cleaved at all three ester bonds or specifically at only one or two positions. Lipases can also show different specificities depending on the lengths of the fatty acids. In organic media, the enzymatic behaviour changes and lipases can be used for transesterification and other synthetic reactions to produce new kinds of lipids. The three-dimensional crystal structures of five lipases have been published to date [2-6]. The five enzymes show many similarities in structure and they all have a catalytic triad similar to the one found in serine proteases [7,8]. The catalytic residues in the triad occur in a different order in the protein sequence for lipases (Ser-His-Asp/Glu), compared with serine proteases (Asp-His-Ser in the subtilisin family, His-Asp-Ser in the chymotrypsin family). The catalytic serine in lipases is usually identified by the conserved sequence GxSxG [9].

The five structures are made up of a mostly parallel β-sheet, surrounded by α-helices. The human pancreatic lipase (HPL) has an additional domain that binds colipase, a small protein involved in lipid binding. An outstanding feature of most lipases is a mobile lid covering the catalytic site in its inactive form. The opening of this lid is believed to be one of the key features of interfacial activation. This has been demonstrated by the structures of Rhizomucor miehei lipase (RML) [10] and HPL [11], in complex with an inhibitor and a substrate analogue, respectively, where the lid has moved considerably to make the active site accessible to the ligand and made possible the formation of an oxyanion hole. The movement of the lid also changes the overall surface at the entrance of the active site, making it more hydrophobic, and thereby changing the lipid-binding properties. The structures of Geotrichum candidum lipase (GCL) and Candida rugosa lipase (CRL) have been identified as members of the α/β-hydrolase family [12]. This group of enzymes shares a similar fold

*Corresponding author.

© Current Biology Ltd ISSN 0969-2126

294 Structure 1994, Vol 2 No 4

[Fig. 1 sequence listing: codons of the CALB gene with the deduced amino acid below each, numbered from -25 to 317 and terminated by a TGA (opal) stop codon.]

Fig. 1. DNA sequence of the Candida antarctica lipase B gene and the deduced amino acid sequence. Numbers refer to the amino acid position in the mature lipase. The pre-propeptide (amino acids -25 to -1, shown in italics) contains a sequence (-25 to -8) typical of signal peptides [51] and a short propeptide ending in two basic amino acids forming a possible target for KEX2 type [52] proteolytic processing into the mature lipase. The positions of the probes used for screening are indicated by solid lines: NOR930 (sense, line above sequence) and NOR929 (antisense; line below sequence). The active site residues are underlined with double lines and the N-glycosylation site is marked with a dashed line.
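The pairing of codons and residues in Fig. 1 follows the standard genetic code, which can be sketched in a few lines. The codon/residue pairs used below (ACC/Thr, TGG/Trp, TCC/Ser, CAG/Gln, GGT/Gly, CCC/Pro and the TGA stop) all appear in the figure; joining the first five codons into one fragment is an assumption made for illustration, chosen because it reproduces the TWSQG sequence discussed in the text.

```python
# Standard genetic code, enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Assumed fragment: codons taken from the figure and joined for illustration.
print(translate("ACC TGG TCC CAG GGT"))  # TWSQG
# Last three codons of the listing, ending in the opal stop codon.
print(translate("ACC CCC TGA"))          # TP*
```

The table is built from the compact 64-letter string rather than written out codon by codon, which keeps the sketch short at the cost of some readability.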

with an identical connectivity of the central β-sheet, although no overall sequence homology can be detected. The members of this newly categorized fold all have a catalytic triad with the same sequential order of the catalytic residues (nucleophile-His-Asp/Glu). The different nucleophiles account for much of the diversity in the reactions performed by these enzymes and so far serine, aspartic acid and cysteine residues have been identified in this role. All lipases that have been characterized to date have a serine as the nucleophilic residue. GCL and CRL have a high overall sequence homology and very similar structures. The lid regions, however, are quite different. It is believed that the GCL structure represents a closed form of the enzyme, while CRL has been crystallized in its open, activated form.

Recently, the first bacterial lipase structure was determined from Pseudomonas glumae [6]. This structure contains much of the α/β-hydrolase fold, but also exhibits details usually not present in lipases, such as a calcium-binding site and a partially redundant catalytic aspartate (a calcium site is also present in HPL, although further away from the active site). The structure of a related enzyme, cutinase, has also been determined [13]. Cutin, the natural substrate of this enzyme, is a lipid polyester matrix found on the surface of plants. Cutinase has lipolytic activity but does not display interfacial activation. The structure, which is similar in many respects to the lipases, does not have a lid covering the active site. A model of guinea pig pancreatic lipase (GPL) has also been constructed, based on the crystal structure of HPL [14]. The sequence identity between the two enzymes is high, except for the region of the lid, where GPL has a large deletion. This is believed to account for the fact that this enzyme does not display interfacial activation. GPL also shows phospholipase activity in addition to its ability to hydrolyze triglycerides.

Two crystal forms of lipase B from Candida antarctica Uppenberg et al. 295

The yeast Candida antarctica displays a non-specific lipase activity towards triglycerides which is retained even at high temperatures [15]. Two different lipases, called A and B, with different isoelectric points and molecular weights, have been isolated [16]. Lipase A is non-specific and is the more thermostable of the two. It has a molecular weight of 45 kDa and a pI of 7.5. Lipase B (CALB) has a molecular weight of 33 kDa and a pI of 6.0. This enzyme has proven to be very stereospecific both in hydrolysis [1] and in organic synthesis [17-19] and has a potentially important application in glucolipid synthesis [20].

We have determined the DNA and amino acid sequences (Fig. 1) of lipase B from C. antarctica (CALB) and its three-dimensional crystal structure, using multiple isomorphous replacement (MIR) methods. This lipase has been crystallized under a variety of conditions [21] and the structure has been determined in two different crystal forms that grow under identical conditions. The orthorhombic crystals have the best diffraction properties and for the general description of the lipase, the model determined from this crystal form will be used. The monoclinic crystal form of the enzyme displays some interesting properties, including the binding of a detergent molecule in the active site. Two data sets at different pH values have been collected from the monoclinic crystals.

Fig. 2. Stereo drawing of the Cα trace of CALB. The structure is coloured red at the amino terminus, then orange, light green, dark green, pale blue, and finally dark blue at the carboxyl terminus.

Results and discussion

Sequence

CALB is made up of 317 amino acid residues with a formula weight of 33 273 Da. The amino acid sequence (Fig. 1) shows no significant homology to other lipase sequences. From the structures determined for other lipases, we assumed that CALB would most likely contain a Ser-His-Asp/Glu catalytic triad. However, the consensus sequence found in lipases around the active site serine, GxSxG, is not present in CALB. The sequence around Ser105 has the highest similarity to the consensus sequence but the first conserved glycine has been replaced by a threonine to give TWSQG. Since the whole sequence includes only one histidine residue, residue 224 was rather easy to identify as a likely candidate for the active site. The catalytic aspartic or glutamic acid residue could not be identified from the sequence and was resolved only after the crystal structure had been determined.
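The failure of a strict GxSxG search on CALB can be illustrated with a short pattern match. This is only a sketch: the flanking residues in both test fragments are invented, and the relaxed pattern that also admits threonine at the first position is a hypothetical illustration, not a published consensus.

```python
import re


def find_motif(pattern: str, sequence: str):
    """Return (0-based start, matched text) for every hit, overlaps included."""
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", sequence)]


STRICT = "G.S.G"       # consensus quoted in the text (x = any residue)
RELAXED = "[GT].S.G"   # hypothetical relaxation admitting Thr at position 1

calb_like = "AAATWSQGAAA"   # TWSQG from the text; flanking alanines invented
canonical = "AAGHSMGAA"     # invented fragment that satisfies GxSxG

print(find_motif(STRICT, calb_like))   # []
print(find_motif(RELAXED, calb_like))  # [(3, 'TWSQG')]
print(find_motif(STRICT, canonical))   # [(2, 'GHSMG')]
```

Wrapping the pattern in a lookahead makes overlapping hits visible, which matters little here but avoids silently skipping matches in longer sequences.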

Description of the molecule

CALB is a globular α/β type protein with approximate dimensions of 30 Å × 40 Å × 50 Å (Fig. 2). The central β-sheet is composed of seven strands of which the last six are parallel, with the following strand topology [22]: +2, -1x, +2x, +1x, +1x, +1x. The numbering of helices and strands is shown in the secondary structure diagram in Fig. 3. Most connections between strands are formed by the right-handed β-α-β structural motif.


Fig. 3. Secondary structure diagram of CALB. The assignment of secondary structure was carried out with the program DSSP [53]. Helices α2, α9 and α10 all have short regions where the hydrogen bonding pattern for helices is broken and the direction of the helix changes.

Two exceptions involve the antiparallel connection between strands β1 and β2, and the last two strands of the sheet which form a right-handed β-loop-β motif. Another β-sheet-like region is found in the last 12 residues of the protein that form a hydrogen bonded pair of strands with a type I β-hairpin connection. There are 10 α-helices in the structure. The first helix is found immediately before the first strand. Four helices connect neighbouring β-strands: α3, α4 and α7 on one side of the sheet and α2 on the other side. Helices α5, α6 and α10 make up most of the active site pocket and are likely to be important in interfacial activation and substrate specificity.
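The strand topology quoted in the description of the molecule can be unpacked into a spatial strand order. The sketch below assumes the connection string reads +2, -1x, +2x, +1x, +1x, +1x (six connections for seven strands) and assumes the usual convention that each number is a signed step in sheet positions, with a trailing "x" marking a crossover connection that keeps consecutive strands parallel.

```python
def sheet_layout(connections):
    """Return (strand order across the sheet, direction of each strand).

    A token without 'x' is a hairpin-type connection and reverses the
    strand direction; a token with 'x' is a crossover and preserves it.
    """
    position, direction = 0, 1
    positions = {1: position}
    directions = {1: direction}
    for strand, token in enumerate(connections, start=2):
        crossover = token.endswith("x")
        position += int(token.rstrip("x"))
        if not crossover:
            direction = -direction
        positions[strand] = position
        directions[strand] = direction
    if len(set(positions.values())) != len(positions):
        raise ValueError("two strands were assigned the same sheet position")
    order = sorted(positions, key=positions.get)
    return order, directions


# Assumed reading of the topology string from the text.
order, directions = sheet_layout(["+2", "-1x", "+2x", "+1x", "+1x", "+1x"])
print(order)       # [1, 3, 2, 4, 5, 6, 7]
print(directions)  # strand 1 points one way, strands 2 to 7 the other
```

Under these assumptions the order 1, 3, 2, 4 for the first four strands and the six mutually parallel strands both agree with the description given in the text.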

The active site

In CALB, a serine triad is found at the carboxy-terminal edge of the parallel β-sheet. This suggests that this enzyme has the same reaction mechanism as the other lipases that have been studied to date. The catalytic triad is made up of Ser105, Asp187 and His224 and, therefore, shares the sequential order of catalytic residues of all lipases and α/β-hydrolases for which structures have been determined. The catalytic serine is located in the tight turn between β4 and the following helix, α4, and has a similar conformation to that observed in other lipases. This is a strained conformation, with φ and ψ values of 53.4° and -126.9° respectively, that lie outside the most favoured regions found in proteins. The sequence around the serine is usually the only conserved region found in lipases. A consensus sequence GxSxG exists for lipases, as well as for many of the α/β-hydrolases [23]. The X-ray structures have shown how the tight turn in this region brings the Cα atoms of the two conserved glycines into close contact with each other, leaving no space for side chain atoms. In CALB, this consensus is broken by the sequence TWSQG. As shown in Fig. 4, the relative orientation of the strand and the helix is different in CALB compared with the other lipases. The helix is slightly more bent away from the strand, providing extra space for the threonine residue that lies in the middle of β4. A similar situation is found in the α/β-hydrolase enzyme haloalkane dehalogenase [24], where the first glycine in the consensus sequence is replaced by a valine. One can only speculate whether the substitution G→T/V is the cause of the helix movement or a result of it.
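The backbone torsion angles quoted for the catalytic serine are dihedral angles over four consecutive backbone atoms: φ is defined by C(i-1), N, Cα, C and ψ by N, Cα, C, N(i+1). A minimal sketch of the calculation follows; the coordinates in the usage lines are invented test points chosen to give simple angles, not atoms from the CALB model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Invented points: the outer bonds are perpendicular when viewed down p1-p2.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1)))   # -90.0
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, -1)))  # 90.0
```

Applied to a coordinate file, the same function would be called once per residue for φ and once for ψ, after picking out the four atoms named above.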

The active site histidine, His224, is located at the beginning of the helix α9, such that the side chain projects into the active site. The active site aspartic acid, Asp187, is found in a turn after the sixth strand, as expected for a member of the α/β-hydrolase fold. The side chain oxygen atoms of Asp187 form hydrogen bonds to main chain and side chain atoms as well as to a buried water molecule (Fig. 5). The next residue is a glutamic acid. The same pair of residues is also found in GCL, CRL and acetylcholine esterase [25] where the glutamic acid is part of the catalytic triad. In CALB, the glutamic acid points away from the active site into the surrounding solvent and has no obvious functional role.

The region around Ser105 is remarkably polar in nature. In addition to His224, there are three residues (Thr40, Asp134 and Gln157) that have polar side chain atoms within 5 Å of the Oγ of the catalytic serine (Fig. 6). The amide group of the side chain of Gln157 is involved in three hydrogen bonds. A close contact to the carbonyl oxygen of Ser153 leads us to the suggested acceptor/donor assignments implied in Fig. 6. Thr40, therefore, accepts a hydrogen bond and the protonated carboxylate of Asp134 donates a hydrogen bond to Gln157. The three polar residues form a hydrogen bond network that is fully accessible to the solvent. This may impose restrictions on how amphipathic lipid substrates can be oriented in the active site pocket by requiring that they make favourable hydrogen bonds to polar protein atoms.

One such set of interactions is likely to be the formation of an oxyanion hole. From the enzymology of serine proteases, it is known that the negative charge of the tetrahedral substrate intermediate must be stabilized by hydrogen bonds from the enzyme [26]. A similar oxyanion stabilization appears to be present in lipases [2,3]. In the liganded structures of HPL and RML, probable oxyanion hole interactions have been identified, based on hydrogen bonds from the protein to the oxyanion intermediate-like inhibitor [10,11]. These open forms of HPL and RML are very similar to each other in this functionally important region. Two main chain nitrogens form hydrogen bonds to the ligand in both structures. The first hydrogen bond donor is the residue following the active site serine and the other is at the carboxy-terminal end of β2. In RML, the latter residue, Ser82, also contributes a third hydrogen bond through its hydroxyl group. We have aligned CALB with these structures and have found a strong resemblance in this region (Fig. 7). Two backbone nitrogens, in residues Gln106 and Thr40, are present at positions equivalent to those in RML and HPL. Serine 82 from RML has a direct counterpart in Thr40 from CALB with side chain conformations such that their hydroxyls are closely superimposable (Fig. 7). The hydrogen bond assignments in Fig. 6 allow the hydroxyl of Thr40 to act as a hydrogen bond donor to the oxyanion without rearrangement. An oxyanion hole stabilized by three hydrogen bonds has also been identified in cutinase [27].

Fig. 4. The superposition of CALB (green), Rhizomucor miehei lipase (magenta), Geotrichum candidum lipase (red) and human pancreatic lipase (purple) around the active site serine. Only the Cα atoms of the β-strand were used in this alignment. Thr103, a buried water and Ser105 in CALB are also shown.

Fig. 5. A stereo picture showing density in a 2Fobs - Fcalc map around the catalytic triad at 1.55 Å resolution. A buried water molecule is tightly associated with the catalytic residue Asp187 through a hydrogen bond. Carbon atoms are shown in green, nitrogens in purple and oxygens in red.

The residue immediately before Thr40 in CALB is a glycine, which is a structurally conserved residue in lipases and most of the α/β-hydrolases. A side chain at this position in CALB would create close contacts between its Cβ and the Cβ of the active site serine (3 Å) and possibly disturb the oxyanion hole interactions.
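Statements such as "within 5 Å of the Oγ" or a 3 Å close contact amount to simple cutoff tests on atomic coordinates. The sketch below shows the idea; the coordinates and the atom labels are invented for illustration and are not taken from the CALB model.

```python
import math


def atoms_within(center, atoms, cutoff):
    """Return (name, distance) for atoms no further than cutoff from center."""
    hits = []
    for name, xyz in atoms.items():
        distance = math.dist(center, xyz)
        if distance <= cutoff:
            hits.append((name, round(distance, 2)))
    return sorted(hits, key=lambda item: item[1])


# Hypothetical coordinates in Å, with the reference atom at the origin.
reference = (0.0, 0.0, 0.0)
neighbours = {
    "polar atom A": (3.0, 0.0, 0.0),
    "polar atom B": (0.0, 4.0, 0.0),
    "polar atom C": (0.0, 0.0, 4.5),
    "distant atom": (6.0, 0.0, 0.0),
}

print(atoms_within(reference, neighbours, 5.0))
# [('polar atom A', 3.0), ('polar atom B', 4.0), ('polar atom C', 4.5)]
```

For a full model a spatial index would be faster, but a linear scan is adequate for inspecting the environment of a single atom.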


Fig. 6. A stereo picture of the catalytic triad residues and nearby polar residues, Asp134, Gln157 and Thr40, that form a hydrogen bonding network with the solvent in the active site cavity. Colour scheme as for Fig. 5. At the top of the picture is the most likely candidate for a lid in CALB and the side chains that form stabilizing hydrogen bonds in this region. The residues that are disordered in one of the molecules of the monoclinic crystal form in the high pH structure are shown in magenta.

Fig. 7. A stereo picture of the RML-phosphonate inhibitor complex and an alignment with CALB in this region. All residues believed to make up the oxyanion hole have a similar conformation in the two enzymes. Hypothetical hydrogen bonds from the inhibitor to CALB are indicated by dashed lines. RML is shown in black, CALB in the colour scheme used for Fig. 5.

The active site pocket and lid

In the structures of CALB, the active site is accessible to external solvent through a narrow channel (Fig. 8). It is approximately 10 Å × 4 Å wide and 12 Å deep, as measured from the Oγ of Ser105 to the surface. Most of the channel is formed by three parts of the structure: helices α5 and α10 and a loop region which projects Ile189 into the channel. The channel walls are very hydrophobic and are lined with mostly aliphatic residues. No aromatic side chains are found in the channel except Trp104, which precedes the catalytic serine in the sequence. The side chain nitrogen of this residue makes a hydrogen bond with the backbone carbonyl oxygen of the active site histidine and stabilizes this region. In other lipases, this residue is often a histidine with a similar hydrogen bond to the active site histidine backbone.

The accessible active site suggests that the enzyme has adopted a conformation close to an activated state in both crystal forms. Therefore, we cannot be certain which part of the protein, if any, functions as a lid to control entry to the active site. The most likely candidate is the short helix α5. In the monoclinic crystal form, α5 is disordered in one of the molecules, suggesting a region of high mobility that may undergo conformational changes of importance for lipid binding and catalysis. In the open form of the enzyme we observe a surface of aliphatic residues from α5 that lines the channel leading into the active site. The helix also has another surface which makes contact with the interior of the protein. The most striking feature of this region is a buried aspartic acid side chain, Asp145, which makes stabilizing hydrogen bonds with the side chains of Ser150 and Thr158 (Fig. 6). The long helix at the carboxy-terminal end of the structure, α10, is another possible candidate for changing the accessibility of the active site. This helix is dominated by alanines and other hydrophobic residues on all sides and is kinked in the middle at a proline residue. It has no hydrogen bonds to the rest of the protein, which suggests it may be relatively mobile. It also displays higher main chain temperature factors than the rest of the structure. A conformational change of α10 could change the size and shape of the active site channel and the surrounding enzyme surface.

Fig. 8. The active site pocket in its open conformation from the orthorhombic model. (a) View from above and (b) as a cross section. A solvent accessible surface was calculated with VOIDOO [54] using a 1 Å probe radius.

Why the enzyme crystallizes in an open conformation is not clear, but this form has most likely been stabilized by the crystallization medium. In the orthorhombic crystal form, we believe the open form is stabilized in part by Leu199 from a symmetry-related molecule which points into the active site channel and keeps it open. In the monoclinic form, density for a detergent-like molecule can be seen at the entrance to the active site which may also prevent the lid from closing (see the section describing the monoclinic crystal form). In both HPL and RML, a tryptophan side chain covers the active site in their inactive forms. There is no equivalent aromatic side chain present in CALB.

The external hydrophobic surface

CALB has a large hydrophobic surface surrounding the entrance of the active site channel (Fig. 8). It has an approximate area of 450 Å² and is probably in close contact with a lipid surface during hydrolysis. It is nearly triangular in shape and is slightly concave. The surface is dominated by side chains from aliphatic residues, oriented towards the solvent. The surface displays only two carboxyl residues, Asp223 and Glu188, that are close to each other and near the entrance of the active site pocket. On the opposite side of the entrance, there is a lysine residue that shows high mobility in the monoclinic crystal form. In this crystal form, the surface plays an important role in the crystal packing (see below). In both crystal forms, this hydrophobic surface interacts with neighbouring enzyme molecules.

Disulphide bridges and glycosylation

There are six cysteine residues in the sequence of CALB. The crystallographic work shows that they are all involved in disulphide bonds. The first is found between Cys22 and Cys64 and connects the antiparallel first and third strands where the first strand ends and begins a short loop to the second strand. The second bridge is formed between Cys216 and Cys258 and connects two loop regions at the surface of the protein. The third disulphide connects Cys293 and Cys311 and stabilizes the carboxy-terminal end of the enzyme.

The amino acid sequence suggests one possible N-glycosylation site, with the characteristic sequence NxT, at Asn74. This asparagine, which is followed by an aspartic acid and a threonine, is indeed the only residue where the electron density indicates a glycosylation. Two N-acetylglucosamine molecules have been built into the density in the orthorhombic crystal form (Fig. 9). In the monoclinic crystal form, glycosylation can be seen for only one of the two lipase molecules in the asymmetric unit. The carbohydrate molecules are located in a loop region after the third strand and point into the surrounding solvent. The innermost carbohydrate unit makes hydrogen bonds with side chains from residues Gln11 and Asp75 and with two well-ordered water molecules. The outermost molecule has no clear hydrogen bonds. There are no visible interactions between the carbohydrates and neighbouring lipase molecules.
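Candidate N-glycosylation sites of the kind described above can be listed by scanning a sequence for the sequon. The sketch uses the commonly stated form N-x-S/T with x not proline, which is slightly broader than the NxT written in the text; the fragment is invented apart from the N-D-T triplet that the text describes for Asn74.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence: str):
    """Return 1-based positions of asparagines that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


fragment = "GANDTLA"  # invented flanks around the N-D-T triplet from the text

print(sequon_positions(fragment))   # [3]
print(sequon_positions("GANPTLA"))  # []  (proline at x blocks the sequon)
```

A sequence match only marks a possible site; as the text notes, occupancy has to be confirmed from the electron density.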

Multiple conformations

The high resolution refinement has revealed a few side chains that adopt multiple, discrete conformations. The Oγ atom of Ser26 assumes two different positions. In Ile87, the density indicates that the Cδ1 has two different positions and, similarly, Leu144 has two conformations for the δ-carbons. The density for the Oγ atom of Ser105 in the active site is somewhat extended and the atom has a higher temperature factor than comparable atoms in this region, indicating some mobility which might be of functional interest. A few residues show weak density for their side chains, suggesting high mobility; the most prominent of these are Arg242, Arg249 and Glu269 (this is also apparent in their real space fit values shown in Fig. 10). The average main chain temperature factors show the usual variation expected for loop regions (e.g. between α9 and α10) and exposed secondary structure elements (e.g. amino-terminal part of α9).

Fig. 9. A 2Fobs - Fcalc map around the N-glycosylation site in CALB in the orthorhombic crystal form. Two N-acetylglucosamine molecules have been built in the density. The average temperature factor for the carbohydrate atoms is 27 Å² in the orthorhombic model.
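Temperature factors can be read as displacements through the standard isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square atomic displacement. A small sketch, assuming isotropic B-factors, converts the 27 Å² average quoted for the carbohydrate atoms into a root-mean-square displacement.

```python
import math


def rms_displacement(b_factor: float) -> float:
    """Root-mean-square displacement in Å for an isotropic B-factor in Å²."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Average temperature factor quoted for the carbohydrate atoms.
print(round(rms_displacement(27.0), 2))  # 0.58
```

The square-root dependence means that a fourfold increase in B corresponds to only a doubling of the displacement, which is worth keeping in mind when comparing loops with the protein core.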


Fig. 10. Plots of the real space fit for all atoms (solid lines) and average temperature factors for main chain atoms (dashed lines) of the current orthorhombic model as a function of residue number. The scale on the left shows the real-space correlation coefficient and the scale on the right shows B-factor values in Å². For the real space fit, a 2Fobs - Fcalc map was used. All 317 residues are visible in the electron density map and have been modelled.
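A real-space correlation coefficient of the kind plotted in Fig. 10 compares map density with model density at the grid points around a residue. The sketch below uses a plain Pearson correlation; this is an assumption about the general form only, and the exact definition and grid selection used by the refinement software may differ. The density values are invented.

```python
import math


def real_space_correlation(observed, calculated):
    """Pearson correlation between two equally long lists of density values."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        raise ValueError("density values must not be constant")
    return cov / math.sqrt(var_o * var_c)


# Invented density samples at the same grid points (arbitrary units).
map_density = [0.2, 0.9, 1.5, 1.1, 0.3]
model_density = [0.1, 1.0, 1.4, 1.2, 0.2]

print(real_space_correlation(map_density, model_density))
```

Because the correlation is insensitive to overall scale, a weak but correctly shaped density still scores well, so low values point to misplaced or disordered atoms rather than to low occupancy alone.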

Solvent molecules

In the current orthorhombic model, 286 water molecules have been included. Of these only seven are completely buried, lacking contact with external solvent molecules. Of special interest is the water molecule bound to an Oδ atom of the active site residue Asp187 (Fig. 5). In cutinase, a water molecule is also hydrogen bonded to the active site aspartic acid [27]. As mentioned earlier, one water molecule is hydrogen bonded to Thr103 and forms part of the turn around the active site serine, but there is no clear electron density for a water molecule at the proposed oxyanion hole of CALB. Since the active site is exposed to the solvent continuum, it is not surprising that a number of solvent molecules are also found in the active site pocket. Two water molecules form a bridge between the hydroxyl of Ser105 and the carbonyl oxygen of Thr40. Two well-ordered water molecules are packed between helices α5 and α10. These are present in both crystal forms and are absent only in molecule B of the high pH monoclinic form, where the displaced helix α5 changes the hydrogen bond network.

Given the hydrophobic character of the entrance to the active site, it cannot be ruled out that isopropanol rather than water molecules are responsible for some of the density peaks.

Crystal contacts in the orthorhombic crystal form

In the orthorhombic form, the lipase molecule has crystal contacts with 12 symmetry-related molecules. The majority of contacts are of a polar nature, with many water-mediated hydrogen bonds. Only eight direct hydrogen bonds have been found between protein molecules. The hydrophobic surface is packed against a neighbouring molecule. This interaction is highly hydrophobic and only one protein-protein hydrogen bond can be located. In this region, the side chain of Leu199 from a symmetry-related molecule points into the active site and could be partly responsible for stabilization of the open active site channel.

The monoclinic crystal forms

Several native data sets have been collected on the monoclinic crystal form at different pH values. The striking feature of this crystal form is the packing of the two molecules in the asymmetric unit (designated molecules A and B) such that the large hydrophobic surface around the active site pocket of one molecule packs against the corresponding surface of the other molecule (Fig. 11). Both molecules have an open conformation with the active site accessible from the outside, just as in the orthorhombic form. In the monoclinic crystal form, however, density for a lipid-like molecule has been located at the entrance to the active site (Figs 11 and 12). This is most likely a β-octyl glucoside detergent molecule, since no other lipid compound was added in the crystallization experiment. The detergent molecules are most clearly visible in the maps calculated from data collected at pH 3.6. When the pH is raised to 5.5, a dramatic change takes place in the crystal which includes a reduction in the length of the unit cell a-axis by 2 Å. The most drastic structural change is seen in the active site of molecule B. There is no longer any density for the detergent molecule nor for most of the residues forming α5. This effect is not seen in molecule A, where the detergent molecule remains clearly visible.

The detergent molecules are in contact with both en-zyme molecules (Fig. 11). The lipid tail projects intothe active site pocket of one molecule while the carbo-hydrate portion has polar interactions with the otherenzyme molecule. The electron densities for the car-bohydrate moiety of both molecules are rather poorand do not allow us to accurately position them. In ourcurrent model, there are two hydrogen bonds from thesugar hydroxyls to main chain atoms of residues Val221and Asp223.

The hydrophobic surfaces of the two molecules are separated at many places by a layer of solvent molecules and no direct hydrogen bonds exist between the two protein molecules. We believe that much of the observed electron density may be occupied by more hydrophobic molecules of the crystallization mixture, either disordered β-octyl glucoside or isopropanol molecules. This conclusion is based on the observation that much of the solvent density is rather large and packed against hydrophobic side chains that do not allow for effective hydrogen bonding. These solvent molecules have so far been modelled as waters.


302 Structure 1994, Vol 2 No 4

Fig. 11. A stereo plot of the asymmetric unit in the monoclinic crystal form at pH 3.6 showing how one molecule packs its hydrophobic surface against that of another, thereby minimizing their exposure to the surrounding solvent. Molecule A in red, molecule B in green, waters and β-octyl glucoside molecules in purple. The molecules are related by an almost exact two-fold rotation.

Fig. 12. Stereo picture of an Fobs - Fcalc map around the lipid molecule, most likely β-octyl glucoside, in the monoclinic crystal form. The lipid part of the molecule points into the active site pocket of molecule A and the carbohydrate moiety forms hydrogen bonds to molecule B. The map is contoured at 2σ.

Comparison with enzymes of similar structure

In recent years, a number of structures have been determined that share a common overall motif, called the α/β-hydrolase fold [12]. The structure of CALB contains a subset of this fold. It has the same connectivity of the β-sheet as observed in other members of this group, with the characteristic non-sequential alignment of the first four strands in the sheet, β1, β3, β2, β4. Since all connections are of the right handed type, the crossovers make possible an equal distribution of helices on both sides of the β-sheet. CALB is different from the other α/β-hydrolases in having only seven strands. All other enzymes in the group have at least one extra strand at the amino-terminal part of the sheet. Many parts of CALB can be superimposed with equivalent residues in other α/β-hydrolase enzymes, although the sequence identity is very low (Table 1). The overall structural alignments of CALB with RML and HPL are poorer. This can be partly explained by their slightly different secondary structure topology that deviates from the α/β-hydrolase fold [23].

Two crystal forms of lipase B from Candida antarctica Uppenberg et al. 303

The active site residues in CALB are located at the same positions in the structure as defined by the α/β-hydrolase fold. The catalytic triad in CALB can be superimposed on the triad of carboxypeptidase II (CPII) [28], for example, with a root mean square deviation (rmsd) of 0.96 Å for all non-hydrogen atoms. The same comparison with RML gives an rmsd of 0.68 Å (Fig. 7). The triad in HPL is less similar because of the different topological origin of the aspartic acid.

The active site serine in CALB is surrounded by many polar residues (Fig. 6). In the activated form of RML, there is an aspartic acid residue in the same region as Gln157 in CALB. In the open form of pancreatic lipase, no polar side chains can be found in this region, which is much more hydrophobic. The unusual buried polar cluster found in a number of fungal lipases [29] is not apparent in CALB.

Biological implications

Lipases are able to hydrolyze triglycerides at an oil-water interface, where their activity is drastically increased. Although Candida antarctica lipase B (CALB) is not as efficient as other lipases in hydrolyzing triglycerides, this enzyme is of particular interest because it displays strong stereospecificity on chiral substrates during hydrolysis or organic synthesis.

Here we report the sequence and structure of CALB. The structure has many features in common with other lipases. It is built up from a subset of the α/β-hydrolase fold and contains a Ser-His-Asp active site triad. In the present crystal forms, a rather narrow and deep channel leads into an open active site that contains an oxyanion hole. The shape of the channel probably accounts for the enzyme's stereospecificity. The lipase crystal structures that have appeared in recent years indicate that activation at the interface may be caused by a conformational change that exposes the active site of the enzyme. A putative lid has been identified based on the observed mobility of a short α-helix (α5). The long carboxy-terminal helix (α10) may also play an important role since it has no hydrogen bonds to other parts of the structure and interacts mainly through hydrophobic side chains. However, we cannot rule out the existence of a closed form of CALB resulting from a conformational change affecting a larger part of the structure.

The relatively low activity of the enzyme on large triglyceride substrates and the easy adoption of an open conformation suggest that CALB may be an intermediate between an esterase (which hydrolyzes water soluble substrates) and a true, interfacially activated lipase.

Materials and methods

Cloning and sequencing of the C. antarctica lipase B gene

The amino-terminal protein sequence of C. antarctica lipase B was initially determined as LPSGSDPAFSQPKSVLDAGLTNEG. Two slightly degenerate oligonucleotide probes, NOR929 [CCCTC GTT(C/G) GT(C/G)A GGCC(C/G) GCGTC (C/G)AGCACCGA CTTGG GCTG] and NOR930 [CC(C/G)T CGGGCTCGGA CCC(C/G) GC(C/G/T) TTCTT CTCGC AGCCC AAG], were synthesized on the basis of the extremely biased codon use in a gene from the same organism, which had previously been cloned and sequenced (MTH, unpublished data). Total DNA from C. antarctica LF058 was isolated after grinding in a mortar with quartz essentially by the method of Yelton et al. [30]. Purified DNA was partially digested with Sau3A and fragments in the range of 3-10 kb were isolated after agarose electrophoresis. These were ligated into plasmid pBR322, which had been cut with BamHI and dephosphorylated using standard procedures [31]. An Escherichia coli MC1000 restriction deficient derivative with ampicillin resistance was transformed with the plasmids. The colonies were replicated onto filters and screened by hybridization to 32P-labelled probes as previously described [32]. Replica filters were screened with labelled probes NOR929 and NOR930. Seven colonies were identified, which hybridized to both probes after washing at 55°C in 6 x SSC (1 x SSC is 150 mM NaCl and 15 mM sodium citrate, pH 7.0). The hybridizing plasmids were shown to contain overlapping inserts. From one original insert of 7.8 kb, a 2.1 kb subclone was chosen for sequencing. Nested deletions were made from one end of the gene using the exonuclease III-based Erase-a-Base System (Promega Corporation). The sequence in one direction was determined from the deletion plasmids by the dideoxy method [33] using

Table 1. A structural comparison of CALB with lipases and α/β-hydrolase enzymes.

Enzyme   Number of equivalent   Percent of        Rms Cα (Å)   PDB entry
         residuesᵃ              smaller enzyme

RML      79 (12)                29                1.8          3TGL
HPLᵇ     114 (9)                36                1.8          -
GCL      134 (12)               42                2.0          1THG
AChE     143 (12)               45                2.0          1ACE
HAD      115 (17)               36                2.0          2HAD
CPII     131 (13)               41                2.1          2SC2

The alignment was determined with O's lsq_improve option, where each pair of equivalent atoms is separated by less than 3.8 Å after superposition. ᵃThe number of sequence identities in the set of structurally equivalent residues is in parentheses. ᵇThe human pancreatic lipase coordinates were kindly provided by Dr F Winkler, Hoffman-La Roche, Basel. Enzyme name abbreviations: RML = Rhizomucor miehei lipase, HPL = human pancreatic lipase, GCL = Geotrichum candidum lipase, AChE = acetylcholine esterase, HAD = haloalkane dehalogenase, CPII = carboxypeptidase II.
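The two quantities reported in Table 1, the number of equivalent residues and their rms Cα distance, can be illustrated with a minimal Python sketch. It assumes that the two structures have already been superposed and that candidate Cα atoms have already been paired; it is not the lsq_improve procedure in O, which also refines the superposition. The function names and the coordinates are hypothetical and serve only as an illustration.

```python
import math

CUTOFF = 3.8  # Å, the pair-distance limit quoted in the Table 1 footnote


def equivalent_distances(pairs, cutoff=CUTOFF):
    """Return the distances of the paired Ca atoms that lie closer than cutoff.

    pairs is a list of ((x, y, z), (x, y, z)) tuples taken from two
    structures that are assumed to be superposed already.
    """
    distances = [math.dist(a, b) for a, b in pairs]
    return [d for d in distances if d < cutoff]


def rms(distances):
    """Root mean square of a list of distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def percent_of_smaller(n_equivalent, n_residues_1, n_residues_2):
    """Equivalent residues as a percentage of the smaller of the two enzymes."""
    return 100.0 * n_equivalent / min(n_residues_1, n_residues_2)


# Made-up coordinates: three candidate pairs, one of which is too far apart.
pairs = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),  # 1.0 Å apart, accepted
    ((3.8, 0.0, 0.0), (3.8, 2.0, 0.0)),  # 2.0 Å apart, accepted
    ((7.6, 0.0, 0.0), (7.6, 0.0, 5.0)),  # 5.0 Å apart, rejected
]
kept = equivalent_distances(pairs)
print(len(kept), round(rms(kept), 2))

# 317 is the CALB chain length given in the text; 500 is a placeholder
# length for a larger enzyme, not a value taken from the paper.
print(round(percent_of_smaller(134, 317, 500)))
```

With 317 residues for CALB, the GCL row (134 equivalent residues, 42 %) is consistent with this definition of the percentage.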


Table 2. Data collection and heavy atom refinement statistics.

Compound          Resolution  No. of unique  No. of        Completeness  Rmerge  Rdiff  Heavy atom  Number    Phasing  Rcullis
                  (Å)         reflections    observations  (%)           (%)     (%)    conc. (mM)  of sites  power    (%)

Monoclinic crystal form
Native (pH 3.6)   2.1         31323          58582         86            3.0
Native (pH 5.5)   2.5         18925          48771         94            5.6
UO2Cl2            3.5         7132           22324         97            4.6     18.0   3           12        2.3      59
UO2(NO3)2         3.5         5857           16819         80            4.2     21.3   10          15        2.3      57
CH3PbCH3COO       3.5         6847           19926         94            4.0     11.1   2           8         1.4      78
K2PtCl4(1)        3.5         6652           12216         91            4.1     12.9   25          11        1.0      81
K2PtCl4(2)        3.5         6388           12745         88            2.6     6.5    3           11        1.1      83
Overall figure of merit: 68 % (3.5 Å)

Orthorhombic crystal form
Native            1.55        37486          140416ᵃ       95            4.0ᵇ
UO2(CH3COO)2      3.0         4042           15030         69            6.3     18.0   55          9         2.1      58
Hg(CH3COO)2       3.0         5195           14809         91            3.6     12.7   10          8         1.1      75
K2PtCl4           3.0         5379           14875         93            4.5     10.4   30          1         1.0      79
Overall figure of merit: 54 % (3.0 Å)

ᵃThe observations are for the high resolution data set only (see Methods). ᵇThe overall Rmerge for the native data has been calculated for reflections between 100 and 1.8 Å (see Methods). $R_{merge} = \sum |I_j - \langle I_j \rangle| / \sum \langle I_j \rangle$, where $I_j$ is the intensity of an observation of reflection j and $\langle I_j \rangle$ is the average intensity for reflection j. $R_{diff} = \sum |I_{nat} - I_{der}| / \sum \langle I \rangle$, where $I_{nat}$ is the intensity for the native reflection, $I_{der}$ is the intensity for the derivative reflection and $\langle I \rangle$ is the average intensity of $I_{nat}$ and $I_{der}$. $R_{cullis} = \sum ||F_{PH} - F_P| - F_H(calc)| / \sum |F_{PH} - F_P|$ for centric reflections, where $F_{PH}$ and $F_P$ are the structure factors for derivative and native data respectively and $F_H(calc)$ is the calculated structure factor for the heavy atom contribution.
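The Rmerge definition in the footnote to Table 2 can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used by the data processing programs cited in the Methods. It assumes that symmetry-equivalent observations have already been reduced to a common index, and that the denominator sums the mean intensity once per observation; the function name and the intensities are made up.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge from (hkl, intensity) observations.

    Rmerge = sum |I_j - <I_j>| / sum <I_j>, where <I_j> is the mean
    intensity of all observations of reflection j.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += mean * len(intensities)
    return numerator / denominator


# Hypothetical observations: two measurements of one reflection and
# three of another.
observations = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 56.0),
]
print(f"Rmerge = {100 * r_merge(observations):.1f} %")
```

Rdiff follows the same pattern, with the native and derivative intensities of each reflection taking the place of repeated observations.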

the Sequenase DNA sequencing kit (United States Biochemical Corporation). From this, oligonucleotide primers were made for sequencing the opposite strand. Due to the high CG content (63 %), several areas of severe compression were seen. To elucidate these areas, part of the gene was also sequenced using ITP (inosine-5'-triphosphate) rather than GTP (guanosine-5'-triphosphate) in the sequencing reactions.
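The degeneracy of the two probes and the GC content of a sequence are simple to compute. The following Python sketch expands the bracket notation used for NOR929 and NOR930 in the text; the helper names are hypothetical, and the probe strings are copied from the text with spaces removed, so any transcription error in the printed sequences carries over.

```python
import re
from itertools import product


def parse_degenerate(probe):
    """Split bracket notation such as 'GTT(C/G)GT' into per-position choices."""
    compact = re.sub(r"[\s-]", "", probe)
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", compact)
    return [alt.split("/") if alt else [base] for alt, base in tokens]


def degeneracy(probe):
    """Number of distinct sequences represented by a degenerate probe."""
    count = 1
    for choices in parse_degenerate(probe):
        count *= len(choices)
    return count


def expand(probe):
    """List every sequence represented by a degenerate probe."""
    return ["".join(combo) for combo in product(*parse_degenerate(probe))]


def gc_fraction(sequence):
    """Fraction of G and C bases in a plain (non-degenerate) sequence."""
    return sum(base in "GC" for base in sequence) / len(sequence)


NOR929 = "CCCTC GTT(C/G) GT(C/G)A GGCC(C/G) GCGTC (C/G)AGCACCGA CTTGG GCTG"
NOR930 = "CC(C/G)T CGGGCTCGGA CCC(C/G) GC(C/G/T) TTCTT CTCGC AGCCC AAG"

for name, probe in (("NOR929", NOR929), ("NOR930", NOR930)):
    variants = expand(probe)
    mean_gc = sum(gc_fraction(v) for v in variants) / len(variants)
    print(name, len(parse_degenerate(probe)), degeneracy(probe), round(mean_gc, 2))
```

As written in the text, NOR929 represents 16 sequences and NOR930 represents 12, which is what "slightly degenerate" amounts to here.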

X-ray crystallography

Protein for the X-ray structure determination of CALB was purified from the native organism as described previously [34]. The crystallization of CALB in five different space groups has been published elsewhere [21] and will only be summarized here for the relevant crystal forms.

The protein was crystallized at room temperature with the hanging drop method [35]. The reservoir contained 20 % polyethylene glycol 4000, 50 mM sodium acetate buffer, pH 3.6 and 10 % isopropanol. The protein concentration before any additions was 10 mg ml⁻¹. The protein solution was mixed with an equal volume from the reservoir. β-octyl glucoside was added to reach a final concentration of 0.6 %. Two different crystal forms grew under identical conditions. The monoclinic crystal form belonged to space group P2₁ and diffracted to 2.0 Å resolution. The crystals had cell constants a = 69.2 Å, b = 50.5 Å, c = 86.7 Å and β = 101.5°. The second crystal form grew as smaller crystals which were more elongated in shape. They were orthorhombic with space group P2₁2₁2₁ and diffracted to 1.8 Å with a rotating anode X-ray source and to 1.5 Å at a synchrotron source. The cell dimensions were a = 62.1 Å, b = 46.7 Å, c = 92.1 Å. Both crystal forms were used in the MIR determination of the structure. The data collection statistics are summarized in Table 2. In order to increase the chances of heavy-atom binding, the heavy-atom compounds were dissolved in a solution similar to the crystallization reservoir, but with the pH raised to 5.5. In the case of the monoclinic crystal form, this led to significant changes in the cell constants, a = 67.0 Å, b = 50.5 Å, c = 86.7 Å and β = 100.1°. The diffraction limit for these modified crystals was 2.5 Å.
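The effect of the pH change on the monoclinic cell can be made concrete by computing unit cell volumes from the cell constants quoted above. The Python sketch below uses the standard expressions V = abc sin β for a monoclinic cell and V = abc for an orthorhombic cell. The number of molecules per cell is an assumption derived from the space groups (two asymmetric units in P2₁, four in P2₁2₁2₁), from the two molecules per asymmetric unit in the monoclinic form, and from one molecule per asymmetric unit in the orthorhombic form, the last being inferred from the atom counts of the refined model rather than stated explicitly.

```python
import math


def monoclinic_volume(a, b, c, beta_deg):
    """Unit cell volume in Å^3 for a monoclinic cell, V = a b c sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def orthorhombic_volume(a, b, c):
    """Unit cell volume in Å^3 for an orthorhombic cell, V = a b c."""
    return a * b * c


v_low_ph = monoclinic_volume(69.2, 50.5, 86.7, 101.5)   # pH 3.6
v_high_ph = monoclinic_volume(67.0, 50.5, 86.7, 100.1)  # pH 5.5
v_ortho = orthorhombic_volume(62.1, 46.7, 92.1)

# Assumed molecules per unit cell: 2 asymmetric units x 2 molecules in P2(1),
# 4 asymmetric units x 1 molecule in P2(1)2(1)2(1).
MOLECULES_PER_CELL = 4

for label, volume in (
    ("monoclinic, pH 3.6", v_low_ph),
    ("monoclinic, pH 5.5", v_high_ph),
    ("orthorhombic", v_ortho),
):
    per_molecule = volume / MOLECULES_PER_CELL
    print(f"{label}: V = {volume:.0f} Å^3, {per_molecule:.0f} Å^3 per molecule")

shrinkage = 100.0 * (v_low_ph - v_high_ph) / v_low_ph
print(f"cell volume change on raising the pH: {shrinkage:.1f} %")
```

Under these assumptions the orthorhombic form is the more tightly packed of the two, in line with its higher diffraction limit.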

Native and derivative data were collected for both space groups at 20°C on an SDMS Mark III multiwire area detector [36], mounted on a Rigaku rotating anode operating at 50 kV and 90 mA. The SDMS software was used for collecting the data and the stored images were processed with MADNES [37], followed by profile fitting in PROCOR [38]. Subsequent treatment of the data was mainly performed with the CCP4 program package [39]. A high resolution data set of the orthorhombic crystal form was collected at 20°C on beamline X11 at the EMBL outstation at DESY, Hamburg using a Mar-Research image plate. This data set was processed with DENZO [40]. The low resolution data were not used in this data set since many strong reflections were overexposed on the image plate. Instead the high resolution data were merged with the SDMS data set. The R-merge between the two data sets in the overlapping resolution shell, 2.5-2.0 Å, was 9.6 %.

The monoclinic crystals were used initially in the search for heavy-atom derivatives. Difference Patterson maps gave a few outstanding peaks, which led to the identification of the major heavy-atom sites in the UO2Cl2 and K2PtCl4 data sets. Further sites were found by inspection of difference Fouriers. The heavy-atom parameters were refined with MLPHARE [41]. The two uranyl data sets and the methyl lead acetate data set had most of their sites in common. They were, therefore, always refined separately since joint refinement of these gave unreasonably high figures of merit, without improving the resulting MIR map. Anomalous data from the uranyl data sets were included in the phasing. MIR phases were calculated to 3.5 Å resolution and further improved by solvent flattening, histogram matching and application of Sayre's equation, using SQUASH [42]. An electron



density map was calculated at this point and the skeletonized map showed many of the secondary structure elements. Two enzyme molecules could be identified in the asymmetric unit. These were superimposed manually in O [43] to give an approximate transformation operator. This operator was further improved with the rt-improve program and several cycles of two-fold density averaging were performed with the program A at 3.5 Å resolution [44]. The resulting map was used for model building. The tracing of the peptide backbone was done using skeletonized density and a Cα trace was constructed with the 'baton' option in O, placing Cα atoms into the density at a separation of 3.8 Å. The positions of the main chain atoms were generated from the Cα coordinates using a database of refined structures [45]. The side chains were added in their preferred rotamer conformations and the best fitting rotamer was selected by visual inspection of the map [46]. Two thirds of the sequence could easily be fitted to the density. This partial model was subjected to one round of energy minimization and simulated annealing molecular dynamics refinement in X-PLOR, using a 'slow cool' protocol [47].

The orthorhombic crystal form was then solved by molecular replacement, using the crude monoclinic structure as a search model. All diffraction data with F > 2σ, in the resolution range 8.0-3.5 Å, were used in the rotation and translation functions in X-PLOR [48]. The rotation function gave a unique solution with a 25σ peak height. The translation function also returned a single solution, with an 18σ peak height. Using this correctly placed model for phasing, it was then possible to locate the heavy-atom sites using difference Fourier maps. These sites were refined as described above and a new MIR map was calculated. The MIR maps of the two crystal forms were not averaged but were displayed, superimposed in O. Together with phase combined maps, this allowed us to interpret the rest of the structure.
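The data selection used for the molecular replacement search (F > 2σ, 8.0-3.5 Å) can be sketched as follows. The Python example computes the d-spacing of each reflection from the orthorhombic cell constants given above, using 1/d² = h²/a² + k²/b² + l²/c², and applies both cutoffs. It is an illustration only and does not reproduce X-PLOR input; the function names and the reflection list are invented.

```python
import math

CELL = (62.1, 46.7, 92.1)  # orthorhombic cell constants a, b, c in Å


def d_spacing_orthorhombic(h, k, l, a, b, c):
    """Return the d-spacing in Å of reflection (h, k, l) for an orthorhombic cell."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inv_d_squared)


def select_reflections(reflections, cell, d_max=8.0, d_min=3.5, sigma_cut=2.0):
    """Keep reflections with d_min <= d <= d_max and F > sigma_cut * sigma(F).

    reflections is a list of ((h, k, l), F, sigma) tuples; the (0, 0, 0)
    term is not expected in the list.
    """
    a, b, c = cell
    kept = []
    for (h, k, l), amplitude, sigma in reflections:
        d = d_spacing_orthorhombic(h, k, l, a, b, c)
        if d_min <= d <= d_max and amplitude > sigma_cut * sigma:
            kept.append((h, k, l))
    return kept


# Hypothetical reflections chosen to exercise each rejection criterion.
reflections = [
    ((1, 0, 0), 500.0, 10.0),   # d = 62.1 Å, resolution too low
    ((10, 0, 0), 120.0, 8.0),   # d = 6.21 Å, kept
    ((0, 10, 0), 30.0, 20.0),   # d = 4.67 Å, but F is below 2 sigma
    ((0, 0, 30), 90.0, 5.0),    # d = 3.07 Å, beyond the 3.5 Å limit
]
print(select_reflections(reflections, CELL))
```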

The structure was refined with X-PLOR by simulated annealing and energy minimization using force-field parameters derived from the Cambridge Structural Data Base [49]. Individual restrained temperature factor refinement was performed on the orthorhombic data and the low pH monoclinic data set. For the high pH monoclinic data set, main chain and side chain atoms were grouped for each residue during temperature factor refinement, starting with the initial values from the low pH model. The protein structure was primarily refined with the orthorhombic data to 1.8 Å resolution. This model was then used as the starting model for refinement of the two monoclinic data sets. Waters were added in steps during the refinement. The real space fit of the model to calculated 2Fo - Fc maps was used to find incorrectly built regions of the model [46]. The model was also closely inspected at positions with unusual main chain dihedral angles, peptide flips or torsion angles. A sequencing error was found at residue 276 which had initially been determined as a glycine. The electron density indicated an alanine that was later confirmed to be correct by partial resequencing. At a late stage of the refinement, the 1.55 Å data set was collected for the orthorhombic crystal form and used to further improve the model. The correct conformations for prolines could be identified and the first residue could be placed correctly into density.

Heavy-atom sites in the monoclinic crystal form

Since the uranyl and lead derivatives shared most sites, they will be described together. There was one outstanding site, with more than twice the occupancy of any other site. It was located at the interface between the two molecules in the asymmetric unit, bound to Asp223 of molecule A and Glu188 and Asp223 of molecule B. Most other sites were bound to single aspartic acid side chains. One of them was bound to Asp145 on molecule B, while molecule A had no equivalent site. This is another indication that helix α5 has indeed been perturbed in this molecule, making the aspartic acid accessible for heavy-atom binding. One uranyl site was located between two proline residues from different lipase molecules, and another site was found at the carboxyl terminus of molecule A. The strongest platinum sites in the monoclinic form were located near the sulphur of Met298, one for each molecule in the asymmetric unit. This is also the only methionine found on the surface of the enzyme.

Heavy-atom sites in the orthorhombic crystal form

The main uranyl site found in the monoclinic form was stabilized by the special packing of the two molecules in the asymmetric unit. This packing is not present in the orthorhombic crystal form. The highest occupancy site, however, was still a uranyl acetate molecule bound to Asp223 and Glu188. Other sites were all located near aspartic acid side chains. One platinum site was found at the cysteine bridge Cys293-Cys311 and another was located between two lysine residues. The only methionine accessible from the surface was also modified. The active site histidine His224, as well as Lys136, both bound the platinum compound. There was one outstanding mercurial site, with an occupancy four times higher than the next site, and the only site used in MIR phasing. It was bound to Tyr91, with the difference Fourier peak at a distance of 2.3 Å from the Cε2 atom. A similar site but with lower occupancy was found at Tyr234.

Fig. 13. Ramachandran plot of the current orthorhombic model. Two residues with unusual conformations are evident. Asn51 is located in a kinked helix, adding an extra residue to one of the turns. Ser105 has the typical conformation for the active site nucleophile found in lipases and α/β-hydrolases.

Quality of the model

The current model in the orthorhombic crystal form has been refined using data between 7.5 Å and 1.55 Å to an R-factor of 15.6 % for reflections with amplitudes above 2σ and 15.8 % for all measured reflections. The R-factor for all reflections in the resolution shell 1.58-1.55 Å is 21.9 %. All 317 residues can be seen in the map, although density for some atoms is lacking, particularly for a number of exposed side chains. A total of 2324 non-hydrogen protein atoms, 286 water molecules and two carbohydrate molecules have been included in the model; in total there are 2638 non-hydrogen atoms.


Fig. 14. Real-space fit and main chain temperature factor diagrams for the monoclinic models for both molecules in the asymmetric unit as a function of residue number. The scales on the left show the real-space correlation coefficient and the scales on the right show B-factor values in Å². (a) Molecule A at pH 3.6. (b) Molecule B at pH 3.6. (c) Molecule A at pH 5.5. (d) Molecule B at pH 5.5. Part of the proposed lid region, helix α5, lacks continuous density in the structure of molecule B at high pH. In this molecule, density for a lipid molecule in the active site can be seen for the low pH structure. This density disappears in the structure determined at pH 5.5.


In the monoclinic crystal form, two models have been refined against the data sets collected at pH 3.6 and 5.5, respectively. The model representing the low pH structure has been refined to 2.1 Å, with an R-factor of 19.0 % for all reflections and consists of two protein chains with the complete sequence, 470 waters, two carbohydrates and two detergent molecules, with a total of 5186 non-hydrogen atoms. The high pH model has been refined to 2.5 Å, to an R-factor of 20.1 % for all reflections. One detergent molecule, two carbohydrates and 159 waters have been included in this model.
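The atom totals quoted for the orthorhombic and low pH monoclinic models can be checked against each other with a few lines of arithmetic. The Python sketch below assumes that each water contributes one non-hydrogen atom and that the two carbohydrate molecules have the same size in both models; neither assumption is stated in the text, and the helper name is hypothetical.

```python
# Counts quoted in the text for the orthorhombic model.
ORTHO_PROTEIN = 2324
ORTHO_WATERS = 286
ORTHO_TOTAL = 2638

# Counts quoted in the text for the low pH monoclinic model (two chains).
MONO_PROTEIN = 2 * ORTHO_PROTEIN
MONO_WATERS = 470
MONO_TOTAL = 5186


def atoms_per_ligand(total, protein, waters, other, n_ligands):
    """Divide the atoms left after protein, waters and other groups among ligands."""
    remainder = total - protein - waters - other
    if remainder % n_ligands:
        raise ValueError("remaining atoms do not divide evenly among the ligands")
    return remainder // n_ligands


# Two carbohydrate molecules account for the remainder of the orthorhombic model.
carbohydrate_atoms = atoms_per_ligand(ORTHO_TOTAL, ORTHO_PROTEIN, ORTHO_WATERS, 0, 2)

# Two detergent molecules account for the remainder of the monoclinic model,
# assuming the carbohydrates are the same size as in the orthorhombic model.
detergent_atoms = atoms_per_ligand(
    MONO_TOTAL, MONO_PROTEIN, MONO_WATERS, 2 * carbohydrate_atoms, 2
)

print(carbohydrate_atoms, detergent_atoms)
```

Under these assumptions each detergent molecule accounts for 20 non-hydrogen atoms, which is the number expected for octyl glucoside (C14H28O6: 14 carbons and 6 oxygens), so the quoted totals are internally consistent.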

No non-glycine residues fall in the disallowed region of the Ramachandran plot (Fig. 13), but two fall in the generously allowed region as defined by PROCHECK [50]. One is Ser105 in the catalytic triad, which is part of a tight turn, found in all lipases and α/β-hydrolases at this position. The other is Asn51, located in a kink of a helix. There are two residues, Ser195 and Val306, that have high (> 2.5 Å) peptide flip values [46]. Serine 195 is in the middle of a nine residue long surface loop, where the ends of the loop are hydrogen bonded by main chain atoms. Valine 306 forms a β-bulge at the beginning of the carboxy-terminal β-hairpin.

The two molecules in the asymmetric unit of the monoclinic crystals are related by an almost perfect two-fold rotation. For the low pH structure the rotation is 179.56°, with a translation along the axis of 0.15 Å. The direction cosines for the rotation axis are (0.798, 0.587, 0.135). For the high pH structure, the rotation is 179.41°, the translation 0.30 Å and the direction cosines (0.800, 0.585, 0.130). These calculations were made with the program COORD2 (J Deisenhofer, unpublished program).
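The relation between a rotation axis given as direction cosines, the rotation angle and the translation along the axis can be illustrated with a short Python sketch. COORD2 is unpublished, so this is not a reconstruction of that program; it only applies Rodrigues' formula to build a rotation matrix from the low pH values quoted above and then recovers the angle from the trace of the matrix. All function names are hypothetical.

```python
import math


def normalize(vector):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def rotation_matrix(axis, angle_deg):
    """Rodrigues' formula: R = cos(t) I + sin(t) [n]x + (1 - cos(t)) n n^T."""
    x, y, z = normalize(axis)
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    v = 1.0 - c
    return [
        [c + x * x * v, x * y * v - z * s, x * z * v + y * s],
        [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
        [z * x * v - y * s, z * y * v + x * s, c + z * z * v],
    ]


def rotation_angle(matrix):
    """Rotation angle in degrees from the trace, trace = 1 + 2 cos(t)."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    cos_t = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_t))


def screw_translation(translation, axis):
    """Component of a translation vector along the rotation axis."""
    unit = normalize(axis)
    return sum(t * a for t, a in zip(translation, unit))


AXIS_LOW_PH = (0.798, 0.587, 0.135)  # direction cosines quoted in the text
R = rotation_matrix(AXIS_LOW_PH, 179.56)

print(round(rotation_angle(R), 2))
print(round(math.sqrt(sum(x * x for x in AXIS_LOW_PH)), 4))  # should be close to 1
```

Direction cosines should have unit length, and the quoted values do to within rounding, which is why the sketch normalizes the axis before use.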

The rmsd after superposition of the two molecules in the asymmetric unit of the low pH monoclinic model is 0.18 Å for Cαs and 0.32 Å for all non-hydrogen atoms. The rmsd after superposition of the orthorhombic model and molecule A in the low pH monoclinic model is 0.24 Å for Cαs. The rmsds between models of the high and low pH forms are 0.12 and 0.15 Å for the Cαs of molecules A and B, respectively.

The temperature factor and real space fit diagrams reveal that most of the structure is well ordered and shows low mobility (Figs 10 and 14). The solvent exposed loop region from 242 to 268 has a higher average main chain temperature factor than the rest of the structure, indicating higher mobility. Many of the residues that lack side chain density are located in this region. Also the following helix, α10, has a high average temperature factor. The possible movement of this helix is of great interest since it makes up a large portion of the active site pocket. In the high pH monoclinic crystal form, the high mobility of the α5 region is clearly manifested by a low real space fit and high temperature factors for molecule B. The refinement statistics for the models are summarized in Table 3.

The coordinates for the three models have been deposited at the Protein Data Bank. The DNA sequence has been deposited at the EMBL sequence data base with accession number z230645.

Acknowledgments: We wish to thank Dr Fritz Winkler for providing us with the coordinates for the closed form of human pancreatic lipase, Dr Christian Cambillau for the coordinates of pancreatic lipase-procolipase complex and Dr David Lawson for the coordinates of the phosphonate inhibited form of Rhizomucor miehei lipase. This investigation was carried out with financial support from Nordisk Industrifond and the Swedish Natural Science Research Council. The expert technical assistance by Ms Inge Hoegh with the cloning and DNA sequencing is gratefully acknowledged. We wish to thank Dr Morten Kjeldgaard for his valiant efforts in tracing a nasty bug in qoplot, which allowed us to make the coloured figures. The constructive comments of Dr C Cambillau are gratefully acknowledged.

References

1. Rogalska, E., Cudrey, C., Ferrato, F. & Verger, R. (1993). Stereoselective hydrolysis of triglycerides by animal and microbial lipases. Chirality 5, 24-30.

2. Brady, L., et al., & Brzozowski, A.M. (1990). A serine protease triad forms the catalytic centre of a triglyceride lipase. Nature 343, 767-770.

3. Winkler, F.K., D'Arcy, A. & Hunziker, W. (1990). Structure of human pancreatic lipase. Nature 343, 771-774.

4. Schrag, J.D., Li, Y., Wu, S. & Cygler, M. (1991). Ser-His-Glu forms the catalytic site of a lipase from Geotrichum candidum. Nature 351, 761-764.

5. Grochulski, P., et al., & Li, Y. (1993). Insights into interfacial activation from an open structure of Candida rugosa lipase. J. Biol. Chem. 268, 12843-12847.

6. Noble, M.E.M., Cleasby, A., Johnson, L.N., Egmond, M.R. & Frenken, L.G.J. (1993). The crystal structure of triacylglycerol lipase from Pseudomonas glumae reveals a partially redundant catalytic aspartate. FEBS Lett. 331, 123-128.

7. Wright, C.S., Alden, R.A. & Kraut, J. (1969). Structure of subtilisin BPN' at 2.5 Å resolution. Nature 221, 235-242.

8. Blow, D.M., Birktoft, J.J. & Hartley, B.S. (1969). Role of buried acid group in the mechanism of action of chymotrypsin. Nature 221, 337-340.

9. Boel, E., Huge-Jensen, B., Christens, M., Thim, L. & Fiil, N.P. (1988). Rhizomucor miehei triglyceride lipase is synthesized as a precursor. Lipids 23, 701-706.

10. Brzozowski, A.M., et al., & Derewenda, U. (1991). A model for interfacial activation in lipases from the structure of a fungal lipase-inhibitor complex. Nature 351, 491-494.

11. van Tilbeurgh, H., Egloff, M.-P., Martinez, C., Rugani, N., Verger, R. & Cambillau, C. (1993). Interfacial activation of the lipase-procolipase complex by mixed micelles revealed by X-ray crystallography. Nature 362, 814-820.

12. Ollis, D.L., et al., & Cheah, E. (1992). The α/β-hydrolase fold. Protein Eng. 5, 197-211.

13. Martinez, C., DeGeus, P., Lauwereys, M., Matthyssens, G. & Cambillau, C. (1992). Fusarium solani cutinase is a lipolytic enzyme with a catalytic serine accessible to solvent. Nature 356, 615-618.

14. Hjorth, A., et al., & Carrière, F. (1993). A structural domain (the lid) found in pancreatic lipases is absent in the guinea pig (phospho)lipase. Biochemistry 32, 4702-4707.

15. Michiyo, M. (1989). Purification of a thermostable, nonspecific lipase from Candida and its use in transesterification. WO Patent 8802775, 1986. Chem. Abstr. 110, 20529.

16. Heldt-Hansen, H.P., Ishii, M., Patkar, S.A., Hansen, T.T. & Eigtved, P. (1989). Biocatalysis in agricultural biotechnology. In ACS Symposium Series 389. (Whitaker, J.R. & Sonnet, P.E., eds), pp. 157-172.

17. Frykman, H., Öhrner, N., Norin, T. & Hult, K. (1993). S-ethyl thiooctanoate as acyl donor in lipase catalysed resolution of secondary alcohols. Tetrahedron Lett. 34, 1367-1370.

18. Mattson, A., Öhrner, N., Hult, K. & Norin, T. (1993). Resolution of diols with C2-symmetry by lipase catalysed transesterification. Tetrahedron Asymm. 4, 925-930.

19. Partali, V., Waagen, V., Alvik, T. & Anthonsen, T. (1993). Enzymatic resolution of butanoic esters of 1-phenylmethyl and 1-[2-phenylethyl] ethers of 3-chloro-1,2-propanediol. Tetrahedron Asymm. 4, 961-968.

20. Adelhorst, K., Bjorkling, F., Godtfredsen, S. & Kirk, O. (1990). Enzyme catalyzed preparation of 6-O-acylglucopyranosides. Synthesis 2, 112-115.

21. Uppenberg, J., Patkar, S.A., Bergfors, T. & Jones, T.A. (1994). Crystallization and preliminary X-ray studies of lipase B from Candida antarctica. J. Mol. Biol. 235, 790-792.

22. Richardson, J.S. (1981). The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167-399.

23. Cygler, M., Schrag, J.D. & Ergan, F. (1992). Advances in structural understanding of lipases. Biotech. Genet. Eng. Rev. 10, 143-184.

24. Franken, S.M., Rozeboom, H.J., Kalk, K.H. & Dijkstra, B. (1991). Crystal structure of haloalkane dehalogenase; an enzyme to detoxify halogenated alkanes. EMBO J. 10, 1297-1302.

25. Sussman, J.L., et al., & Harel, M. (1991). Atomic structure of acetylcholinesterase from Torpedo californica: a prototypic acetylcholine-binding protein. Science 253, 872-879.

Table 3. Summary of refinement.

                              Orthorhombic    Monoclinic crystal form
                              crystal form    pH 3.6      pH 5.5

Resolution of data (Å)        7.5-1.55        7.5-2.1     7.5-2.5
R-factorᵇ (%)                 15.8            19.0        20.1
Non-hydrogen protein atoms    2324            4648        4648
Water molecules               286             470         159

Deviations from idealityᵃ
Bond lengths (Å)              0.007           0.006       0.006
Bond angles (°)               1.1             0.9         1.3
Dihedrals (°)                 24.3            24.1        24.1
Impropers (°)                 0.9             0.8         1.2

Average B-factors (Å²)
Main chain atoms              8.7             19.2        23.5
All protein atoms             9.7             19.9        24.7
Water                         34.1            46.2        43.2

ᵃValues from X-PLOR. Parameters from the Cambridge data base of small molecule structures [49] were used for the bond lengths and bond angles. ᵇ$R\text{-factor} = \sum_h |F_{obs} - F_{calc}| / \sum_h F_{obs}$, where $F_{obs}$ and $F_{calc}$ are the amplitudes of the observed and the calculated structure factors.
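The R-factor defined in the footnote to Table 3 can be written out directly. The Python sketch below assumes that the calculated amplitudes are already on the same scale as the observed ones, a step that refinement programs perform before reporting R; the function name and the amplitudes are invented for illustration.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor, sum |Fobs - Fcalc| / sum Fobs.

    Both lists hold structure factor amplitudes for the same reflections,
    in the same order and on the same scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


# Hypothetical amplitudes for three reflections.
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 30.0]
print(f"R = {100 * r_factor(f_obs, f_calc):.1f} %")
```

Restricting the two lists to reflections in a single resolution shell gives shell values such as the 21.9 % quoted for 1.58-1.55 Å.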


26. Carter, P. & Wells, J.A. (1990). Functional interaction among catalytic residues in subtilisin BPN'. Proteins 7, 335-342.

27. Martinez, C., et al., & Nicolas, A. (1994). Cutinase, a lipolytic enzyme with a preformed oxyanion hole. Biochemistry 33, 83-89.

28. Liao, D.-I. & Remington, S.J. (1990). Structure of wheat serine carboxypeptidase II at 3.5 Å resolution. J. Biol. Chem. 256, 6528-6531.

29. Derewenda, U., et al., & Derewenda, Z.S. (1994). An unusual buried polar cluster in a family of fungal lipases. Nature Struct. Biol. 1, 36-47.

30. Yelton, M.M., Hamer, J.E. & Timberlake, W.E. (1984). Transformation of Aspergillus nidulans by using trpC plasmid. Proc. Natl. Acad. Sci. USA 81, 1470-1474.

31. Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989). Molecular Cloning. Cold Spring Harbor Press, New York.

32. Boel, E., Hjort, I., Svensson, B., Norris, F., Norris, K.E. & Fiil, N.P. (1984). Glucoamylases G1 and G2 from Aspergillus niger are synthesized from two different but closely related mRNAs. EMBO J. 3, 1097-1102.

33. Sanger, F., Nicklen, S. & Coulson, A. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

34. Patkar, S.A., et al., & Bjorkling, F. (1993). Purification of two lipases from Candida antarctica and their inhibition by various inhibitors. Ind. J. Chem. 32 B, 76-80.

35. McPherson, A. (1982). Crystallization. In Preparation and Analysis of Protein Crystals. pp. 96-97, J. Wiley & Sons Inc., New York.

36. Hamlin, R. (1985). Multiwire area X-ray diffractometers. In Methods in Enzymology. (Wyckoff, H.W., Hirs, C.H.W. & Timasheff, S.N., eds), Academic Press, London. pp. 416-452.

37. Messerschmidt, A. & Pflugrath, J.W. (1987). Crystal orientation and X-ray pattern prediction routines for area-detector diffractometer systems in macromolecular crystallography. J. Appl. Crystallogr. A 30, 306-315.

38. Kabsch, W. (1988). Evaluation of single-crystal X-ray diffraction data from a position-sensitive detector. J. Appl. Crystallogr. A 21, 916-924.

39. CCP4 (1979). The SERC (UK) collaborative computing project no. 4. A suite of programs for protein crystallography, distributed from Daresbury Laboratory, Warrington, WA4 4AD, UK.

40. Otwinowski, Z. (1988). DENZO. A Program for Automatic Evaluation of Film Densities. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT.

41. Otwinowski, Z. (1991). Isomorphous replacement and anomalous scattering. In Proceedings of the CCP4 Study Weekend. pp. 80-86, Daresbury Laboratory, Warrington, UK.

42. Zhang, K.Y.J. & Main, P. (1990). The use of Sayre's equation with solvent flattening and histogram matching for phase extension and refinement of protein structures. Acta Crystallogr. A 46, 377-381.

43. Jones, T.A. & Kjeldgaard, M. (1992). O - The Manual. Uppsala, Sweden.

44. Jones, T.A. (1992). a, yaap, asap, @#'? A set of averaging programs. In Molecular Replacement. Proceedings of the CCP4 Study Weekend. pp. 99-105, SERC, Daresbury Laboratory, Warrington, UK.

45. Jones, T.A. & Thirup, S. (1986). Using known substructures in protein model building and crystallography. EMBO J. 5, 819-822.

46. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47, 110-119.

47. Brünger, A.T. & Krukowski, A. (1990). Slow-cooling protocols for crystallographic refinement by simulated annealing. Acta Crystallogr. A 46, 585-593.

48. Brünger, A.T. (1990). Extension of molecular replacement: a new search strategy based on Patterson correlation refinement. Acta Crystallogr. A 46, 46-57.

49. Engh, R.A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallogr. A 47, 392-400.

50. Morris, A.L., MacArthur, M.W., Hutchinson, E.G. & Thornton, J.M. (1992). Stereochemical quality of protein structure coordinates. Proteins 12, 345-364.

51. von Heijne, G. (1986). A new method for predicting signal sequence cleavage sites. Nucleic Acid Res. 14, 4683-4690.

52. Julius, D., Brake, A., Blair, L., Kunisawa, R. & Thorner, J. (1984). Isolation of the putative structural gene for the lysine-arginine-cleaving endopeptidase required for processing of yeast prepro-α-factor. Cell 37, 1075-1089.

53. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577-2637.

54. Kleywegt, G.J. & Jones, T.A. (1994). Detection, delineation, measurement and display of cavities in macromolecular structures. Acta Crystallogr. D 50, in press.

Received: 9 Feb 1994; revisions requested: 18 Feb 1994;revisions received: 7 Mar 1994. Accepted: 7 March 1994.