Crystallographic analysis of ribulose 1,5-bisphosphate carboxylase from spinach at 2·4 Å...


J. Mol. Biol. (1990) 215, 113-160

Crystallographic Analysis of Ribulose 1,5-Bisphosphate Carboxylase from Spinach at 2.4 Å Resolution

Subunit Interactions and Active Site

Stefan Knight, Inger Andersson and Carl-Ivar Brändén

Swedish University of Agricultural Sciences Uppsala Biomedical Centre

Department of Molecular Biology P.O. Box 590

S-751 24 Uppsala, Sweden

(Received 1 February 1990; accepted 27 April 1990)

The X-ray structure of the quaternary complex of ribulose 1,5-bisphosphate carboxylase/oxygenase from spinach with CO2, Mg2+ and a reaction-intermediate analogue (CABP) has been determined and refined at 2.4 Å resolution. Cyclic non-crystallographic symmetry averaging around the molecular 4-fold axis and phase combination were used to improve the initial multiple isomorphous replacement phases. A model composed of one large subunit and one small subunit was built in the resulting electron density map, which was of excellent quality. Application of the local symmetry gave an initial model of the L8S8 molecule with a crystallographic R-value of 0.43. Refinement of this initial model was performed by a combination of conventional least-squares energy refinement and molecular dynamics simulation using the XPLOR program. Three rounds of refinement, interspersed with manual rebuilding at the graphics display, resulted in a model containing all of the 123 amino acid residues in the small subunit, and 467 of the 475 residues in the large subunit. The R-value for this model is 0.24, with relatively small deviations from ideal stereochemistry. Subunit interactions in the L8S8 molecule have been analysed and are described. The interface areas between the subunits are extensive, and bury almost half of the accessible surface areas of both the large and the small subunit. A number of conserved interaction areas that may be of functional significance have been identified and are described. The binding of 2-carboxy-arabinitol 1,5-bisphosphate to the active site is described, and biochemical and mutagenesis data are discussed in the structural framework of the model.

1. Introduction

The bifunctional enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco†; EC 4.1.1.39) catalyses the initial steps of two opposing metabolic pathways in photosynthetic organisms:

† Abbreviations used: Rubisco, ribulose 1,5-bisphosphate carboxylase/oxygenase; RuBP, ribulose 1,5-bisphosphate; PGA, 3-phosphoglycerate; CABP, 2-carboxy-arabinitol 1,5-bisphosphate; 3-keto-CABP, 2-carboxy-3-keto-arabinitol 1,5-bisphosphate; EMTS, ethylmercurithiosalicylic acid; r.m.s., root-mean-square; m.i.r., multiple isomorphous replacement; e.p.r., electron paramagnetic resonance; COBR, quaternary complex Rubisco-CO2-Co2+-CABP; Cab, carbamylated lysine.


photosynthetic carbon fixation and photorespiration (for reviews, see Andrews & Lorimer, 1987; Miziorko & Lorimer, 1983). Carboxylation of the five-carbon sugar substrate common to both reactions, RuBP, yields two molar equivalents of PGA (Quayle et al., 1954; Weissbach et al., 1954). The oxygenase activity, on the other hand, yields one molar equivalent each of PGA and 2-phosphoglycolate. The latter molecule is the major substrate for photorespiration (Andrews et al., 1973; Bowes et al., 1971; Lorimer et al., 1973). Both these reactions take place at a common site on the enzyme. The subsequent metabolism of phosphoglycolate in the glycolate pathway consumes energy and reducing equivalents to redirect three-quarters of the carbon diverted from photosynthesis back into the Calvin

0022-2836/90/170113-48 $03.00/0 © 1990 Academic Press Limited


cycle. The remaining carbon is oxidized to carbon dioxide and the released energy dissipated as heat. Up to 50% of the photosynthetically reduced carbon may be oxidized through this pathway, thus severely limiting crop yield. The possibility of increasing the carboxylase/oxygenase ratio has therefore attracted substantial interest. An understanding of the structural basis for the two activities would greatly facilitate the successful use of site-directed mutagenesis techniques towards this end.

Rubisco from all higher plants, as well as from blue-green algae, is built up from two types of subunit: eight large (L, 55,000 Mr) and eight small (S, 15,000 Mr), forming an L8S8 molecule of relative molecular mass around 550,000. In contrast, the enzyme from the photosynthetic bacterium Rhodospirillum rubrum is a homodimer of L subunits. The large subunit is responsible for the catalytic activity, details of which seem to be modulated by the small subunit. The small subunit influences the catalytic activity of the enzyme, increasing the kcat for the carboxylase reaction more than 100-fold (Andrews, 1988).

There is extensive sequence homology (around 80%) among large subunits from the hexadecameric enzymes. In contrast, the homology between the large subunit of L8S8 Rubisco and the single subunit of the R. rubrum enzyme is only around 25%. However, as has been shown by crystallographic studies of Rubisco from spinach (Andersson et al., 1989), tobacco (Chapman et al., 1987, 1988) and R. rubrum (Schneider et al., 1986b), the folds of the polypeptide L-chains are highly similar in spite of this low homology. Several of the highly conserved active site loop regions have been identified as active site peptides by chemical labelling (Hartman et al., 1984; Igarashi et al., 1985; Lorimer, 1981) and site-directed mutagenesis experiments (Hartman et al., 1987a; Lorimer et al., 1987).

Common to all Rubisco molecules is an activation process, during which a lysine residue (K201 in the spinach enzyme) becomes carbamylated by an activator CO2 molecule, which is distinct from the substrate CO2 (Lorimer, 1981; Lorimer & Miziorko, 1980). The carbamate is further stabilized by a magnesium ion (Pierce & Reddy, 1986). This activation step is necessary for both the carboxylation and the oxygenation reactions. The activated ternary complex is able to bind the substrate, RuBP, in such a manner that it may subsequently be attacked by either carbon dioxide or oxygen. CO2 thus has a dual role in the reaction mechanism: as effector molecule and as substrate, using two different binding sites for the two reactions.

The complex reaction mechanism for the carboxylation reaction is understood in some detail (Andrews & Lorimer, 1987). The reaction is initiated by abstraction of the C-3 proton from the five-carbon substrate to give a 2,3-enediol intermediate (Fig. 1). Stereospecific carboxylation of C-2 of the enediol creates the six-carbon, β-keto-acid intermediate 3-keto-CABP, which is rapidly hydrated to

Figure 1. Five steps in the reaction mechanism of the carboxylation of RuBP catalysed by Rubisco: enolization, carboxylation, hydration, carbon-carbon cleavage and protonation. An analogue (CABP) to the gem-diol form of the 6-carbon reaction intermediate is shown in the inset.

give the gem-diol form. This is the predominant form of the six-carbon intermediate on the enzyme (Pierce et al., 1986). Deprotonation of one of the hydroxyl groups on C-3 of this intermediate initiates carbon-carbon cleavage, which is followed by stereospecific protonation of the aci-acid of PGA to complete product formation. On the basis of stereochemical constraints, at least two different bases are required at the active site (Andrews & Lorimer, 1987); one to initiate the reaction by deprotonation of C-3 of the substrate and a second base to donate a proton to the aci-acid.

Compared to the carboxylase reaction, the oxygenase reaction is less well understood. The reaction is believed to proceed through attack of O2 on the 2,3-enediol of RuBP leading to the formation of an intermediate hydroperoxide at C-2 of the substrate (Brändén et al., 1984b; Lorimer et al., 1973), which is then cleaved to yield one molecule of PGA and one molecule of phosphoglycolate. Both the carboxylation and the oxygenation reactions thus involve nucleophilic attack of the 2,3-enediol of RuBP on one of the gaseous substrates. After this step, the reactions are practically irreversible and the enzyme committed to either carboxylation or oxygenation (Pierce et al., 1986).

The possibility that a Rubisco molecule with a higher carboxylation/oxygenation ratio than is found in nature might be constructed, given detailed knowledge of the reaction mechanisms of carboxylation and oxygenation in a structural framework, has prompted a number of crystallographic studies of Rubisco from various sources (Andersson & Brändén, 1984; Andersson et al., 1989; Baker et al., 1975, 1977a,b; Barcena et al., 1983; Chapman et al.,


1987, 1988; Choe et al., 1985; Holzenburg et al., 1987; Janson et al., 1984; Knight et al., 1989; Lundqvist & Schneider, 1988, 1989; Nakagawa et al., 1986; Pal et al., 1985; Schneider et al., 1986a,b, 1990). The first Rubisco structure to be reported was that of the non-activated enzyme from R. rubrum (Schneider et al., 1986b). Subsequently, preliminary descriptions of the structures of the non-activated tobacco enzyme (Chapman et al., 1987, 1988) and the activated spinach enzyme with bound CABP (Andersson et al., 1989; Knight et al., 1989) have been published. These different structures will form the basis for comparative studies that should lead to a better understanding of the catalytic mechanisms of carboxylation and oxygenation and possibly also shed some light on the factors that determine the partitioning between the two competing reactions catalysed by Rubisco.

We have obtained several crystal forms (Andersson & Brändén, 1984; Andersson et al., 1983) of an activated quaternary complex of spinach Rubisco, CO2, Mg2+ and a reaction-intermediate analogue, CABP, where one of the hydroxyl groups at C-3 of the six-carbon reaction intermediate has been reduced (Pierce et al., 1980). The structure of spinach Rubisco presented herein has been determined using the form D crystals (Andersson & Brändén, 1984). We have previously given preliminary descriptions of the active site of spinach Rubisco in this crystal form (Andersson et al., 1989) as well as of the fold of the small subunit (Knight et al., 1989). Here, we describe the refined structure of spinach Rubisco at 2.4 Å resolution (1 Å = 0.1 nm). Subunit interactions in the L8S8 molecule are described and discussed in terms of amino acid sequence variations among different species. We also give a more detailed description of the active site and discuss some of the existing biochemical and mutagenesis data in the framework of our model.

Figure 2. Diagrams illustrating the degree of tetragonality and F-centring of the diffraction data from the spinach Rubisco crystals used in this study. (a) $R_{hkl} = \sum \big| |F_{hkl}| - |F_{khl}| \big| / \sum 0.5\,(|F_{hkl}| + |F_{khl}|)$ as a function of resolution. (b) Mean F as a function of resolution for reflections where h + k = 2n (□) and reflections where h + k = 2n + 1 (◆).

2. Experimental Procedures and Results

(a) Protein purification and crystallization

Spinach Rubisco was purified as described (Andersson et al., 1983). Large single crystals of an activated quaternary complex of spinach Rubisco, Mg2+, CO2 and CABP were obtained from ammonium sulphate solutions at pH 7.3 (Andersson & Brändén, 1984). The crystals belong to space group C2221 with cell dimensions a = 157.2 Å, b = 157.2 Å, c = 201.3 Å and diffract to approx. 1.7 Å resolution. Half of the L8S8 molecule is present in the asymmetric unit, giving a packing density Vm (Matthews, 1968) of 2.26 Å³/dalton.
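As a check on the packing density quoted above, the Matthews coefficient can be recomputed from the cell dimensions. The sketch below is illustrative only and is not part of the original analysis; it assumes the approximate relative molecular mass of 550,000 given in the Introduction and 8 asymmetric units per cell (4 point-group operations of 222 doubled by the C-centring). All variable names are hypothetical.

```python
# Cell dimensions in Å, as quoted in the text (orthorhombic cell).
a, b, c = 157.2, 157.2, 201.3

# Assumption: 8 asymmetric units per cell for a C-centred cell with point group 222.
asymmetric_units_per_cell = 8

# Assumption: approximate relative molecular mass of the L8S8 molecule (Introduction).
mass_l8s8 = 550_000
mass_per_asymmetric_unit = mass_l8s8 / 2   # half a molecule per asymmetric unit

cell_volume = a * b * c                    # Å^3
vm = cell_volume / (asymmetric_units_per_cell * mass_per_asymmetric_unit)

print(f"Vm = {vm:.2f} Å^3/dalton")
```

With these inputs the value rounds to the 2.26 Å³/dalton given in the text.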

The presence of a local 4-fold axis parallel to the c axis is clearly seen in the diffraction data (Fig. 2(a)) and is reflected in the similar lengths of the a and b axes. Rotation function calculations confirm the presence of the local 4-fold axis (Andersson & Brändén, 1984). In addition, the diffraction pattern shows pseudo F-centring to low resolution (Fig. 2(b)).

A substantial proportion of the crystals exhibit merohedral twinning, in which the a axis and the b axis in different regions of the crystals are overlapping. Before data collection, all crystals were routinely screened for the absence of twinning by determining the degree of tetragonality (Andersson & Brändén, 1984).
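The tetragonality measure of Fig. 2(a) compares the amplitudes of reflections hkl and khl, which would be equal for an exactly tetragonal lattice. A minimal sketch of such a calculation is given below. The function name, the dictionary layout and the demonstration amplitudes are assumptions made for illustration; they are not taken from the screening procedure actually used.

```python
def tetragonality_r(amplitudes):
    """R = sum ||F_hkl| - |F_khl|| / sum 0.5 (|F_hkl| + |F_khl|).

    `amplitudes` maps Miller indices (h, k, l) to structure factor amplitudes.
    Each (hkl, khl) pair is counted once; reflections with h == k, or whose
    partner was not measured, are skipped.
    """
    numerator = denominator = 0.0
    for (h, k, l), f_hkl in amplitudes.items():
        if h >= k:
            continue                      # visit each pair from its h < k member only
        f_khl = amplitudes.get((k, h, l))
        if f_khl is None:
            continue
        numerator += abs(f_hkl - f_khl)
        denominator += 0.5 * (f_hkl + f_khl)
    return numerator / denominator if denominator else float("nan")


# Invented amplitudes for illustration only (not data from this study).
demo = {(1, 2, 0): 100.0, (2, 1, 0): 90.0, (1, 3, 1): 50.0, (3, 1, 1): 50.0}
print(tetragonality_r(demo))
```

Because twinning of this kind mixes the hkl and khl intensities, a twinned crystal would be expected to give a lower value than an untwinned one.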

(b) Heavy-atom derivatives

Heavy-atom derivatives were prepared by soaking native crystals in the mother liquor with appropriate heavy-atom compounds added. Soaked crystals were mounted in glass capillaries and a small set of scaling reflections measured. Soaked crystals giving significant intensity differences with cell parameters deviating less than 1% from the native crystals were used to collect low-resolution diffractometer data sets before collection of any high-resolution data. Out of 15 compounds tested, 3 gave useful derivatives: K2Hg(CN)4, ethylmercurithiosalicylic acid (EMTS) and KAu(CN)2 (see Table 1 for soaking conditions). In addition, crystals where the active site magnesium ions were substituted by cobalt ions were prepared by using 5 mM-CoCl2 instead of MgCl2 in the crystallization medium.


Table 1
Preparation of heavy-atom derivatives

Derivative    Concentration (mM)    Soak time (days)    Number of sites    R† (%)    Resolution (Å)
K2Hg(CN)4     0.1                   7                   28                 17        2.6
EMTS          1.0                   7                   28                 18        2.8
KAu(CN)2      5.0                   5                   12                 10        2.6

† $R = \sum |F_{ph} - F_{p}| / \sum F_{p}$, where $F_{ph}$ and $F_{p}$ are the structure factor amplitudes for the derivative and native crystals, respectively.

(c) Data collection and data reduction

In the initial stage of the structure determination, low-resolution data to 7 Å, comprising 4109 unique reflections, were collected on a PHILIPS-STOE 4-circle diffractometer at 4°C using CuKα radiation from a sealed X-ray tube. To avoid overlap of reflections due to the long c axis, we used a crystal-to-detector distance of 400 mm. The focal-to-crystal distance was 300 mm. Absorption correction was done according to North et al. (1968). Three standard reflections were measured at regular intervals to monitor and correct for crystal decay.

High-resolution data were collected on oscillation photographs using synchrotron X-ray sources. A full data set comprised 90 film packs of 1° oscillations around the c axis, with 3 films/pack. Still photographs were taken before and after data collection. During data collection, crystals were held at 4°C by blowing a stream of cold air onto them.

Due to the twinning problem, all crystals were mounted, aligned and tested for twinning at home before being transported to the synchrotron source. Crystals were mounted in glass capillaries with the c axis along the capillary axis. The capillaries were fixed to small metal plates with clay. These metal plates had 2 differently sized holes corresponding to 2 pins on a magnet at the top of a goniometer head. The crystals were aligned with the c axis parallel to the spindle axis on the diffractometer. For each crystal, a small set of scaling reflections, as well as the degree of tetragonality, was measured. By taking notes of the goniometer settings, the crystals could easily be mounted in the correct orientation on the oscillation camera at the synchrotron radiation source using the same goniometer head.

Initially, a native data set and a K2Hg(CN)4 derivative data set, both to 2.6 Å resolution, were collected at the EMBL outstation at DESY, Hamburg, FRG, using a wavelength of 1.49 Å. Subsequently, 1 native data set, 1 data set of the cobalt substituted enzyme, and 1 KAu(CN)2 derivative data set, all to 2.4 Å resolution, were collected using a wavelength of 0.87 Å from the wiggler beam line at the synchrotron radiation source (SRS), in Daresbury, U.K. A partial data set to 2.8 Å resolution of the EMTS derivative was also collected at the SRS using a wavelength of 1.69 Å. Data collection statistics are presented in Table 2. The use of synchrotron radiation of high brilliance and short wavelength increased the life of the crystals dramatically. For example, whereas 2 native crystals were needed for the 7 Å data collection, a full high-resolution data set could be collected from a single crystal using the wiggler beam line. The wiggler data are also of very high quality, presumably due to lack of absorption effects at the low wavelength used, as well as the possibility of collecting complete data sets from 1 crystal.

Films were digitized with an Optronics rotating-drum densitometer using a raster step-size of 50 μm and evaluated with the program OSC (Rossmann, 1979; Schmid et al., 1981). Only fully recorded reflections were

Table 2
X-ray data collection statistics

Data set      No. of measurements    No. of unique reflections    Rmerge    Rsym     Completeness† (%)
Native 1‡     165,262                57,239                       0.071     0.055    77.6 to 2.6 Å
Native 2‡     175,242                77,710                       0.049     0.039    82.8 to 2.4 Å
COBR          159,165                69,760                       0.046     0.031    74.3 to 2.4 Å
K2Hg(CN)4     94,518                 48,871                       0.051     0.053    66.2 to 2.6 Å
EMTS          58,040                 32,117                       0.082     0.061    54.4 to 2.8 Å
KAu(CN)2      194,571                74,977                       0.066     0.045    76.0 to 2.4 Å

$R_{merge} = \sum |I_{h,i} - \langle I_h \rangle| / \sum \langle I_h \rangle$, where $I_{h,i}$ is an individual measurement of reflection h, $\langle I_h \rangle$ is the mean intensity of that reflection, and the summation is over all measured reflections.

$R_{sym} = \sum |I_{h,i} - \langle I_h \rangle| / \sum \langle I_h \rangle$, where $I_{h,i}$ is an individual measurement of reflection h, $\langle I_h \rangle$ is the mean intensity of that reflection, and the summation is over symmetry-related reflections measured on the same film.

† Completeness is the ratio of the measured to the possible number of unique reflections.

‡ Native 1 and 2: data sets of native crystals measured at the synchrotron stations at DESY, Hamburg, F.R.G., and Daresbury, U.K., respectively.


used. Film-to-film scale factors were determined using the PROTEIN program package (Steigemann, 1974) and applied using a program written by A. Jones.

Different data sets of the same derivative were scaled with anisotropic scale factors in resolution shells using ANISO, a program written by one of us (C.I.B.) and modified by A. Jones. The scaled data sets were then merged. The 2 native film data sets scaled with an R-value of 0.077. Native diffractometer data and wiggler film data scaled with an R-value of 0.094. Merging of the 2 film data sets and the diffractometer data gave 83,083 unique reflections to 2.4 Å resolution (88.5% of the possible number of reflections) with an R-merge ($R = \sum |F_{h,i} - \langle F_h \rangle| / \sum \langle F_h \rangle$) of 0.087 for 52,519 merged reflections. Similarly, merging of diffractometer and film data gave a K2Hg(CN)4 derivative data set containing 50,416 reflections to 2.6 Å resolution and a KAu(CN)2 derivative data set containing 75,613 reflections to 2.4 Å resolution. The R-merge was 0.113 for the K2Hg(CN)4 derivative data set and 0.118 for the KAu(CN)2 derivative data set.
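The R-merge statistic quoted above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the code of the ANISO or PROTEIN programs: it assumes that both sums run over every individual measurement of reflections observed more than once, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def r_merge(observations):
    """R = sum |F_h,i - <F_h>| / sum <F_h>.

    `observations` is an iterable of ((h, k, l), amplitude) pairs.
    Assumption: both sums run over all individual measurements, and
    reflections measured only once do not contribute.
    """
    groups = defaultdict(list)
    for hkl, amplitude in observations:
        groups[hkl].append(amplitude)

    numerator = denominator = 0.0
    for values in groups.values():
        if len(values) < 2:
            continue
        mean = sum(values) / len(values)
        numerator += sum(abs(v - mean) for v in values)
        denominator += mean * len(values)
    return numerator / denominator if denominator else float("nan")


# Invented measurements for illustration only (not data from this study).
demo = [((1, 0, 0), 10.0), ((1, 0, 0), 12.0), ((0, 1, 0), 5.0)]
print(r_merge(demo))
```

In this toy input only the reflection measured twice contributes, which mirrors the way the text quotes R-merge for the merged reflections rather than for all unique reflections.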

Derivative data were scaled versus the native data using the ANISO program. Weak and very strong reflections were not used in the scaling. A small number of reflections where the absolute difference between native and derivative structure factor amplitudes was larger than twice the mean structure factor amplitude for all reflections were rejected.

(d) Local symmetry

The local symmetry is clearly reflected in the diffraction data, which show pseudo 4-fold symmetry extending into the high-resolution range (Fig. 2(a)) and pseudo F-centring to low resolution (Fig. 2(b)). The native Patterson map to 7 Å resolution shows clear F-centred peaks at 63% height of the C-centred peaks with no other peaks above 11% of the C-centred peaks (Andersson & Brändén, 1984). In the native high-resolution Patterson map, the F-centred peaks are offset from their exact positions by 2 Å in the a direction.

From packing considerations, together with the pseudo F-centring of the diffraction lattice, it was clear that there were 4 molecules in the unit cell. The molecules thus have to occupy special positions in space group C2221 with a crystallographic 2-fold axis as well as the local 4-fold axis passing through the molecular centre. Inspection of the crystallographic symmetry shows that there are 2 different ways of positioning 4 molecules in the unit cell so that F-centring is produced. In one packing arrangement, the molecular centre is at x = 1/4, y = 0, z = 0 and in the other it is at x = 0, y = 1/4, z = 1/4.
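The two candidate molecular centres can be checked against the space-group symmetry with a few lines of exact arithmetic. The sketch below is illustrative and not part of the original work; it uses the standard general positions of space group C2221 (x,y,z; -x,-y,z+1/2; -x,y,-z+1/2; x,-y,-z; plus the C-centring translation), and the function names are hypothetical.

```python
from fractions import Fraction as Fr

Z, H = Fr(0), Fr(1, 2)

# General positions of C222(1): (signs applied to x, y, z; translation).
POINT_OPS = [
    ((1, 1, 1), (Z, Z, Z)),      # x, y, z
    ((-1, -1, 1), (Z, Z, H)),    # -x, -y, z+1/2
    ((-1, 1, -1), (Z, Z, H)),    # -x, y, -z+1/2
    ((1, -1, -1), (Z, Z, Z)),    # x, -y, -z
]
CENTRING = [(Z, Z, Z), (H, H, Z)]
F_TRANSLATIONS = {(Z, Z, Z), (Z, H, H), (H, Z, H), (H, H, Z)}


def orbit(point):
    """All symmetry-equivalent positions of `point`, reduced into the unit cell."""
    images = set()
    for signs, shift in POINT_OPS:
        for centring in CENTRING:
            images.add(tuple((s * Fr(p) + t + c) % 1
                             for s, p, t, c in zip(signs, point, shift, centring)))
    return images


def is_f_centred(points):
    """True if the points differ from one another exactly by the F-centring translations."""
    ref = min(points)
    diffs = {tuple((q - r) % 1 for q, r in zip(p, ref)) for p in points}
    return diffs == F_TRANSLATIONS


centre_1 = (Fr(1, 4), Z, Z)
centre_2 = (Z, Fr(1, 4), Fr(1, 4))
for centre in (centre_1, centre_2):
    images = orbit(centre)
    print(len(images), is_f_centred(images))
```

Both centres lie on a crystallographic 2-fold axis, so each generates only 4 positions per cell, and in both cases these 4 positions form an F-centred arrangement, in agreement with the two alternatives described in the text.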

To determine which of the 2 packing arrangements was correct, a difference Patterson search for heavy-atom positions was performed using, in turn, each of the 2 alternative positions for the 4-fold axis. This led to a clear indication that the molecular centre was at x = 0, y = 1/4, z = 1/4. After localization and refinement of the heavy-atom positions, they were used to refine the local symmetry operations. The least-squares program HOMO (Rao & Rossmann, 1973) was used to determine independently each of the 4 rotations relating the subunits in the asymmetric unit. All the rotation matrices obtained were essentially equivalent and corresponded to a rotation axis with polar angles ω = −1.8°, φ = 0.0°, κ = 90.0° through the molecular centre; 18 heavy-atom positions were used in each case, giving r.m.s. deviations of the positions around 0.6 Å.
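To illustrate how a local 4-fold operation relates heavy-atom sites, the sketch below applies an idealized 90° rotation about an axis exactly parallel to c through the molecular centre at x = 0, y = 1/4. This is a simplification and not the refined operation obtained with HOMO: the small tilt of the refined axis is ignored. The two positions are the K2Hg(CN)4 sites E1 and E2 as listed in Table 3, and the function names are hypothetical.

```python
import math

A = B = 157.2                 # cell edges a and b in Å; equal, so the turn is exact in fractional x, y
C = 201.3                     # cell edge c in Å
CENTRE = (0.0, 0.25, 0.25)    # molecular centre in fractional co-ordinates


def rotate_90_about_c(site, centre=CENTRE):
    """Rotate a fractional position by +90° about an axis parallel to c through `centre`."""
    dx, dy = site[0] - centre[0], site[1] - centre[1]
    return (centre[0] - dy, centre[1] + dx, site[2])


def distance(p, q):
    """Distance in Å between two fractional positions in the orthorhombic cell."""
    return math.sqrt(((p[0] - q[0]) * A) ** 2
                     + ((p[1] - q[1]) * B) ** 2
                     + ((p[2] - q[2]) * C) ** 2)


e1 = (0.1904, 0.2144, 0.3990)   # site E1, K2Hg(CN)4 derivative (Table 3)
e2 = (0.0301, 0.4514, 0.3973)   # site E2, K2Hg(CN)4 derivative (Table 3)

residual = distance(rotate_90_about_c(e1), e2)
print(f"residual after idealized rotation: {residual:.1f} Å")
```

The idealized operation brings E1 to within about 2 Å of E2. This is larger than the r.m.s. deviation of around 0.6 Å reported for the refined operation, presumably because the sketch neglects the tilt of the axis.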

(e) Determination and refinement of heavy-atom positions

Derivative difference Patterson maps were calculated with the PROTEIN program package, using isomorphous data to 3.2 Å resolution. An anomalous Patterson map to the same resolution was also calculated from the KAu(CN)2 derivative anomalous data collected at the SRS wiggler beam line. All the Patterson maps were rather noisy with a large number of peaks.

Given the orientation and the position of the local 4-fold axis, strongly occupied heavy-atom positions were located from difference Patterson maps using an automated Patterson search program written by one of us (S.K.). This program can be run in essentially 2 modes: HARKER mode and CROSS mode. Local symmetry, if present, may be used in both modes. HARKER mode implies a single site search using self vectors. In CROSS mode, additional sites are searched for by scanning a grid covering the asymmetric unit and looking at cross vectors to a few known input positions. The program includes the option to extract sets of positions from a long list, usually obtained from the HARKER mode, by examining the cross vectors between all pairs of positions on the list. The sum function, product function or min(n) (sum of n smallest peaks) function may be used to compute scores, which are stored as a CCP4 (Daresbury, U.K.) format map. In the case of the sum and the min(n) function, the scores are normalized to represent the mean peak density expressed in units of the standard deviation of the Patterson map. Positions with more than a specified number of peaks missing or below a specified level are rejected. The program is available from the authors on request.
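The three scoring functions described above can be sketched as follows. This is a hypothetical illustration written for this text, not the authors' search program: the function names, argument names and rejection thresholds are assumptions, and the Patterson densities at the predicted vector positions are taken as already looked up.

```python
import math


def vector_score(densities, mode="sum", n=1, sigma=1.0):
    """Score one trial position from the Patterson densities at its predicted vectors.

    mode="sum":     mean density over all vectors, in units of sigma
    mode="min":     mean of the n smallest densities, in units of sigma
    mode="product": plain product of the densities (not normalized)
    """
    if mode == "sum":
        return sum(densities) / (len(densities) * sigma)
    if mode == "min":
        smallest = sorted(densities)[:n]
        return sum(smallest) / (len(smallest) * sigma)
    if mode == "product":
        return math.prod(densities)
    raise ValueError(f"unknown mode: {mode}")


def accept(densities, level, max_missing):
    """Reject a position if too many of its vectors fall below `level`."""
    missing = sum(1 for d in densities if d < level)
    return missing <= max_missing


# Invented densities for illustration only.
demo = [4.0, 2.0, 6.0]
print(vector_score(demo, "sum", sigma=2.0))
print(vector_score(demo, "min", n=2, sigma=2.0))
print(vector_score(demo, "product"))
print(accept(demo, level=3.0, max_missing=1))
```

The min(n) function is the most conservative of the three, since a position scores well only if even its weakest vectors sit on significant density.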

The Patterson search for heavy-atom positions in our Rubisco crystals was done in 2 steps. The 1st step was a self vector search using HARKER mode on a grid twice as fine as the Patterson grid. Space group and local symmetry were applied to each search point and Patterson self vectors computed. The search was carried out over a volume covering the asymmetric part of the molecule, i.e. 1/32 of the unit cell. Scores were calculated for each point on the search grid as a weighted sum of the densities at the calculated vector positions. The local symmetry was then used to generate the 4 pseudo-symmetrically related positions from each of the positions generated in this first step. In the 2nd step, the 4 crystallographically independent sites representing the best solution from this search were used as input to a cross vector search. The asymmetric unit was scanned, and, for each point on the search grid, cross vectors to the input positions were calculated, this time using only space group symmetry. The densities found at the calculated vector positions were summed to give a score. Sets of positions related by the 4-fold symmetry were then extracted from the resulting list and checked against the result from the self vector search.

Once the major sites had been located in this way, they were refined and the resulting m.i.r. "best" phases used to compute difference Fourier maps to locate minor sites. In this way, 28 sites were found for each of the 2 mercury derivatives and 12 for the gold derivative (Table 3). The 2 mercury derivatives have 24 sites in common, and the 12 gold sites are all present in both mercury derivatives. In spite of this, the derivatives are independent, with different relative occupancies for the sites, as is also reflected in an R-value between the 2 mercury derivatives of 13.9% for the 18,389 reflections in common to 3.2 Å.

Using the procedure described above, we were able to


Table 3
Heavy-atom parameters

Derivative   Site   x        y        z        Occupancy   B      Location

K2Hg(CN)4    A1     0.1171   0.4311   0.4596   34.3        32.7   S1
             A2     0.1876   0.3756   0.0584   32.9        26.6   S2
             A3     0.1335   0.0713   0.0572   34.3        31.8   S3
             A4     0.1706   0.1257   0.4502   29.2        33.2   S4

             B1     0.1960   0.3810   0.3930   27.7        29.6   S1-B
             B2     0.1354   0.4529   0.1155   27.3        30.1   S2-D
             B3     0.2064   0.1208   0.1178   31.1        32.1   S3-F
             B4     0.1244   0.0503   0.3900   28.0        26.4   S4-H

             C1     0.2896   0.2331   0.4573   9.1         38.2   B
             C2     0.5024   0.0462   0.0485   10.3        33.3   D
             C3     0.3092   0.2669   0.0620   7.0         35.0   F
             C4     0.4847   0.4547   0.4512   9.0         57.8   H

             D1     0.1609   0.4652   0.4601   20.2        37.8   S1
             D2     0.2227   0.4223   0.0490   22.0        30.9   S2
             D3     0.1793   0.0370   0.0481   19.9        35.5   S3
             D4     0.2063   0.0786   0.4609   19.2        25.3   S4

             E1     0.1904   0.2144   0.3990   8.7         23.1   B
             E2     0.0301   0.4514   0.3973   8.7         38.9   D
             E3     0.2085   0.2908   0.1087   6.5         41.2   F
             E4     0.0416   0.0555   0.1029   9.0         35.2   H

             F1     0.3051   0.3683   0.2148   5.8         34.3   B
             F2     0.3870   0.0574   0.2028   16.4        31.9   D
             F3     0.3033   0.1372   0.3011   13.7        34.0   F
             F4     0.3841   0.4446   0.2907   8.8         27.8   H

             G1     0.2564   0.2972   0.4400   7.6         31.1   B
             G2     0.4490   0.0146   0.4309   19.5        29.3   D
             G3     0.2695   0.2093   0.0740   9.5         22.5   F
             G4     0.4605   0.4885   0.0659   6.6         35.3   H

EMTS         A1     0.1165   0.4328   0.4513   25.0        35.8   S1
             A2     0.1879   0.3761   0.0573   20.2        35.0   S2
             A3     0.1333   0.0736   0.0550   24.4        37.6   S3
             A4     0.1706   0.1284   0.4512   20.9        32.2   S4

             B1     0.1966   0.3848   0.3925   29.5        16.9   S1-B
             B2     0.1366   0.4540   0.1146   27.1        23.4   S2-D
             B3     0.2068   0.1197   0.1182   31.7        25.3   S3-F
             B4     0.1257   0.0481   0.3896   32.4        23.4   S4-H

             C1     0.2901   0.2380   0.4562   16.7        32.4   B
             C2     0.5004   0.0468   0.0480   10.0        34.6   D
             C3     0.3103   0.2649   0.0616   18.0        28.6   F
             C4     0.4802   0.4529   0.4493   13.1        33.3   H

             D1     0.1619   0.4688   0.4613   19.7        32.6   S1
             D2     0.2230   0.4227   0.0484   18.8        32.8   S2
             D3     0.1825   0.0392   0.0499   19.6        35.5   S3
             D4     0.2068   0.0784   0.4604   17.7        12.5   S4

             E1     0.1898   0.2100   0.3970   16.7        42.0   B
             E2     0.0325   0.4546   0.3969   16.0        50.4   D
             E3     0.2096   0.2911   0.1062   10.3        32.6   F
             E4     0.0431   0.0537   0.0985   10.2        23.4   H

             F1     0.2956   0.3890   0.2196   4.7         26.9   B
             F2     0.3863   0.0621   0.2057   13.0        8.0    D
             F3     0.3058   0.1379   0.2963   12.7        29.1   F
             F4     0.3874   0.4468   0.2876   9.5         21.9   H

             H1     0.1185   0.4134   0.4206   11.4        25.6   S1-D
             H2     0.1696   0.3739   0.0858   13.1        30.8   S2-F
             H3     0.1318   0.0908   0.0842   12.4        41.3   S3-H
             H4     0.1509   0.1283   0.4233   12.9        20.3   S4-B

KAu(CN)2     C1     0.2911   0.2337   0.4579   7.0         23.1   B
             C2     0.4873   0.0510   0.0500   6.1         36.6   D
             C3     0.3108   0.2699   0.0587   7.6         37.8   F
             C4     0.4759   0.4535   0.4508   8.0         41.3   H

             D1     0.1657   0.4862   0.4622   26.7        28.8   S1
             D2     0.2251   0.4251   0.0474   28.8        29.4   S2
             D3     0.1824   0.0354   0.0466   29.0        29.8   S3
             D4     0.2084   0.0767   0.4628   29.0        29.4   S4

             E1     0.1881   0.2134   0.3986   7.5         39.3   B
             E2     0.0331   0.4539   0.3940   7.7         42.3   D
             E3     0.2096   0.2905   0.1121   5.4         47.3   F
             E4     0.0419   0.0572   0.1016   7.4         45.2   H

COBR         Co1    0.2161   0.2156   0.3408   14.1        20.4   B
             Co2    0.0313   0.4712   0.3356   14.3        23.3   D
             Co3    0.2226   0.2855   0.1706   14.2        22.5   F
             Co4    0.0382   0.0327   0.1660   15.0        24.7   H

x, y and z are fractional co-ordinates. B is the isotropic temperature factor. Occupancies have been scaled such that the highest occupancy of Co is equal to the difference in the number of electrons between Co and Mg. The sites have been grouped in sets of 4 related by the local 4-fold symmetry. Sites occupying similar positions in different derivatives have been assigned the same label. The subunit labels B, D, F, H and S1-S4 are defined in Fig. 12. When 2 subunits are listed for 1 site, this implies that the site is located at the interface between those 2 subunits.

locate the main gold sites from the KAu(CN)2 anomalous Patterson map as well as the 4 unique cobalt ions from the difference Patterson map of the cobalt substituted enzyme. The mean peak height of the vectors generated by the 4 Co sites was 1.3 standard deviations. In the anomalous gold Patterson map, the mean peak height of the vectors generated by the major four sites was 5.5 standard deviations. This was the top site in the self vector search. The ability to interpret these maps reflects the high quality of the wiggler data as well as the strength of including local symmetry in the interpretation of difference Patterson maps.

Refinement of heavy-atom parameters and calculation of m.i.r. best phases were done with the phase refinement program PHASEREF, originally written by M. G. Rossmann and modified by L. F. Ten Eyck and S. J. Remington (Remington et al., 1982). Both isomorphous data and anomalous data for the KAu(CN)2 derivative were used. The derivatives were first refined separately at 5.5 Å resolution starting with only centric data and subsequently including all data. The derivatives were then jointly refined at increasing resolution out to 2.8 Å. All derivatives were used simultaneously in both phasing and refinement. Final refined parameters for all derivatives are given in Table 3 and some refinement statistics in Table 4.

(f) Cyclic solvent-flattening

At an early stage of the structure determination, before the collection of the EMTS derivative data set, a solvent-flattened (Wang, 1985) map was calculated, starting from isomorphous K2Hg(CN)4 and KAu(CN)2 phases to 5.5 Å resolution. On each cycle, phases calculated from the solvent-flattened map were weighted according to Sim (1959) and combined with the phases from the previous cycle. After 5 cycles of solvent-flattening and phase combination assuming 40% solvent, the refinement had converged to a mean figure of merit of 0.71 with an accumulated phase shift of 30.5°.

The resulting electron density map showed helical features having the characteristic arrangement of an 8-stranded α/β-barrel structure. The structure of Rubisco from R. rubrum has been solved (Schneider et al., 1986b) and shown to have an α/β-barrel domain. The overall sequence homology between the R. rubrum enzyme and the large subunit of spinach Rubisco is about 28%. We thus expected the 2 chains to have similar folds. Starting from the 8 barrel helices, the R. rubrum model was then fitted to the spinach density and local symmetry applied to generate a model for the L8 core.

A difference Fourier map of the cobalt substituted complex was also computed using the phases from the solvent-flattening. In this map, there were 4 peaks above 12 standard deviations of the map. No other peak above 5 standard deviations was observed. The positions of these 4 high peaks agreed very well both with the solution of the cobalt difference Patterson map and with the position of the active site in the R. rubrum structure.

(g) Real space averaging

Cyclic averaging and phase combination were performed using the program system of Bricogne (1976). Space group-specific routines (SPGRP1, SPGRP2) for C2221 were written and linked to the GENERATE main program. A modified version of the phase combination program, written by L. Liljas, which permitted the use of calculated structure factors for unobserved reflections, as well as local scaling between observed and calculated structure factors, was used. The molecular envelope was initially defined from the 3.2 Å m.i.r. map, also taking into consideration the preliminary model of the L8 core based on the solvent-flattened map. A few cycles of preliminary averaging resulted in a map in which a crude skeleton model (see below) of the small subunit could be


Table 4
Multiple isomorphous replacement phasing statistics

Resolution (Å)           7.57   6.09   5.09   4.37   3.84   3.41   3.08     2.80     10.0-2.80
Number of reflections    1854   2766   3996   5534   7268   9264   11,181   13,147   55,010
Mean figure of merit     0.67   0.67   0.64   0.60   0.54   0.51   0.46     0.42     0.51

K2Hg(CN)4   Rk           0.44   0.45   0.52   0.60   0.62   0.64   0.64     0.65     0.59
            Rc           0.49   0.45   0.62   0.60   0.53   0.63   0.61     0.64     0.57
            FH/res       2.56   2.57   2.18   1.67   1.51   1.45   1.44     1.39     1.85

EMTS        Rk           0.56   0.57   0.64   0.62   0.67   0.71   0.73     0.75     0.68
            Rc           0.51   0.54   0.61   0.55   0.59   0.63   0.68     0.65     0.59
            FH/res       1.83   1.99   1.67   1.60   1.32   1.19   1.10     0.95     1.45

KAu(CN)2    Rk           0.65   0.60   0.62   0.68   0.72   0.73   0.77     0.78     0.71
            Rc           0.66   0.60   0.58   0.63   0.66   0.66   0.69     0.66     0.65
            FH/res       1.53   1.94   1.79   1.42   1.24   1.19   1.11     1.03     1.41

$R_k = \sum |F_{ph}(\mathrm{obs}) - F_{ph}(\mathrm{calc})| / \sum F_{ph}(\mathrm{obs})$, where $F_{ph}(\mathrm{obs})$ and $F_{ph}(\mathrm{calc})$ are the observed and calculated derivative structure factor amplitudes, respectively, and the summation is taken over all observed reflections.

$R_c = \sum \big| |F_{ph} - F_p| - F_h \big| / \sum |F_{ph} - F_p|$, where $F_{ph}$ and $F_p$ are the observed derivative and native structure factor amplitudes, respectively, $F_h$ is the calculated heavy-atom contribution to the derivative structure factors, and the summation is over all centric reflections.

FH/res = r.m.s. heavy-atom contribution/residual.

built. A new envelope was defined based on the preliminary model of the L8S8 molecule and then used throughout. Envelope sections were traced on plastic sheets and digitized using an interactive graphics program written by M. Bergdoll for the Evans & Sutherland PS330 graphics display. Since no C2221-specific FFT routine was available, electron density calculation and Fourier inversion were performed in space group P21. To save time and space, averaging and phase combination were performed in C2221. The C2221 structure factors were expanded to P21 and an electron density map covering a C2221 asymmetric unit was computed on a 0.56 Å grid. This map was averaged and the resulting C2221 asymmetric unit expanded to cover a P21 asymmetric unit, which was then Fourier inverted to give P21 structure factors. The structure factors were reduced to C2221 before phase combination; 6385 calculated structure factors were used for unobserved reflections. Local scale factors were used in scaling of structure factor amplitudes. The initial m.i.r. map was based on 53,657 reflections between 10 and 2.8 Å resolution; 16 cycles of averaging were performed. In the initial 5 cycles, m.i.r. and calculated phases were combined using the method of Hendrickson & Lattman (1970), whereas calculated phases were used in all later cycles. During the first 6 cycles, electron density within 3.5 Å of heavy-atom positions was set at zero. A final R-value, $R = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$, of 15.9% was obtained. The accumulated phase shift from the initial m.i.r. phases was 61.6°. Fig. 3(a) shows the correlation coefficient between observed and calculated structure factors as a function of resolution for different cycles of averaging.
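The two agreement measures quoted for the averaging, the R-value and the correlation coefficient between observed and calculated structure factor amplitudes as a function of resolution, can be illustrated with a minimal Python sketch. The function names, the shell edges and the amplitudes are assumptions made for illustration only; they are not taken from the programs used in this work, and the correlation is assumed here to be the ordinary Pearson coefficient.

```python
import math

def r_value(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the supplied reflections."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den

def correlation(f_obs, f_calc):
    """Pearson correlation coefficient between two amplitude lists (assumed definition)."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(f_obs, f_calc))
    var_o = sum((o - mean_o) ** 2 for o in f_obs)
    var_c = sum((c - mean_c) ** 2 for c in f_calc)
    return cov / math.sqrt(var_o * var_c)

def by_shell(reflections, edges):
    """Group (1/d, Fo, Fc) tuples into 1/d shells; return (low, high, R, CC) per shell."""
    results = []
    for low, high in zip(edges[:-1], edges[1:]):
        shell = [(o, c) for inv_d, o, c in reflections if low <= inv_d < high]
        if len(shell) < 2:
            continue  # too few reflections for a meaningful statistic
        f_obs, f_calc = zip(*shell)
        results.append((low, high, r_value(f_obs, f_calc), correlation(f_obs, f_calc)))
    return results

# Made-up amplitudes, not data from this study.
demo = [(0.12, 100.0, 90.0), (0.15, 80.0, 85.0), (0.18, 60.0, 50.0),
        (0.25, 40.0, 45.0), (0.30, 30.0, 20.0), (0.35, 20.0, 25.0)]
shells = by_shell(demo, [0.1, 0.2, 0.4])
```

Evaluating `by_shell` after each averaging cycle would give one curve per cycle of the kind plotted in Fig. 3(a).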

(h) Model building

Model building was performed on an Evans & Sutherland PS330 vector graphics display using FRODO (Jones, 1978; Jones & Thirup, 1986). The large subunit was built starting from the R. rubrum model. The small subunit as well as those parts of the large subunit that were ill-defined in the R. rubrum structure, or where the 2

structures were obviously very different, were built using the BONES option of FRODO (Jones & Thirup, 1986). With this option, a skeletonized representation of the electron density can be edited to give a chain-tracing of pseudo-atomic positions. The resulting skeleton model was used to build a polyalanine model using short amino acid fragments from a data-base of 32 refined structures as described by Jones & Thirup (1986). The side-chains were fitted manually by dihedral rotations and fragment moves. The model was built in the electron density map based on the phases from the cyclic averaging procedure. This map was of very high quality and we had no problems in unambiguously tracing the fold of the polypeptide chains. Only one L subunit and one S subunit were built and then used to generate the whole molecule using the 4-fold local symmetry. The R-factor for this initial model was 43.0% for all observed reflections between 8.0 and 2.4 Å resolution. In this model, the L chain has a fold that is similar both to the L chain of R. rubrum (Schneider et al., 1986b) and to that of tobacco (Chapman et al., 1988). The fold of the S chain is, however, completely different (Knight et al., 1989) from that suggested for the tobacco enzyme (Chapman et al., 1988). Fig. 3(b) shows the electron density of our map in one region of the small subunit.
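Generating symmetry-related copies of one built L and S pair amounts to applying the local 4-fold rotation repeatedly. The sketch below is a simplified illustration in Python: it assumes, hypothetically, that the local axis coincides with the z axis and passes through the origin, whereas in the crystal the operator would be a general rotation plus translation. The function names are illustrative and do not correspond to the software used here.

```python
import math

def rotate_about_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return (c * x - s * y, s * x + c * y, z)

def generate_fourfold(atoms):
    """Return 4 copies of the atom list, rotated by 0, 90, 180 and 270 degrees.

    Assumes the 4-fold axis is the z axis through the origin (an
    illustrative simplification, not the actual crystal setting).
    """
    return [[rotate_about_z(p, 90.0 * k) for p in atoms] for k in range(4)]

# One hypothetical atom position, in Å, for illustration only.
copies = generate_fourfold([(1.0, 0.0, 5.0)])
```

Each element of `copies` is one symmetry-related set of coordinates; the first is the input itself.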

(i) Refinement of the model

The model of L8S8 Rubisco was refined using the XPLOR program (Brünger, 1988). The simulated annealing technique (Brünger et al., 1987) of structure refinement was used. This technique combines energy minimization with molecular dynamics simulation at high temperatures, thus allowing the system to escape from local energy minima. Most of the steps were computed on a STELLAR processor. Three rounds of refinement and dynamics have been performed so far, with manual rebuilding of the model between each round where necessary. Non-crystallographic symmetry constraints were used throughout the refinement.

In the initial round of refinement, only protein atoms


[Figure 3 graphic. Panel (a): correlation coefficient (0.2 to 1.0) plotted against 1/d (Å⁻¹, 0.1 to 0.4), with curves for cycles 1, 2, 6, 7 and 16. Panel (b): electron density image.]

Figure 3. (a) Correlation coefficient between Fo and Fc as a function of resolution for different averaging cycles. The switch from phase combination to the use of calculated phases on cycle 6 is seen in the relatively large improvement between cycles 6 and 7. (b) Part of the small subunit with 2Fo-Fc electron density map at 2.4 Å resolution superimposed. The contour level is at the level of 1 standard deviation of the map.

were used. Energy minimization reduced the R-value for the model from 0.430 to 0.328. A molecular dynamics simulation was then run for 0.875 ps at 3000 K using a time-step of 0.5 fs. The temperature was subsequently raised to 4000 K and the structure slowly cooled to 300 K using a 50 K temperature drop for every 50 dynamics steps of 0.5 fs. Final energy minimization and refinement of atomic temperature factors gave an R-value of 0.262

for all the observed 78,926 reflections between 7.0 and 2.4 Å resolution. Two further rounds of refinement, now including the reaction-intermediate analogue as well as the active site magnesium ion, and with small modifications of the refinement protocol, have been performed, resulting in an R-value of 0.240. The current model consists of 18,760 atoms/asymmetric unit, excluding hydrogen atoms, which represents all of the amino acid


Table 5
Deviations from ideal stereochemistry of the refined model of spinach Rubisco

r.m.s. bond lengths                0.018 Å    No. of deviations > 0.06 Å = 65
r.m.s. bond angles                 3.63°      No. of deviations > 10° = 124
r.m.s. dihedral angles             26.3°      No. of deviations > 90° = 0
r.m.s. improper dihedral angles    1.56°      No. of deviations > 20° = 0

residues in the S subunit and 467 of the 475 residues in the L subunit. Deviations of the final structure from ideal stereochemistry are given in Table 5. Further refinement of the model is now in progress. Co-ordinates have been submitted to the Brookhaven Protein Data Bank.
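The length of the first-round dynamics protocol described in section (i) can be worked out from the quoted numbers. The Python sketch below is illustrative only: it assumes that 50 steps are run at each temperature before every 50 K drop and that no further dynamics steps are run once 300 K is reached, which the text does not state explicitly. The function names are hypothetical.

```python
def steps_for(duration_ps, timestep_fs):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / timestep_fs)

def cooling_stages(t_start, t_end, drop, steps_per_drop):
    """List (temperature, steps) stages from t_start down to, but excluding, t_end.

    Assumption: steps_per_drop steps are run at each temperature before
    the temperature is lowered by `drop`.
    """
    stages = []
    temperature = t_start
    while temperature > t_end:
        stages.append((temperature, steps_per_drop))
        temperature -= drop
    return stages

TIMESTEP_FS = 0.5

hot_steps = steps_for(0.875, TIMESTEP_FS)       # constant-temperature run at 3000 K
stages = cooling_stages(4000, 300, 50, 50)      # slow cooling from 4000 K to 300 K
cool_steps = sum(n for _, n in stages)
cool_ps = cool_steps * TIMESTEP_FS / 1000.0
```

Under these assumptions the 3000 K run corresponds to 1750 steps, and the cooling phase comprises 74 temperature stages, 3700 steps and 1.85 ps of simulated time.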

3. Description of the Structure

The L8S8 molecule is shaped like a cube with rounded edges and a side of approximately 105 Å measured between main-chain atoms. The molecule has local 422 symmetry, where the 4-fold axis relates four L2 dimers into a core of eight large subunits, (L2)4 (Fig. 4). The small subunits, on the other hand, are arranged into two separate clusters of four subunits each, (S4)2, which interact tightly with the large subunits. Each small subunit binds in a deep crevice formed between the tips of two adjacent elongated L2 dimers at each end of the L8S8 molecule. Four faces of the cube-shaped molecule are thus formed by pairs of adjacent L2 dimers, whereas the remaining two faces are formed by the S4 clusters.

In the centre of the molecule, along the local 4-fold axis and between the four L2 dimers, there is a solvent channel extending throughout the molecule (Fig. 5). The widest dimension of this channel is found at the funnel-shaped entrances at both ends of the molecule. The channel then narrows to become around 15 Å in diameter at its most constricted point. Toward the centre of the molecule, the channel widens again to form a central cavity with a diameter of approximately 30 Å. Between adjacent L2 dimers, there are deep clefts extending from this central cavity almost to the surface of the molecule, giving the central portion of the channel a starlike appearance when viewed along the 4-fold axis (Fig. 5). At the top and bottom of the dimer-dimer interface, a narrow solvent channel is formed between the two dimers and the small subunit bound between them. This channel is approximately 6 Å in diameter and leads from the outer surface of the molecule into the central solvent channel.

(a) Structures of the L and the S subunits

Preliminary descriptions have been reported for the structures of both the L subunit (Andersson et al., 1989) and the S subunit (Knight et al., 1989) of spinach Rubisco. Here, we report a more detailed description. Diagrams showing the fold and indicating the secondary structural elements of the large and the small subunit of spinach Rubisco are shown in Figure 6. Table 6 lists the residues forming the main elements of secondary structure in the two subunits.

Figure 4. A diagram of the architecture of the L8S8 Rubisco molecule. Four L2 dimers arranged around a 4-fold axis build up the (L2)4 core of the molecule. Clusters of 4 small subunits bind at each end of the molecule. Drawing by Bo Furugren.

Figure 5. [...] the centre of the molecule. The drawings are based on computer-generated sections through the molecule made by placing spheres of 2 Å radius around each atom and represent slabs of 8 Å thickness. In (a), the view is along the 4-fold axis and in (b) it is perpendicular to the 4-fold axis.

(i) Secondary structure and fold of the large subunit

The large subunit of spinach Rubisco, which has a structure similar to the R. rubrum subunit


Figure 6. Computer-generated ribbon diagrams (Priestle, 1988) of (a) the large subunit and (b) the small subunit of spinach Rubisco. Secondary structural elements are labelled as defined in Table 6.

(Schneider et al., 1986b), has two clearly separated domains. The smaller N-terminal domain consists of residues 1 to 150 and the larger C-terminal domain is built up from residues 151 to 475. The central motif of the N-terminal domain is a five-stranded mixed β-sheet with two α-helices on one side of the sheet (Fig. 7(a)). The domain starts with a short β-strand followed by two βαβ units where the strands are not adjacent. These units are joined by a loop region, giving the β-sheet in the N-terminal domain the topology (+3x, -2x, +1, +2x). The connection to the C-terminal domain is through a short helix, αD, followed by an extended piece of

chain. The C-terminal domain has an eight-stranded parallel α/β-barrel structure, as found in triose phosphate isomerase (Phillips et al., 1978), glycolate oxidase (Lindqvist & Brändén, 1985) and a number of other enzymes. The α/β-barrel structure of the C-terminal domain (Fig. 7(b)) consists of eight consecutive βα-units with the β-strands forming a barrel-shaped β-sheet surrounded by the eight helices. We will denote loops that connect the C-termini of the β-strands to the N-termini of the α-helices as C-terminal loops and the loops on the other side of the barrel as N-terminal loops. We have numbered the C-terminal loops 1 to 8 and we


Table 6
Secondary structural elements of spinach Rubisco

Secondary structure and residue numbers, listed down each column in sequence order

Large subunit, N-terminal domain
βA   24-26       βD   97-103
βB   36-44       αC   113-121
αB   50-60       βE   130-139
βC   83-89       αD   142-145

Large subunit, C-terminal domain
αE   155-162     α4   274-287     βH   366-367
β1   169-171     β5   290-294     β7   375-379
α1   182-194     αF   298-302     α7   387-394
β2   199-201     βF   308-309     β8   399-401
α2   214-232     α5   311-321     αP   404-407
β3   237-241     β6   325-327     α8   413-433
α3   247-260     α6   339-350     αG   437-451
β4   264-268     βG   353-354     αH   453-462

Small subunit
αA   23-35       αB   80-93
βA   39-45       βC   98-105
βB   68-74       βD   110-118

The secondary structural elements are labelled in alphabetical order starting from the N terminus except in the α/β-barrel of the L subunit, where the strands and helices are labelled β1 to β8 and α1 to α8, respectively. The labelling of the α-helices in the L subunit starts from αB for easier comparison with the R. rubrum structure, which has an additional α-helix αA between strands βA and βB. For the same reason, a small helix in loop 8, which is involved in phosphate binding and which is absent in the R. rubrum structure, has been labelled αP. The secondary structural elements were assigned according to Kabsch & Sander (1983) except for strand βB in the small subunit where the hydrogen-bonding pattern is interrupted by a bulge formed by residues 71 to 73.

will frequently refer to these loops simply by their number. A majority of the residues involved in catalysis and substrate binding are found in the C-terminal loops, while the N-terminal loops are involved in subunit interactions in the L8S8 complex.

The structure of the L subunit of the R. rubrum enzyme has been described in detail (Schneider et al., 1990). The main differences between that structure and the structure of the spinach L subunit are found in the last 42 residues, which form a C-terminal α-helical extension to the α/β-barrel constituting four helices in the R. rubrum structure and two in the spinach structure. In addition, a number of loop regions between secondary structural elements are different, both in length and in structure (Schneider et al., 1989).

(ii) Secondary structure and fold of the small subunit

The fold of the small subunit has been described in some detail elsewhere (Knight et al., 1989). The 123 residues of this subunit are arranged in a four-stranded anti-parallel β-sheet of topology (+1, -2x, -1), covered on one side by two helices (Fig. 7(c)). The first 20 residues at the N terminus form an irregular arm, which extends from the main body of the structure to a neighbouring small subunit in the L8S8 molecule. The edge strands in the β-sheet, βB and βD, are somewhat irregular, particularly strand βB, where residues 71 to 73 form a bulge and do not form hydrogen bonds to strand βA.

The two strands βA and βB are joined by a long loop, residues 46 to 67 (Fig. 8), which protrudes into the central solvent channel in the L8S8 molecule. In cyanobacterial forms of L8S8 Rubisco, residues 52 to 63 in this loop are absent. This part of the loop starts as a β-hairpin structure with a pair of antiparallel hydrogen bonds between residues Y52 and D63. Further hydrogen bonding between the two antiparallel peptide segments is prevented by the conformation of residues 54 to 56, which form a bulge. The loop is, however, further stabilized by a number of side-chain to main-chain hydrogen bonds. In 13 of the 15 intra-subunit hydrogen bonds involving residues 46 to 67, both the donor and the acceptor groups are contributed by residues within the loop (Table 7). Nine of these 13 hydrogen bonds are formed between charged side-chains and main-chain atoms. Most of these hydrogen bonds involve those residues that are deleted in the cyanobacterial small subunits. Residues 58 to 61 at the tip of the loop form a turn with a main-chain to main-chain hydrogen bond between S58 and Y61. In spite of the fact that this loop is not folded as a classical


Figure 7. Stereo diagrams showing the Cα tracings of (a) the N-terminal domain of the large subunit, (b) the C-terminal domain of the large subunit and (c) the small subunit of spinach Rubisco.


Figure 8. The hairpin loop (residues 46 to 67) in the small subunit of spinach Rubisco. In small subunits from cyanobacteria, residues 52 to 63 are deleted, whereas in several algal species there are insertions in this loop region.

β-hairpin structure, we will refer to it as the hairpin loop.

During model building, a number of deviations from the published sequence (Martin, 1979) of the small subunit became evident. These deviations were detected from the electron density and, in the two instances where they involved cysteinyl residues, they could also be inferred from the positions of bound mercury atoms in the two mercury derivatives (Knight et al., 1989). The presence of the three cysteinyl residues at positions 44, 77 and 112 has been confirmed by isolating and sequencing all cysteine-containing peptides of the spinach small subunit (G. Lorimer and B. Ranty, personal communication). The electron density also strongly indicated that residues 6 and 101, which in the

Table 7
Intra-subunit hydrogen bonds (d < 3.3 Å) involving residues in the hairpin loop in the small subunit

Asp48   N     -   Glu45    OE2
Tyr52   N     -   Asp63    O
        O     -            N
Arg53   O     -   Lys57    NZ
        NE    -   Tyr61    O
        NH1   -            O
        NH1   -   Ser58    O
Glu54   N     -   Asp63    OD1
        O     -   Lys57    NZ
His55   N     -   Asp63    OD1
        ND1   -            OD2
Ser58   O     -   Tyr61    N
Gly60   O     -   Arg65    NH1
Tyr62   O     -   Arg65    NH2
Arg65   O     -   Arg100   NH2

published sequence are Pro and Phe, respectively, should both be Ile (Fig. 9). In the published spinach small subunit sequence, residues 105 and 106 are Asn and Asp, respectively, while in almost all other species the order of these two side-chains is reversed. Although Asn and Asp cannot be distinguished from the electron density, we have tentatively interchanged their order in our structure. By the same argument, we have assumed that Glu109 is Gln109. The complete corrected amino acid sequence is given in Figure 10.

(b) The cores of the subunits

To determine if a residue is buried or if it is on the surface of the molecule, we have computed the relative accessibility as compared to the same type of residue X in a tripeptide Gly-X-Gly (Miller et al., 1987). Residues with an accessibility of less than 5% are considered to be buried. For isolated large subunits, we find that 158 of the 463 residues used in the computation, or 34%, are buried, while in the small subunit only 15 of the 123 residues, or 12%, are buried.
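The burial criterion can be written out as a short Python sketch. The 5% cut-off and the residue counts are those given in the text; the reference areas, residue labels and per-residue areas are hypothetical placeholders chosen for illustration and are not the Gly-X-Gly values of Miller et al. (1987).

```python
BURIED_CUTOFF_PERCENT = 5.0  # cut-off quoted in the text

def relative_accessibility(area, reference_area):
    """Accessible area of a residue as a percentage of its Gly-X-Gly reference area."""
    return 100.0 * area / reference_area

def is_buried(area, reference_area, cutoff=BURIED_CUTOFF_PERCENT):
    """A residue counts as buried when its relative accessibility is below the cut-off."""
    return relative_accessibility(area, reference_area) < cutoff

def percent_buried(n_buried, n_total):
    """Percentage of residues classified as buried."""
    return 100.0 * n_buried / n_total

# Hypothetical reference areas and residue areas (Å^2), for illustration only.
reference = {"LEU": 180.0, "GLU": 190.0}
residues = [("X1", "LEU", 2.0), ("X2", "GLU", 4.0), ("X3", "GLU", 120.0)]
buried = [name for name, kind, area in residues if is_buried(area, reference[kind])]

large_subunit = percent_buried(158, 463)   # counts quoted in the text
small_subunit = percent_buried(15, 123)
```

With the counts from the text, `large_subunit` and `small_subunit` round to 34 and 12, matching the percentages quoted above.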

(i) The hydrophobic cores in the L subunit

There are essentially four hydrophobic cores in the L subunit, one in the N-terminal domain, two in the C-terminal domain and one in the interface region between the two domains. Out of the 158 buried residues in the L subunit, we find 116 side-chains in these four cores.

The hydrophobic core in the N-terminal domain is formed by packing helices αB and αC against the five-stranded β-sheet, with αB interacting mainly


Figure 9. Residues 6 and 101 in the small subunit of spinach Rubisco with superimposed Fo-Fc electron density calculated without the contribution of these residues. The contour level is at 1 standard deviation of the electron density map. On the basis of this electron density, we have assigned Ile side-chains to both these positions, even though they have been reported as Pro6 and Phe101 (Martin, 1979).

with strands βC and βD and helix αC with strands βB and βE. Residues that provide side-chains to this core are listed in Table 8.

The C-terminal α/β-barrel domain contains two hydrophobic cores. One is formed by those residues from the β-strands that point into the interior of the barrel (Table 9). The side-chains of these residues are arranged in three layers, each layer comprising four residues from alternate β-strands around the barrel (Lesk et al., 1989). The top layer defines the "floor" of the active site. Below these three layers, at the N-terminal side of the α/β-barrel, we find a buried charged side-chain, E158. This side-chain is surrounded by a number of leucine residues: L162, L169, L290 and L375, and is situated almost at the centre of the barrel at the N-terminal side (Fig. 11). There is a short hydrogen bond between E158 and H325 in the bottom layer of the core in the interior of the barrel. The side-chain of H325 in turn forms a hydrogen bond to H292 in the middle layer, which in turn is hydrogen-bonded to H327 in the active site. We thus find a network of hydrogen bonds in the interior of the barrel, extending from a buried Glu at the very bottom of the barrel to a His at the active site. All four residues involved are conserved in all L8S8 Rubisco large subunits.

The second hydrophobic core of the C-terminal domain comprises side-chains from the eight helices,

[Figure 10, first part: alignment of the small-subunit sequences 1 to 11 covering approximately spinach residues 1 to 87, with rows for sequence numbers, buried (c) and interface (i) residues, and secondary structure (a, b). The column layout of this part was lost in extraction.]

which pack both against each other and against the eight β-strands around the barrel. These residues are listed in Table 10.

Helix α5 in the C-terminal domain is involved in domain-domain contacts and is completely buried in the L subunit. It interacts with helices αC and αD in the N-terminal domain, and with residues from strands βG and βH in the C-terminal domain. Residues from the core of this domain-domain interaction are listed in Table 11. There are two buried, positively charged side-chains from helix α5 in this core, R312 and K316. Residue R312 is conserved in all known hexadecameric Rubisco L sequences and K316 is strictly conserved in all known sequences of the large subunit. The guanidinium group of R312 packs flat against the side-chain of H310. In addition, there is a salt-link to E136 and a hydrogen bond to the carbonyl oxygen atom of this residue. E136 is conserved in all known sequences, except in the dimeric Rubisco from R. rubrum. K316 interacts with the strictly conserved residue D137, and forms a hydrogen bond to the carbonyl group of L138. There is one additional buried salt-link in the domain-domain interface, between R134 and D473.

(ii) The hydrophobic core in the S subunit

The hydrophobic core in the small subunit is formed by packing the two α-helices against the anti-parallel β-sheet. Residues from helix αA interact mostly with β-strand βD, whereas helix αB is packed against β-strands βA and βB. Residues involved in this core are listed in Table 12. All these residues except A97 and V99 are invariably hydrophobic in all known sequences of the small subunit.


90 I00 ii0 120

c i c c c i c c c l l l i i i i i i i c

aaaaaaaaa bbbbbb C bbbbbb bbbbbbb D bbbbbbb

1 E V K K A Y P D A F V R I I G F D N K R Q V Q C I S F I A Y K P A G Y -

2 . AI . S . . . . . H . V . . . . . I K . T . . V . . . . . . . P . SD

3 .A . . . . . Q . W I . . . . . . . V . . . . . . . . . . . . . E . . -

4 . . V A . ~ . Q . . . . . . . . . . V . . . . . . . . . . HT . E~ . -

5 .A . . . . . N . W I . . . . . . . V . . . . . . . . . . . . . P . F -

6 A C T . . F . . . Y . . L V A . . QR . . . I M G . L V Q R . K T A R

7 P L E F . A. EN . . . L A A . . S V K . . . V . . . V V Q R . S . SS

8 S C R S Q . . G H Y I . V V . . . . I K . C . I L . . . VH . . S R - -

9 . C R S E . G . C Y I . V A . . . . I K . C . T V . . . V H R . G R - -

i0 N A R N T F . N H Y I . V T A . . S T H T . E S V V M S F I V N R P A D

ii A C H . . H . NN H . . L . . . . . Y A . S K G A E M V V E R G K P V -

1

2

3

4

5

6

7

8

9

i0

ii

D F Q P A N K R S V

W

E P G F R L V R Q E E P G R T L R Y S I E S Y A V Q A G P K

Figure 10. Amino acid sequence of the small subunits of Rubisco from (1) spinach (Martin, 1979), (2) maize (Matsuoka et al., 1987), (3) tobacco (Mazur & Chui, 1985), (4) pea (Fluhr et al., 1986), (5) petunia (Turner et al., 1986), (6) Chlamydomonas reinhardtii (Goldschmidt-Clermont & Rahire, 1986), (7) Euglena gracilis (Sailland et al., 1986), (8) Anabaena (Nierzwicki-Bauer et al., 1984), (9) Anacystis nidulans (Shinozaki & Sugiura, 1983), (10) Alcaligenes eutrophus (Andersen & Caton, 1987) and (11) Chromatium vinosum (Viale et al., 1989). Sequence numbers given in the top line refer to the spinach sequence. In the second line we identify residues that are buried in the interior (c) of the small subunit (less than 5% accessible) and at the S-L interfaces (i). The secondary structural elements are indicated in line 3 (a = α-helix; b = β-strand). Three regions where at least 4 consecutive residues are identical in all the S-chains from eukaryotic organisms are boxed. Identity with the spinach sequence is indicated by a dot. A dash indicates a deletion arbitrarily introduced to maximize homology except for residues 52 to 63, where the deletions in the cyanobacterial small subunits are based on model building.

(c) Quaternary structure and subunit interactions

The fundamental unit of the L8 part of the molecule is an L2 dimer, which is very similar to the L2 R. rubrum Rubisco molecule (Schneider et al., 1986b). The arrangement of four such L2 dimers and two S4 clusters in the L8S8 molecule is illustrated by Figure 12, where we also define our naming convention for the L subunits (A to H) and S subunits (S1 to S8).

The accessible surface area of the L8S8 molecule is about 120,000 Å² (Table 13A), in close agreement with the empirical relation between the molecular weight of oligomers and the accessible surface area described by Miller et al. (1987). The part of the surface area of the L subunit that is involved in subunit interactions in the L8S8 molecule comprises 48% of the total solvent-accessible surface area of this subunit. Of the L subunit surface, 25% is buried in the L2 dimer interface, which is the largest interface in the L8S8 molecule, whereas 8% is involved in L2-L2 interactions to form the (L2)4 core and 15% is involved in L-S interactions. These latter interactions comprise 37% of the total surface area of the small subunit. An additional 8% of the S subunit surface area is involved in S-S interactions so that, in total, 45% of the accessible surface area of the isolated small subunits becomes buried in subunit interfaces in the L8S8 molecule.
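The percentages quoted above can be recomputed from the areas in Table 13. The sketch below is only a consistency check, not the authors' calculation: it assumes that each L subunit buries the A-B area once, one L2-L2 area (two interfaces per dimer, shared between its two subunits) and one S-L contact of each type, and the variable names are illustrative.

```python
# Areas in square angstroms, transcribed from Table 13.
ACCESSIBLE = {"S": 8578, "L": 19127}
BURIED_ON_L = {
    "L2 dimer interface (A-B)": 4784,
    "L2-L2 interface": 1448,
    "L-S interfaces": 1760 + 1109,   # B in S1-B plus CD in S1-CD
}
BURIED_ON_S_BY_L = 1912 + 1274       # S1 in S1-B plus S1 in S1-CD


def percent(part, whole):
    """Express `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumption: all areas are counted per L subunit (see the lead-in).
l_fractions = {name: percent(area, ACCESSIBLE["L"])
               for name, area in BURIED_ON_L.items()}
l_total = sum(l_fractions.values())
s_fraction = percent(BURIED_ON_S_BY_L, ACCESSIBLE["S"])

for name, value in l_fractions.items():
    print(f"{name}: {value:.1f}% of the L surface")
print(f"all interfaces: {l_total:.1f}% of the L surface")
print(f"L-S interfaces: {s_fraction:.1f}% of the S surface")
```

Under these assumptions the rounded values agree with the 25%, 8%, 15%, 48% and 37% given in the text.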

Table 8
Amino acid residues in the core of the N-terminal domain of the large subunit

I36    P44    A58    C99    V124
A38    A53    S61    V101   P141
F40    G54    G82    Y103   Y144
V42    V57    C84    I120

Table 9
Amino acid residues in the core in the interior of the α/β-barrel in the C-terminal domain of the large subunit

K201   G171   F199
M266   Y239   I264
H327   H292   H325
Q401   V377   V399


Figure 11. The hydrogen-bond network (indicated by hatched lines) from Glu158 via His325 and His292 to His327 in the interior of the α/β-barrel. All these residues are conserved in large subunits from L8S8 Rubisco molecules. A molecule of CABP is also shown. Glu158 is buried in a highly hydrophobic environment at the bottom of the barrel, while His327 is part of one of the phosphate-binding sites in the active site.

Table 10
Amino acid residues in the circular core between helices and strands in the α/β-barrel in the large subunit

P168   D202   H238   N277   I326   M387
L170   V206   L240   L280   S328   L390
C172   M212   A242   C284   T342   T391
I174   R217   M250   L289   F345   F394
Y185   F218   A254   L291   V346   S398
G186   C221   A257   I293   L349   L400
V189   A222   V262   F311   V374   F402
C192   L225   V265   A315   P376   T406
L193   A228   H267   L318   A378   A417
T200   K236   Y269   G323   I382   N420
                                   L424

Table 11
Amino acid residues in the core between the N-terminal and the C-terminal domain of the large subunit

L37    A132   D137   P151   A317
A39    L133   L138   H310   L320
I98    R134   I140   R312   S321
V113   L135   V145   V313   Q366
M116   E136   G150   K316

Table 12
Amino acid residues in the core of the small subunit

V30    L42    V90    I101
L33    V83    A97    F115
P40    V87    V99

Figure 12. An illustration of the subunit arrangement in the (L2)4(S4)2 Rubisco molecule. The circles represent dimers of L subunits and triangles represent S subunits. The 8 large subunits are labelled A to H, the 8 small subunits S1 to S8 with S1 to S4 at one end of the molecule and S5 to S8 at the other end. The L2 dimers are formed by large subunits AB, CD, EF and GH. The molecular 4-fold axis as well as the 2-fold axes in the molecule are indicated.


Table 13
Hydropathy of accessible surface areas and interfaces in spinach Rubisco

A. Accessible surface areas

Entity   Accessible surface   Non-polar   Polar      Charged
         area (Å²)            area (%)    area (%)   area (%)

S          8578               63.5        23.1       13.4
L        19,127               58.6        22.2       19.1
L2       28,678               56.7        21.0       22.3
L8      103,136               57.2        21.4       21.5
L8S8    120,463               58.1        22.2       19.6

B. Areas buried at subunit interfaces

Interface        Buried surface   Non-polar   Polar      Charged
                 area (Å²)        area (%)    area (%)   area (%)

S1-B     S1      1912             63.7        22.2       14.0
         B       1760             56.2        27.3       16.5
S1-CD    S1      1274             65.3        20.7       14.0
         CD      1109             58.2        17.0       24.7
A-B              4784             64.4        26.1        9.5
L2-L2            1448             52.5        17.4       30.2

Accessible surface areas were calculated using the algorithm of Lee & Richards (1971). The polar surface is defined as that for uncharged O, N and S atoms, the charged surface as that for charged O and N atoms and the non-polar surface as that for the remaining atoms. The difference in the size of the surfaces in S and L in contact with each other depends on the fact that when 2 atoms of unequal size are in contact with each other the smaller atom will have a larger area excluded from solvent. Subunit labels are defined in Fig. 12.
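The three surface classes defined in the footnote to Table 13 lend themselves to a simple per-atom classification. The following is a minimal sketch of that bookkeeping, not the authors' program: the set of atoms treated as charged (side-chain atoms of Asp, Glu, Lys and Arg, with histidine and chain termini left uncharged) is an assumption, and the per-atom areas in the example are made up.

```python
from collections import defaultdict

# Assumption: these PDB-style atom names are the charged O and N atoms.
# The paper does not list which atoms were treated as charged.
CHARGED_ATOMS = {
    ("ASP", "OD1"), ("ASP", "OD2"),
    ("GLU", "OE1"), ("GLU", "OE2"),
    ("LYS", "NZ"),
    ("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2"),
}


def surface_class(res_name, atom_name):
    """Classify one atom as 'charged', 'polar' or 'non-polar'."""
    if (res_name, atom_name) in CHARGED_ATOMS:
        return "charged"
    if atom_name[0] in "ONS":      # uncharged O, N and S atoms
        return "polar"
    return "non-polar"             # all remaining atoms


def hydropathy_percentages(atoms):
    """atoms: iterable of (res_name, atom_name, accessible_area)."""
    totals = defaultdict(float)
    for res_name, atom_name, area in atoms:
        totals[surface_class(res_name, atom_name)] += area
    grand_total = sum(totals.values())
    return {cls: 100.0 * area / grand_total for cls, area in totals.items()}


# Made-up per-atom areas in square angstroms, purely illustrative.
example = [("LYS", "NZ", 30.0), ("SER", "OG", 20.0), ("LEU", "CD1", 50.0)]
result = hydropathy_percentages(example)
print(result)
```

Summing the per-atom areas by class in this way would give columns of the kind shown in Table 13, provided the same charge assignments are used throughout.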

(i) Subunit interactions within one L2 dimer

The L2 dimer resembles an ellipsoid with approximate dimensions 50 Å × 72 Å × 105 Å (Fig. 13) and has an accessible surface area of around 29,000 Å². The dimer is formed by tight and extensive interactions between the two subunits, which are related by a 2-fold rotation axis. In two of the L2 dimers, CD and GH, this symmetry axis is a true crystallographic 2-fold axis along the [010] direction in the unit cell. In the other two dimers, AB and EF, the rotation axis is instead a local 2-fold axis, which is perpendicular to the [010] direction and inclined 1.8° to the [100] direction. We therefore have one complete L2 dimer and two L subunits from two different dimers, as well as one S4 cluster of small subunits in the asymmetric unit.

The interface area between two L subunits in the dimer is about 4800 Å² (Table 13B) and is separated into two main contact areas (Fig. 13). One such area is between the C-terminal domains of the two subunits, which build up the core of the dimer interface. In this contact area, which is around 900 Å², the surfaces of the subunits in contact with each other are symmetry-related across the dimer dyad. The second contact area, which is much larger, is formed by interaction of the C-terminal domain of one subunit with the N-terminal domain of the second subunit. Due to the 2-fold symmetry of the dimer, this type of contact area occurs twice in each dimer.

Figure 13. Cα tracing of the L2 dimer. One subunit is shown with the N-terminal domain in light blue and the C-terminal domain in blue. In the 2nd subunit, the N-terminal domain is shown in green and the C-terminal domain in yellow.

This interface is thus by far the most important in stabilizing the dimer. It is also of functional significance, since it involves the loops at the C-terminal end of the barrel and these loops supply most of the active-site residues.

We have defined interface residues as residues that have a lower accessibility in the oligomer under consideration compared to the isolated subunits. By this criterion, we find 136 residues from each subunit at the dimer interface. Of these, 44 are only marginally involved in subunit interactions, with a decrease in accessibility of less than 10% on forming the L2 dimer. Table 14 lists the remaining 92 residues involved in the dimer interface.
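The interface criterion described above amounts to a per-residue comparison of two accessibility values. The sketch below illustrates it under stated assumptions: accessibilities are taken as relative values in percent, the residue labels X1 to X4 and their numbers are hypothetical, and the function name is ours rather than anything from the paper.

```python
def interface_residues(acc_isolated, acc_oligomer, threshold=10.0):
    """Return (all interface residues, residues above the listing threshold).

    Both arguments map residue label -> accessibility in percent. A residue
    counts as an interface residue if its accessibility is lower in the
    oligomer than in the isolated subunit.
    """
    interface = {}
    for residue, isolated in acc_isolated.items():
        delta = isolated - acc_oligomer[residue]
        if delta > 0.0:
            interface[residue] = delta
    listed = {r: d for r, d in interface.items() if d > threshold}
    return interface, listed


# Hypothetical accessibilities (%), for illustration only.
isolated = {"X1": 45.0, "X2": 40.0, "X3": 12.0, "X4": 3.0}
oligomer = {"X1": 6.0, "X2": 8.0, "X3": 7.0, "X4": 3.0}

interface, listed = interface_residues(isolated, oligomer)
print(sorted(interface))   # every residue that loses accessibility
print(sorted(listed))      # only those with a change above 10%

# Counts quoted in the text: 136 interface residues, 44 of them marginal.
remaining = 136 - 44
print(remaining)
```

With the counts from the text, the number of residues above the threshold comes out as the 92 listed in Table 14.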

The hydropathy of the dimer interface (Janin et al., 1988) is quite normal (Table 13B). However, hydrophobic side-chains contribute somewhat less and charged side-chains somewhat more to the interface area than is usually observed in subunit interactions. There are 50 hydrogen bonds at the dimer interface, eight in the core region between the two C-terminal domains and 21 at each N-domain


Table 14
Amino acid residues found at the intra-dimer interface

A. Residues at the interface between the C-terminal domains of the 2 subunits in the L2 dimer

Residue   Δacc     Residue   Δacc     Residue   Δacc

Thr246    39.1     Thr275    60.3     Asn306    11.8
Cys247    41.3     Ala276    15.5     His307    33.6
Glu248    15.4     Thr278    12.8     Gly308    11.3
Gly272    32.0     Thr279    36.1     Met309    10.5
Gly273    21.4     Ile301    22.1

B. Residues at the interface between the C-terminal domain of one subunit and the N-terminal domain of the second subunit in the L2 dimer

Residue   Δacc     Residue   Δacc     Residue   Δacc

Phe13     16.8     Thr118    60.2     Ala244    2@6
Ala15     45.6     Ser119    19.5     Gly245    86.9
Gly16     21.9     Val121    17.1     Thr271    52.5
Val17     23.4     Gly122    49.3     Ala296    17.3
Gln45     24.9     Asn123    48.0     Met297    41.6
Val48     10.2     Phe125    17.5     Ala299    36.2
Glu60     16.0     Gly126    74.8     Val300    71.1
Ser62     28.1     Phe127    19.9     Arg303    39.2
Thr63     30.7     Lys128    82.1     Gln304    36.8
Gly64     14.2     Ala129    15.0     Lys334    49.6
Thr65     31.5     Arg131    39.6     Leu335    40.9
Trp66     65.1     Lys175    21.2     Glu336    14.1
Thr67     19.3     Pro176    52.4     Gly381    25.7
Val69     31.5     Lys177    41.2     Gly404    52.5
Trp70     34.2     Leu178    36.1     Leu407    25.3
Thr71     40.4     Gly179    56.5     Gly408    47.1
Leu74     33.4     Leu180    22.0     Pro410    22.5
Thr75     28.8     Asn184    18.6     Gly412    19.0
Tyr80     17.8     Glu204    17.1     Asn413    13.7
Asp106    34.9     Asn205    31.0     Val461    17.3
Leu107    21.7     Asn207    20.8     Trp462    14.6
Glu109    51.0     Ser208    17.9     Ile465    19.7
Glu110    27.0     Gln209    20.0     Phe467    14.3
Ser112    20.2     Pro210    39.8     Phe469    45.9
Thr114    36.8     Phe211    20.1     Pro470    38.5
Asn115    28.1     Arg213    11.2     Met472    27.9

Δacc, change in accessibility on forming the L2 dimer from isolated L subunits. Residues with a change in accessibility > 10% are listed.

to C-domain interface (Table 15). Twenty-six hydrogen bonds are formed from charged or polar side-chains to main-chain atoms. Twelve hydrogen bonds, or almost one quarter, involve threonine side-chains.
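The hydrogen-bond counts quoted for the dimer interface can be checked with a few lines of arithmetic. This is only a restatement of the numbers in the text; the variable names are illustrative.

```python
# Counts quoted in the text for one L2 dimer.
core_bonds = 8               # between the two C-terminal domains
bonds_per_nc_interface = 21  # at each N-domain to C-domain interface
nc_interfaces_per_dimer = 2  # this contact occurs twice per dimer

total_bonds = core_bonds + nc_interfaces_per_dimer * bonds_per_nc_interface
threonine_bonds = 12
threonine_fraction = threonine_bonds / total_bonds

print(total_bonds)                    # total hydrogen bonds at the interface
print(f"{threonine_fraction:.0%}")    # share involving threonine side-chains
```

The total reproduces the 50 hydrogen bonds stated in the text, and 12 of 50 is 24%, consistent with "almost one quarter".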

Charged groups are almost absent in the interface area between the two C-terminal domains. Instead, a wealth of polar groups is found here; the fraction of the interface surface contributed by polar atoms is 34%. The residues involved are mainly from the C-terminal loops 3, 4 and 5, and from helices α3 and α4 in the α/β-barrel. G273 in C-terminal loop 4 of one subunit is rather close to the symmetry-related residue in the second subunit in the dimer and is, not surprisingly, conserved in all known sequences of the large subunit.

In the interior of the molecule, at the surface of the central solvent channel, the electron density strongly indicates that there is a disulphide bond in our crystals between C247 in one subunit and the symmetry-related C247 in the second subunit of the dimer (Fig. 14). Although this disulphide bond will enhance the stability of the dimer, it does not seem to be of any functional significance, since substituting the equivalent residue in Rubisco from Anacystis nidulans by Ala has no effect on activity (S. Gutteridge, personal communication). There are also other pairs of cysteinyl residues in the large subunit that are close to each other (C84A-C99A (Sγ-Sγ distance 4.6 Å) and C172A-C192A (Sγ-Sγ distance 4.0 Å)), but these do not form disulphide bonds under the crystallization conditions used in this study.
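Whether two cysteines can be disulphide-bonded is largely a matter of the Sγ-Sγ distance. The sketch below is a rough geometric screen, not the criterion used by the authors (who rely on the electron density): the reference bond length of about 2.05 Å is the usual textbook value, the tolerance is an arbitrary assumption, and the coordinates in the example are invented.

```python
import math

TYPICAL_SS_BOND = 2.05  # angstroms; textbook S-S bond length


def sg_distance(sg_a, sg_b):
    """Distance between two S-gamma positions given as (x, y, z) tuples."""
    return math.dist(sg_a, sg_b)


def could_be_disulphide(distance, tolerance=0.3):
    """True if the distance is within `tolerance` of a typical S-S bond.

    The tolerance is an assumption chosen for illustration.
    """
    return abs(distance - TYPICAL_SS_BOND) <= tolerance


# Distances quoted in the text for the two non-bonded cysteine pairs.
reported = {"C84A-C99A": 4.6, "C172A-C192A": 4.0}
for pair, d in reported.items():
    print(pair, could_be_disulphide(d))

# Invented coordinates, only to show the distance helper.
d = sg_distance((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
print(d, could_be_disulphide(d))
```

Both reported pairs are roughly twice the length of an S-S bond, which is consistent with the statement that they do not form disulphide bonds in these crystals.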

There are no ionic interactions between the two C-terminal domains. In contrast, we find three salt-bridges between residues in the N-terminal domain of one subunit and the C-terminal domain of the second L subunit in the dimer. There are thus six inter-subunit salt-links per dimer, involving E60, E109 and E110 from each N-domain and R213, R253 and K334 from each C-domain. The strictly conserved residue E60A interacts with K334B in the active site loop number 6 of the α/β-barrel domain of subunit B. This is the only completely buried ion-pair interaction between the two L subunits in the L2 dimer. E109A and E110A are in a different loop region in the N-terminal domain. These two residues are strictly conserved in L8S8 species, whereas in the R. rubrum subunit the corresponding residues are D91 and K98 and flank an insertion of six residues. Both E109 and E110 are located on the surface of the central solvent channel where E109A forms a salt-link to R253B and E110A forms a second salt-link to R213B.

Whereas the residues that form the interactions between the two C-terminal domains in general are not conserved, we find several highly conserved peptide regions at the interface between the N-terminal domain of one subunit and the C-terminal domain of the second subunit in the dimer. The loop region between helix αB and strand βC in the N-terminal domain contains a number of highly conserved residues (Fig. 15). This loop region interacts extensively with the C-terminal loops 1 and 8 in the second subunit in the dimer.

Loop one of the α/β-barrel domain contains a strictly conserved hexapeptide in which the two active site lysine residues K175 and K177 are present. These two lysine residues are involved in inter-subunit hydrogen bonds to residues in the loop between helix αB and strand βC in the N-terminal domain. There is one hydrogen bond between the side-chain of T71A and the carbonyl oxygen atom of K175B, and a second main-chain to main-chain hydrogen bond between T63A and K177B. K175B in addition forms hydrophobic interactions with the side-chains of T65A and V69A, whereas K177B interacts extensively with main-chain atoms of residues 60 to 64 in the A subunit. All the residues involved in these interactions are strictly conserved in all known sequences of the L subunit. There are


Table 15
Hydrogen bonds (d < 3.3 Å) between the 2 L subunits in the dimer

A. Hydrogen bonds between the 2 C-terminal domains

Cys247A N   - Thr279B OG1       Thr279A OG1 - Cys247B N
Glu248A OE2 - Thr279B OG1       Thr279A OG1 - Glu248B OE2
Ala299A O   - His307B NE2       His307A NE2 - Ala299B O
Gln304A OE1 - His307B ND1       His307A ND1 - Gln304B OE1

B. Hydrogen bonds between the N-terminal domain in one subunit and the C-terminal domain in the second subunit in the L2 dimer

Gln45A  NE2 - Pro470B OD        Lys175A O   - Thr71B  OG1
Thr63A  O   - Lys177B N         Lys177A N   - Thr63B  O
Thr65A  OG1 - Lys334B NZ        Glu204A OE1 - Asn123B ND2
Trp70A  N   - Leu407B O         Asn205A O   - Asn115B ND2
Trp70A  NE1 - Asn413B OD1       Asn207A OD1 - Asn115B ND2
Thr71A  OG1 - Lys175B O         Gln209A NE2 - Leu107B O
Leu107A O   - Gln209B NE2       Thr271A O   - Thr114B OG1
Thr114A OG1 - Thr271B O         Thr271A OG1 - Thr118B OG1
Asn115A ND2 - Asn205B O         Met297A N   - Gly122B O
Asn115A ND2 - Asn207B OD1       Arg303A NH1 - Leu130B O
Thr118A OG1 - Thr271B OG1       Arg303A NH2 - Phe125B O
Gly122A O   - Met297B N         Arg303A NH2 - Phe127B O
Asn123A ND2 - Glu204B OE1       Val331A O   - Lys128B NZ
Phe125A O   - Arg303B NH2       Gly333A O   - Lys128B NZ
Gly126A O   - Glu336B N         Lys334A NZ  - Thr65B  OG1
Phe127A O   - Arg303B NH2       Lys334A O   - Lys128B N
Lys128A N   - Lys334B O         Glu336A N   - Gly126B O
Lys128A NZ  - Val331B O         Leu407A O   - Trp70B  N
Lys128A NZ  - Gly333B O         Asn413A OD1 - Trp70B  NE1
Lys128A NZ  - Phe467B O         Phe467A O   - Lys128B NZ
Leu130A O   - Arg303B NH1       Pro470A O   - Gln45B  NE2

also several additional inter-subunit interactions between residues in these two loops, but these are not conserved.
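Because the two subunits of the dimer are related by a 2-fold axis, every hydrogen bond in Table 15 should appear twice, once from each subunit. The sketch below checks this for the eight bonds of part A; the tuple layout and helper names are illustrative assumptions, not part of the original analysis.

```python
# Part A of Table 15: (residue+chain, atom, residue+chain, atom).
BONDS = [
    ("Cys247A", "N",   "Thr279B", "OG1"),
    ("Glu248A", "OE2", "Thr279B", "OG1"),
    ("Ala299A", "O",   "His307B", "NE2"),
    ("Gln304A", "OE1", "His307B", "ND1"),
    ("Thr279A", "OG1", "Cys247B", "N"),
    ("Thr279A", "OG1", "Glu248B", "OE2"),
    ("His307A", "NE2", "Ala299B", "O"),
    ("His307A", "ND1", "Gln304B", "OE1"),
]


def swap_chain(residue):
    """Exchange the chain label A <-> B at the end of a residue label."""
    return residue[:-1] + ("B" if residue[-1] == "A" else "A")


def mirror(bond):
    """Apply the dimer 2-fold: swap chains and keep the A partner first."""
    res1, atom1, res2, atom2 = bond
    return (swap_chain(res2), atom2, swap_chain(res1), atom1)


is_symmetric = {mirror(b) for b in BONDS} == set(BONDS)
print(len(BONDS), is_symmetric)
```

The same test can be applied to part B, where each N-domain to C-domain bond should have a partner with the chain labels exchanged.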

In loop 8 of the barrel domain there is an α-helix of highly conserved residues that forms part of one of the phosphate-binding sites in the active site. The first residue in this helix, G404, is in van der Waals' contact with T65, W66, T67 and V69 in the N-terminal domain of the second subunit. W66A also interacts with the second residue in the helix, G405B.

The loop between helix αC and strand βE in the N-terminal domain is also involved in inter-subunit interactions in the active site region. The strictly conserved residue N123A interacts with E204B and H294B at the centre of the active site. There are also several hydrogen bonds between this loop and the highly conserved loop 6 in the C-terminal domain of the second subunit, amongst these a main-chain to main-chain hydrogen bond between K128A and K334B (Table 15).

(ii) Interactions between the L2 dimers in the (L2)4 core

From the 422 symmetry of the L8S8 molecules it follows that, in addition to the 2-fold intra-dimer axes, there are local inter-dimer 2-fold axes relating the dimers (Fig. 12). These axes lie in the plane normal to the local 4-fold axis and pass through the molecular centre at an angle of 45° to the dimer 2-fold axes. Therefore, the contact area between two L2 dimers is also 2-fold symmetric. This contact area covers around 1400 Å², so that in the (L2)4 core approximately 10% of the total dimer area is buried in dimer-dimer interfaces. Most of the contacts involve residues from the C-terminal domains, although a few interactions at the centre of the interface, around the dimer-dimer dyad, are formed between residues from the N-terminal domains. Table 16 lists the residues in the dimer AB that are in contact with the CD dimer at the dimer-dimer interface.
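The figure of approximately 10% can be reproduced from Table 13, on the assumption that the L2-L2 entry is the area buried on one dimer at one interface and that each dimer in the (L2)4 ring has two such interfaces. The names below are illustrative.

```python
# Values in square angstroms from Table 13.
L2_ACCESSIBLE_AREA = 28678        # isolated L2 dimer
BURIED_PER_INTERFACE = 1448       # L2-L2 entry, assumed to be per dimer
NEIGHBOURS_PER_DIMER = 2          # assumption: ring of four dimers

buried_fraction = (100.0 * NEIGHBOURS_PER_DIMER * BURIED_PER_INTERFACE
                   / L2_ACCESSIBLE_AREA)
print(f"{buried_fraction:.1f}% of the dimer surface is buried")
```

Under these assumptions the fraction rounds to the 10% quoted in the text.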

This interface is rich in charged groups, with as much as 61% of the interface area being contributed by charged residues, whereas hydrophobic residues contribute only 15%. Such a high content of charged groups at a subunit interface is very unusual and is more like that found for the solvent-accessible surface of a protein molecule. In total, we find 28 charged side-chains at the dimer-dimer interface, with each dimer contributing eight positively charged and six negatively charged residues. Seven of the positive and five of the negative charges are conserved in all known eukaryotic large


Figure 14. Final 2Fo-Fc electron density map showing the disulphide bond between Cys247 in subunit A and Cys247 in subunit B. The 2 cysteinyl residues are symmetry-related across the 2-fold dimer axis and are at the dimer interface close to the surface of the central solvent channel through the molecule.

subunit sequences. Most of the charged residues are found either at or close to the surface of the narrow solvent channel between two L2 dimers and one S subunit in the L8S8 molecule.

There are eight clearly defined inter-dimer salt-links per dimer-dimer interface (Table 17, Fig. 16). All these salt-links are in the interior of the molecule at the surface of the central solvent channel. With the exception of E110 and K146, all of the charged residues involved are found in the C-terminal domain. The salt-bridge between these two residues is buried rather deep in the crevice between the two dimers. E110 is also engaged in an intra-dimer salt-link to R213 in the second subunit of the same dimer.
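The eight salt-links of Table 17 can be checked for two properties: each pairs an acidic with a basic residue, and the list is closed under the dimer-dimer dyad. In the sketch below the chain mapping A↔D, B↔C is an assumption inferred from the entries of Table 17, not something stated explicitly in the text, and the helper names are ours.

```python
# Ion pairs transcribed from Table 17 (one-letter code, number, chain).
SALT_LINKS = [
    "E110B-K146C", "K161B-D216D", "E259A-R258C", "K146B-E110C",
    "K252A-D286C", "R258B-E259D", "D216A-K161C", "D286B-K252D",
]

# Assumption: the dimer-dimer dyad exchanges these chains.
DYAD = {"A": "D", "D": "A", "B": "C", "C": "B"}
ACIDIC, BASIC = set("DE"), set("KR")


def parse(link):
    """Return the link as an unordered pair of residue labels."""
    return frozenset(link.split("-"))


def mirror(link):
    """Apply the dyad by relabelling the chain of both residues."""
    return frozenset(res[:-1] + DYAD[res[-1]] for res in link.split("-"))


def is_opposite_charge(link):
    first, second = (res[0] for res in link.split("-"))
    return {first in ACIDIC, second in ACIDIC} == {True, False} and \
           {first in BASIC, second in BASIC} == {True, False}


links = {parse(link) for link in SALT_LINKS}
closed_under_dyad = {mirror(link) for link in SALT_LINKS} == links
all_opposite = all(is_opposite_charge(link) for link in SALT_LINKS)
print(len(links), closed_under_dyad, all_opposite)
```

With this mapping the eight links fall into four symmetry-related pairs, in line with the 2-fold symmetry of the contact area.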

Charged residues at the dimer-dimer interface that are not engaged in inter-dimer ionic interactions either take part in intra-dimer or intra-subunit salt-links, or form hydrogen bonds to polar groups at the interface. Residue R215 provides one example of such an interaction. This residue in subunit A interacts with the carboxyl end of helix α4 (Hol et al., 1978) in the C subunit, making hydrogen bonds to the carbonyl groups of D286C and N287C (Fig. 17(a)).

There is only one major hydrophobic interaction between neighbouring dimers. This interaction occurs across the dimer-dimer dyad at the central part of the interface, near the surface of the molecule. Here, residues L105, V142, A143 and V369 are close together, forming a small hydrophobic patch on the surface of each subunit (Fig. 17(b)).

(iii) Interactions between the small subunits

The contact area between two small subunits covers about 350 Å² and is almost entirely hydrophobic. The first seven residues from the S1 small subunit pack into a crevice formed between the B large subunit and the neighbouring small subunit S4. A small hydrophobic cluster, residues V3 and I6 in S1 and F44, W70 and Y94 in S4, is present between the two S subunits.

A second interaction area is present between the hairpin loops of two adjacent small subunits that approach each other within the central solvent channel of the active enzyme. The side-chain of D56 in S1 is in the vicinity of K57 in S4, although in our present model the side-chain of K57 is flexible and does not form a salt-link to the aspartic acid residue. The two hairpin loops, together with the N-terminal arm in S4 as well as strand βA in S1, define a bowl-shaped surface (Fig. 18) into which the tip of the elongated L2 dimer binds.


Table 16
Amino acid residues at the dimer-dimer interface

Residue    Δacc     Residue    Δacc     Residue    Δacc

Ser181A    25.3     Thr34B     12.9     Asp160B    40.0
Lys183A    39.0     Leu105B    26.6     Lys161B    33.7
Pro210A    28.3     Asp106B    16.4     Asn163B    10.6
Arg213A    22.4     Glu110B    14.9     Arg258B    23.1
Arg215A    36.7     Val142B    14.7     Arg285B    25.6
Asp216A    44.0     Ala143B    41.5     Asp286B    41.0
Leu219A    13.7     Lys146B    60.0     Gly288B    22.8
Phe220A    20.9     Thr147B    12.8     Val369B    21.9
Lys252A    17.5     Gln156B    21.1     Ser370B    56.6
Glu259A    17.0     Val157B    17.5

Δacc, change in accessibility on forming the (L2)4 core from isolated dimers. Residues with a change in accessibility > 10% are listed.

(iv) Interactions between large and small subunits

The small subunits are located in the crevices formed between the ends of adjacent L2 dimers. Each subunit is in contact with three different large subunits from two different L2 dimers as well as with two neighbouring small subunits. As is shown in Figure 19, the S1 small subunit, situated between the AB and the CD dimers, makes contact with L subunits B, C and D. The N-terminal arm of S1 wraps around the B subunit with the first seven residues packed in a crevice formed between the large subunit and the S4 small subunit, which is on the other side of the AB dimer. In addition to the contacts with the S1 and S4 small subunits, the B large subunit of this dimer also forms a small contact area with the S5 small subunit at the other end of the molecule (Fig. 19). Thus, just as each small subunit interacts with three large subunits, each large subunit is in contact with three small subunits.

The total area buried in the S-L interfaces covers about 3000 Å² for each small subunit, with the S1-B interface contributing 1800 Å² and the S1-D interface contributing 1200 Å². The interface areas between S and L subunits show some interesting general features. Although the contact area of the small subunit shows the normal distribution between non-polar, polar and charged atoms (Janin et al., 1988), the corresponding areas from the large subunits are enriched in charged and polar atoms (Table 13B). This asymmetry is also reflected in the relative contribution of different amino acid residue types to the contact areas: hydrophobic residues contribute 24% to the small subunit contact area

Table 17
Ionic interactions at the dimer-dimer interface

E110B-K146C    K161B-D216D    E259A-R258C
K146B-E110C    K252A-D286C    R258B-E259D
D216A-K161C    D286B-K252D

but only 15% to the large subunit contact area. As much as 36% of the L subunit contact area arises from charged residues, whereas in the small subunit they contribute 25%. Thus, in addition to having quite different hydropathy profiles, the contribution of charged residues to the contact areas of both subunits is larger than is found for an average oligomeric protein (Janin et al., 1988). The different hydropathy profiles may reflect the fact that the L subunit is coded by the chloroplast genome, whereas the S subunit is nucleus-encoded.

In the following, we limit the description of the subunit interactions between S and L subunits to consider only the interactions of one small subunit, S1, with the three L subunits B, C and D, which are in contact with S1. All other S-L interactions are identical in our present model, which assumes strict 422 symmetry of the L8S8 molecule.
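Strict 422 symmetry means that one small subunit and its contacts are copied to eight equivalent positions. The sketch below generates the eight rotations of point group 422 and applies them to one arbitrary point; the choice of z as the 4-fold axis, the placement of a 2-fold along x and the test coordinates are all assumptions made for illustration.

```python
import math


def rot_z(deg):
    """Rotation by `deg` degrees about z, taken here as the 4-fold axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def two_fold_in_plane(deg):
    """180-degree rotation about an axis in the xy-plane at `deg` from x."""
    c, s = math.cos(math.radians(2 * deg)), math.sin(math.radians(2 * deg))
    return ((c, s, 0.0), (s, -c, 0.0), (0.0, 0.0, -1.0))


def apply(matrix, point):
    """Multiply a 3x3 matrix by a 3-vector and round off float noise."""
    return tuple(round(sum(row[j] * point[j] for j in range(3)), 6)
                 for row in matrix)


# Four rotations about the 4-fold axis plus four in-plane 2-folds,
# which alternate at 45-degree intervals.
OPERATIONS = ([rot_z(a) for a in (0, 90, 180, 270)]
              + [two_fold_in_plane(a) for a in (0, 45, 90, 135)])

reference_position = (30.0, 10.0, 40.0)   # arbitrary point, not a real atom
copies = {apply(op, reference_position) for op in OPERATIONS}
upper = [p for p in copies if p[2] > 0]
lower = [p for p in copies if p[2] < 0]
print(len(copies), len(upper), len(lower))
```

A general position yields eight copies, four at each end of the 4-fold axis, which mirrors the arrangement of S1 to S4 at one end of the molecule and S5 to S8 at the other.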

(v) Interactions between S1 and large subunit B

The small subunit S1 packs against the bottom of the α/β-barrel of the large subunit B of the AB dimer. Residues from S1 that are involved in the contact area are mainly from the N-terminal arm and the hairpin loop between strands βA and βB, but also from helix αA and strand βD (Table 18). These parts of the small subunit make contact with residues in the C-terminal domain of the B subunit, mainly located in helices αE, α2 and α8 as well as in loop regions on the N-terminal side of the α/β-barrel. Helix α8 of the α/β-barrel interacts extensively throughout its whole length with the N-terminal arm of the small subunit. The C terminus of this helix also interacts with a second small subunit, S4. These interactions may be of functional significance, since the position of helix α8 as well as loop 8 in the active site is different with respect to the rest of the α/β-barrel in the spinach Rubisco structure as compared with L2 Rubisco from R. rubrum. These structural differences extend to one of the phosphate-binding sites in the active site (Schneider et al., 1989).

Although most of the interactions at the S1-B

(Curtis & Haselkorn, 1983), (9) Anacystis nidulans (Shinozaki et al., 1983), (10) Alcaligenes eutrophus (Andersen & Caton, 1987), (11) Chromatium vinosum (Viale et al., 1989) and (12) R. rubrum (Nargang et al., 1984). The spinach and R. rubrum sequences have been aligned using the X-ray structures (Schneider et al., 1989). Sequence numbers given in the top line refer to the spinach sequence. In the second line we identify residues that are buried in the interior (c) of the large subunit (less than 5% accessible), at the intra-dimer interface (D), at the dimer-dimer interface (d) and at the S-L interfaces (i). When a residue is involved in more than 1 interface it has been assigned to the interface to which it makes the largest contribution. The secondary structural elements are indicated in line three (a = α-helix, b = β-strand). Seven regions where at least 5 consecutive residues are identical in all L8S8 L chains are boxed. A dot indicates identity with the spinach sequence. A dash indicates a deletion. Active site residues (Table 23) are labelled by an asterisk.


Figure 16. There are 8 clearly defined ion-pair interactions across each L2 dimer-dimer interface. The Cα tracings of the L subunits are shown in blue and the residues involved in these interactions in red.

interface consist of non-bonded apolar contacts, there are several polar interactions here, in particular 13 hydrogen bonds (Table 19). Most of these hydrogen bonds involve residues from the N-terminal arm of S1 and, as is often seen at subunit interfaces, a majority of the hydrogen bonds involves charged side-chains. Main-chain atoms in the small subunit form hydrogen bonds to two charged side-chains from helix α8, R421B and E425B (Table 19). We also find one main-chain to main-chain hydrogen bond between the peptide nitrogen of K11 of S1 and the carbonyl group of T232B, which is the last residue in helix α2. The small subunit on the other side of large subunit B, S4, interacts extensively with this helix. Thus, helices α2 and α8 of the large subunit are both in contact with two small subunits.

There are three ion-pair interactions at the S1-B interface. One of these interactions is formed between R108 from the small subunit and D397B from the large subunit. These two residues are on the outer surface of the molecule, close to the entrance to the narrow solvent channel. The highly conserved residue E13 from the N-terminal arm of the small subunit participates in both of the remaining two electrostatic interactions at the S1-B interface. This residue is surrounded by a number of charged groups from the large subunit and is quite buried in the subunit interface, where it is involved in an intricate network of electrostatic interactions (Fig. 20). There are four positively charged (K164, R167, K236, R421) and three negatively charged (D198, E234, E425) residues from the same large subunit in the vicinity of E13. All the charges from the large subunit are conserved in all known sequences of L8S8 Rubisco. The unbalanced positive charge in the large subunit compensates for the negative charge introduced in this area by interaction with the small subunit.
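The charge bookkeeping in this paragraph can be written out as a short tally. This is only a sketch that assumes formal side-chain charges of +1 for Lys/Arg and -1 for Asp/Glu; actual protonation states in the buried interface are not addressed here.

```python
# Formal side-chain charges (a simplifying assumption, see lead-in).
FORMAL_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}


def net_charge(residues):
    """Sum formal charges for one-letter residue labels such as 'K164'."""
    return sum(FORMAL_CHARGE[label[0]] for label in residues)


# Charged large-subunit residues listed in the text around E13.
large_subunit_pocket = ["K164", "R167", "K236", "R421", "D198", "E234", "E425"]
small_subunit_guest = ["E13"]

pocket_charge = net_charge(large_subunit_pocket)
total_charge = pocket_charge + net_charge(small_subunit_guest)
```

Under this assumption the large-subunit pocket carries a net charge of +1, which is cancelled by the single negative charge of E13.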

There are two major hydrophobic contact areas between helix α8 in the α/β-barrel and the N-terminal arm of the small subunit. Residues W4 and Y17 in S1 interact with a small hydrophobic patch formed by residues on the outside of helix α8 in the B subunit. The residues involved are A414, P415, V418, V422 and the hydrophobic part of the side-chain of R421. The second apolar interaction between S1 and helix α8 involves residues L15 and L18 in the small subunit. These two highly conserved residues pack against the backbone atoms of residues E425, A426 and Q429 in a shallow hydrophobic pocket between the side-chains of


Figure 17. Examples of interactions between the AB and CD dimers. (a) Arg215 from subunit A interacts with the carboxyl end of helix α4 in the α/β-barrel of the C subunit. (b) The hydrophobic contact area across the dimer-dimer dyad.

V422B, E425B, A426, V428 and Q429B. The side-chain of W451 in the B subunit is also in the vicinity.

(vi) Interactions between S1 and the CD dimer of large subunits

While the S1 small subunit packs against the bottom of the α/β-barrel in the B subunit, the interactions with subunit D largely involve residues from helices α1, α2 and α3 from one side of the barrel. There are also interactions with loop 8 at the C-terminal end of the α/β-barrel in the D subunit as well as with residues from the loop between βB and βC in the N-terminal domain of the C subunit. Most of the residues from the small subunit that form contacts to the CD dimer are within the hairpin loop, strand βB and the loop between βC and βD (Table 20).

There is approximately one hydrogen bond per 150 Å² interface area at the S1-CD interface (Table 21). All of the eight hydrogen bonds at this interface are formed between S1 and the D subunit. Five of these hydrogen bonds involve charged donor or acceptor groups. One such hydrogen bond


Figure 18. Surface of the cluster of 4 small subunits that binds at each end of the L8S8 molecule viewed along the 4-fold axis from the inside of the molecule. Note the concave surface between adjacent small subunits where the tips of elongated L2 dimers bind.

is formed between E191D, which is conserved in all L8S8 Rubisco molecules, and the main-chain nitrogen atom of M69 in S1.

We find three salt-bridges at the interface

Figure 19. A drawing showing that each small subunit (triangles) interacts with 3 different large subunits (half-ellipses) and each large subunit interacts with 3 small subunits in the L8S8 molecule.

between S1 and the CD dimer. All three charged residues from the large subunit involved in these interactions are from the C-terminal domain of the D subunit. Residue E43 from the first β-strand in the small subunit is conserved in all known sequences of the small subunit and forms a salt-link to R187D. R187 is part of a region of conserved residues in helix α1 of the α/β-barrel. Several additional residues in this conserved region participate in interactions with the small subunit (Table 20, Fig. 15). The N terminus of helix α1 packs into a crevice formed between the hairpin loop and the loop between strands βC and βD in the small subunit (Fig. 21). The two strictly conserved residues E43S and R100S, together with Q111S, define the floor of this crevice. These three residues, together with R187D, K183D and N163B from the D and B large subunits, form an ion-pair and hydrogen-bond network in the interface region between the AB and the CD dimers.

The two remaining ion-pair interactions at the S1-CD interface involve residues from helix α2 in the D subunit. There is one salt-link between E45S1 and K227D and a second, conserved salt-link between R65 in S1 and E223D.


Table 18
Amino acid residues at the interface between the small subunit S1 and the large subunit B

Residue    Δacc    Residue    Δacc

Met1S      35.5    Gln156B    34.8
Gln2S      43.4    Lys161B    27.1
Trp4S      49.0    Asn163B    45.1
Ile6S      32.4    Tyr165B    49.3
Lys10S     25.8    Gly166B    43.5
Lys11S     14.3    Arg194B    34.8
Glu13S     63.2    Gly195B    12.8
Thr14S     55.0    Gly196B    12.9
Leu15S     20.7    Tyr226B    10.8
Tyr17S     62.4    Gln229B    15.4
Leu18S     48.6    Ala230B    14.9
Pro19S     31.2    Glu231B    33.7
Gln25S     15.8    Thr232B    30.9
Ala28S     28.4    Gly233B    65.7
Glu29S     24.2    Glu234B    31.3
Tyr32S     32.9    Ile235B    32.1
Val51S     29.4    Arg258B    15.2
Arg53S     18.2    Gly261B    66.1
Ser58S     17.3    Pro263B    20.2
Pro59S     88.4    Gly288B    23.7
Gly60S     48.6    Asp397B    15.1
Tyr62S     38.1    Trp411B    33.4
Arg65S     30.9    Pro415B    15.6
Arg108S    25.8    Val418B    26.2
Gln109S    14.6    Arg421B    13.2
Val110S    35.3    Val422B    11.8
Gln111S    28.7    Glu425B    37.2
Cys112S    18.0    Val428B    11.5
Ser114S    21.3    Gln429B    47.2
                   Asn432B    53.8
                   Glu433B    30.0
                   Trp451B    24.3
                   Pro453B    12.1
                   Glu454B    17.4

Δacc, change in accessibility on forming a tentative S1-AB complex from S1 and AB. Residues with a change in accessibility > 10% are listed.
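The selection rule in the table footnote (list a residue when its accessibility changes by more than 10% on complex formation) can be sketched as follows. The input values and the placeholder residue name are hypothetical, and the way accessibility is normalised to a percentage is an assumption, since that detail is not given in this section.

```python
def delta_accessibility(free, complexed):
    """Loss of relative accessibility (percentage points) per residue."""
    return {res: free[res] - complexed[res] for res in free}


def interface_residues(free, complexed, threshold=10.0):
    """Keep residues whose accessibility drops by more than threshold."""
    delta = delta_accessibility(free, complexed)
    return {res: d for res, d in delta.items() if d > threshold}


# Hypothetical relative accessibilities (%) before and after docking.
free = {"Glu13S": 70.0, "Leu15S": 25.0, "ResX": 40.0}
complexed = {"Glu13S": 6.8, "Leu15S": 4.3, "ResX": 35.0}

selected = interface_residues(free, complexed)
```

Here the placeholder residue loses only 5 percentage points and is therefore left out of the interface list.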

In addition to the hydrophobic core between the two helices and the β-sheet in the small subunit, there are several hydrophobic clusters on the other side of the sheet. One such cluster forms a patch on

Table 19
Hydrogen bonds (d < 3.3 Å) between the small subunit S1 and the large subunit B

Trp4S     O      -    Arg194B    NH1
Pro5S     O      -    Arg194B    NH1
Lys10S    NZ     -    Ala230B    O
          NZ     -    Thr232B    O
Lys11S    N      -    Thr232B    O
Phe12S    O      -    Glu234B    OE1
Glu13S    O      -    Arg421B    NH2
Thr14S    OG1    -    Tyr165B    O
Leu15S    N      -    Glu425B    OE2
Tyr17S    N      -    Glu425B    OE1
Arg53S    NH1    -    Gly261B    O
          NH2    -    Tyr226B    OH
Arg65S    NH1    -    Lys161B    O

the surface of the small subunit and is made from the highly conserved residues I39, C41, M69, L72, P73, F75, I102, and F104. The majority of these residues are exposed in the isolated small subunit and are involved in subunit interactions with the large subunits C and D in the L8S8 molecule. Three of the side-chains (P73, F75 and F104) are strictly conserved in all known small subunit sequences. These residues interact with a short 3(10) helix in the C subunit comprising residues 70 to 74 (Fig. 22) as well as with G412D, which is the last residue before helix α8 in the D subunit.

(vii) Interactions between large subunit B and the hairpin loops of S1 and S4

Since residues 52 to 63 in the hairpin loop are deleted in the small subunits of cyanobacterial Rubisco, it is of some interest to examine the contribution of these residues to subunit interactions in higher plant L8S8 molecules. This part of the hairpin loop forms a quite autonomous structure, with a large number of stabilizing hydrogen bonds between side-chain and main-chain atoms within the loop (Table 7). The hairpin loops of S1 and S4 are quite

Figure 20. One important interaction between small subunit S1 and large subunit B. Glu13 in the small subunit (S) packs in a highly charged pocket on the surface of the large subunit (L).


Table 20
Amino acid residues at the interface between the small subunit S1 and the CD dimer

Residue    Δacc    Residue    Δacc

Glu43S     10.6    Phe13C     22.0
His55S     36.2    Trp70C     22.0
Asp56S     32.0    Gly73C     52.2
Ser58S     28.0    Leu74C     39.8
Pro59S     15.8    Thr75C     19.4
Gly60S     12.7    Asn76C     23.4
Tyr61S     30.9    Ser181D    21.5
Arg65S     31.0    Lys183D    26.1
Tyr66S     91.0    Asn184D    11.6
Thr68S     37.3    Arg187D    42.8
Met69S     33.9    Tyr190D    14.6
Leu72S     55.0    Glu191D    24.7
Pro73S     10.5    Arg215D    11.7
Phe75S     46.5    Leu219D    36.1
Phe104S    25.3    Phe220D    10.7
Asn106S    36.3    Glu223D    59.1
Lys107S    18.8    Tyr226D    28.4
Gln109S    69.0    Lys227D    33.6
Gln111S    16.9    Glu259D    16.8
                   Leu260D    21.1
                   Gly412D    13.8

Δacc, change in accessibility on forming a tentative S1-CD complex from S1 and CD. Residues with a change in accessibility > 10% are listed.

close together and form part of a relatively smooth, concave surface that interacts with one side of helix α2 in the B subunit (Fig. 23) as well as with residues at the C-terminal ends of helices α3 and α4. This surface is defined by residues R53, P59 and Y62 in S1, and residues Y52, E54, H55, D56, Y61 and D63 in S4. The surface area buried between the hairpin loops and the large subunit covers around 600 Å² and thus constitutes one-fifth of the total area buried between each L subunit and small subunits. The

Table 21
Hydrogen bonds (d < 3.3 Å) between the small subunit S1 and large subunit D

Arg65S     N      -    Glu223D    OE2
Tyr66S     N      -    Glu223D    OE2
           OH     -    Lys183D    O
           O      -    Lys227D    NZ
Met69S     N      -    Glu191D    OE1
Gln109S    OE1    -    Ser181D    N
           NE2    -    Gly179D    O
Gln111S    OE1    -    Arg187D    NH1

surface area in the L subunit that is buried here is mostly hydrophobic, with 65% of the area being contributed by apolar atoms, mainly from the side-chains of L162, L219, A222, Y226, I235, F256, L260, G261 and P263.

Residues 52 to 58 at the beginning of the loop are on the surface of the central solvent channel, with most of the side-chains at least partially exposed to solvent. However, most of these side-chains are also involved in interactions with the large subunit. There is considerable sequence variation among species in this part of the loop. In contrast, residues 61 to 63 are at the beginning of a highly conserved region of the S subunit. These residues are buried between two adjacent L2 dimers and are shielded from solution by residues 52 to 58.

In contrast to the large number of intra-subunit hydrogen bonds involving residues from the hairpin loop, there are only two inter-subunit hydrogen bonds between residues in the large subunit and the region that is deleted in cyanobacterial small subunits. Both these hydrogen bonds involve the side-chain of R53. There is one hydrogen bond from R53 in S1 to the carbonyl oxygen atom of G261B and a


Figure 21. The 3 residues Glu43, Arg100 and Gln111 define the floor of a narrow crevice between the hairpin loop and the loop between strands βC and βD in the small subunit. Charged and polar residues from large subunits B and D on each side of the small subunit S1 interact with these residues in the L2 dimer-dimer interface.


Figure 22. The 3 strictly conserved residues Pro73, Phe75 and Phe104 in the S1 small subunit interact with the 3(10) helix in the N-terminal domain of the C subunit.


Figure 23. Helix α2 in the large subunit packs against the hairpin loops of 2 adjacent small subunits.


Figure 24. Interactions between Glu223 in the large subunit and residues in the hairpin loop of the small subunit.

second side-chain to side-chain hydrogen bond to Y226B. In addition, this tyrosine residue interacts extensively with H55 in the S4 small subunit and packs with the plane of the ring against the imidazole group.

E223 is another residue from helix α2 in the B subunit that makes extensive contacts to residues in the hairpin loop. The side-chain of this residue, which is conserved in all higher plants, is in a small pocket defined by Y61, D63, R65 and Y66 in the S4 small subunit (Fig. 24). There are a number of apolar contacts between the hydrophobic part of the glutamate side-chain and Y61. Furthermore, there are two hydrogen bonds from the carboxyl group to the peptide nitrogen atoms of R65 and Y66 as well as a salt-link to R65.

Residue P59 is at the tip of the hairpin loop and forms one of the major interactions between this loop in S1 and the B subunit. The proline side-chain is packed into a pocket that is defined by helices α3 and α4 and the loop connecting α3 to β4 in the C-terminal domain of the large subunit (Fig. 25).

The two adjacent tyrosine residues Y61 and Y62 in the hairpin loop both take part in hydrophobic interactions with large subunits. However, they do not interact with the same large subunit from within one small subunit; instead, Y61 from the S4 small subunit and Y62 from the S1 small subunit interact with the same large subunit, B. The side-chain of Y61 in S4 packs edge-on in a shallow pocket defined by the side-chains of L219, E223, Y226 and L260 in the B subunit. The ring is completely buried, since the other edge is covered by the beginning of the hairpin loop. The plane of the tyrosine ring of Y62 in S1 packs against the side-chain of I235B. Furthermore, there are contacts between the hydroxyl group of Y62 and residue P263B. This ring is also buried in the interface between the hairpin loop and the L subunit.

(d) Crystal packing

The molecules in the crystal are arranged as two layers perpendicular to the c axis with the molecular 4-fold axes tilted approximately 2° from the c axis in the (101) plane (Fig. 26). Each molecule is in


Figure 25. Interactions of Pro59 at the tip of the hairpin loop in the small subunit with residues in the large subunit.


Figure 26. Packing of spinach Rubisco molecules in the crystals used in this study. (a) View down the c axis showing 1 layer of molecules centred at z = 1/4. There is a 2nd similar layer where the molecules are centred at z = 3/4 but translated half the length of the a axis, as illustrated in (b), which shows a view down the b axis. The local 4-fold axes are shown as thin straight lines in (b).

contact with a total of 12 other molecules, four within the layer and four from each of the layers above and below. There are two different types of solvent channels traversing the whole crystal, both parallel to the c axis. The first type, which is the widest with a diameter of approximately 40 Å, is formed at the junction of four molecules within a layer and is continuous with the central solvent channel of molecules in adjacent layers. The second type, which has a diameter of about 15 Å, is found at the interface between two molecules within a layer and is formed by the crevices between two neighbouring L2 dimers on the surface of each molecule. The channels of this type overlap between the layers and are thus also continuous throughout the crystal. The two types of channel are connected in the solvent region between the layers as well as through the narrow solvent channel between adjacent L2 dimers in the L8S8 molecule.

In the current model, where exact local 4-fold symmetry has been imposed, the packing interactions of the different subunits in the molecule are not equivalent. While subunits B, E, G and H have rather tight interactions with large subunits in neighbouring molecules, the crystal contacts involving subunits A, C, D and F are much fewer.

The accessible surface area buried in crystal contacts is about 2700 Å² per molecule. Most of the contacts occur between molecules within the layers and involve hydrophobic interactions of P89, V90 and A91 from a large subunit with their symmetry-related residues across a local dyad that relates two L8S8 molecules. In addition, there are direct salt-links from E28 and K356 in B, E, G and H to the symmetry-related residues in a neighbouring molecule. Contacts between the layers are mediated mostly by residues in the small subunits, where residues 24 to 28 interact with the corresponding residues in a neighbouring molecule across a different local 2-fold axis. A more detailed analysis of the crystal contacts will have to await refinement of the four L and S subunits in the asymmetric unit with no local symmetry imposed.

(e) Heavy-atom binding sites

The heavy-atom binding sites used in the phase determination are described in Table 22. All the sites involve cysteinyl residues. With the exception of sites F and G, the relative occupancies of the sites (Table 3) seem to reflect the different accessibilities

Table 22
Description of heavy-atom binding sites

Site    Ligands         Comment

A       C41S            Accessible from "bottom" of S subunit. The carbonyls of W70S and L72S are also possible ligands. M69S and M74S nearby
B       C112S           Highly charged pocket on surface at interface L-S
C       C459L           Surface of large subunit
D       C77S            Surface of small subunit. M74S in vicinity
E       C172L, C192L    Somewhat buried in the core between strands and helices in the C-terminal domain of L subunit but close to the surface
F       C84L, C99L      Somewhat buried at edge of the hydrophobic core in the N-terminal domain of the L subunit
G       C427L           Somewhat buried, but near surface of L subunit. M387L is also close
H       C41S            Quite buried. Close to S-L interface. M69S and R187L are also in the vicinity

The labels L and S refer to the large and the small subunit, respectively.


of the sites quite well, and also follow the 4-fold symmetry rather closely. There is no obvious explanation for the deviations in occupancy among the G sites. The F site, on the other hand, is partly covered by strand βB in the N-terminal domain of the L subunit and is relatively close to E28, which is involved in crystal packing. The different occupancies seen for the F sites may thus reflect small local deviations from the 422 symmetry of the L8S8 molecules due to the different packing arrangements of the subunits in the crystal lattice.

Three cysteinyl residues in the large subunit, C221, C247 and C284, do not bind any heavy atoms. Of these, C247 should, in principle, be accessible from the central solvent channel but is involved in an intra-dimer disulphide bond. The other two cysteinyl residues are buried in the circular core of the large subunit.

After refinement of our L8S8 model, calculated phases were used to compute difference Fourier maps for the derivatives. The resulting KAu(CN)2 difference map showed an additional heavy-atom binding site not previously found. This site is located at the interface between two L2 dimers, close to the dimer dyad. No cysteinyl residues contribute to this site; instead, three arginine residues found nearby (R213, R253 and R285) may be involved in binding the negatively charged Au(CN)2− ion.

(f) The active site

We have previously given a preliminary description of the active site of Rubisco (Andersson et al., 1989), in particular, residues that are directly involved in binding the magnesium ion and the reaction-intermediate analogue. Here, we present a more detailed description and extend the analysis to include residues in the next shell around the active site. Although these residues are not directly involved in catalysis or binding, they are important for proper positioning of residues in the active site, and hence for the function of the enzyme.

The active site is found at the C-terminal end of the α/β-barrel with most of the active site residues being contributed by the loops connecting the β-strands to the α-helices in the barrel (Table 23). In addition, two loop regions in the N-terminal domain of the second L subunit in the dimer supply some residues to the active site. The active site is thus

Table 23
Amino acid residues that interact (d < 4.0 Å) with the transition state analogue CABP in the active site at the C-terminal end of the α/β-barrel in subunit A

Thr173A    Glu204A    Leu335A    Gly404A
Lys175A    His294A    Ser379A    Glu60B
Lys177A    Arg295A    Gly380A    Thr65B
Cab201A    His327A    Gly381A    Trp66B
Asp203A    Lys334A    Gly403A    Asn123B

located at the intra-dimer interface between the C-terminal domain of one subunit and the N-terminal domain of the second subunit.
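The statement that both subunits of the dimer contribute to one active site can be checked directly against the residue labels of Table 23. The following is a minimal sketch; the label format (three-letter code, residue number, chain letter) is taken from the table, while the helper name is hypothetical.

```python
import re
from collections import Counter

# Residue labels as listed in Table 23 (Cab = carbamylated Lys201).
TABLE_23 = [
    "Thr173A", "Lys175A", "Lys177A", "Cab201A", "Asp203A",
    "Glu204A", "His294A", "Arg295A", "His327A", "Lys334A",
    "Leu335A", "Ser379A", "Gly380A", "Gly381A", "Gly403A",
    "Gly404A", "Glu60B", "Thr65B", "Trp66B", "Asn123B",
]


def chain_of(label):
    """Return the chain letter at the end of a label such as 'Thr173A'."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label}")
    return match.group(3)


contacts_per_chain = Counter(chain_of(label) for label in TABLE_23)
```

Of the 20 residues listed, 16 come from subunit A and 4 from the partner subunit B.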

The crystals used in this structure determination contain the quaternary complex Rubisco·CO2·Mg2+·CABP. The CABP molecule binds in an extended conformation across the opening of the α/β-barrel with two distinct phosphate-binding sites at opposite sides of the barrel (Fig. 27(a)). Most of the contacts between the protein and the reaction-intermediate analogue are polar, with all of the oxygen atoms in CABP except O-5 participating either in hydrogen bonds or ionic interactions (Table 24). There are only a few van der Waals' contacts between the carbon main-chain of CABP and protein atoms.

The entrances to the eight active sites are on the outside of the cube-like L8S8 molecule, with two active sites on each of the four faces formed by a pair of adjacent L2 dimers. Loop 6, which is flexible in the non-activated R. rubrum molecule, closes off the entrance to the active site and buries CABP, which is completely inaccessible to solvent in the quaternary complex. CABP, in turn, completely buries the magnesium ion as well as the carbamate group on K201.

Phosphate binding site I is formed mainly by loops 7 and 8 (Fig. 27(b)). Most of the interactions involve hydrogen bonds to main-chain NH-groups in these two loops. There are hydrogen bonds to the main-chain NH-groups of G403 and G404 at the N terminus of a short α-helix in loop 8, αP, as well as to the NH-group of G381 in loop 7. In addition, two of the phosphate oxygen atoms can form hydrogen bonds to the side-chain of T65 in a loop region in the second subunit of the dimer. A second residue from this loop region, W66, is also part of this phosphate-binding site. The indole ring packs between helix αP and loop 7 and forms van der Waals' interactions with two of the phosphate oxygen atoms. In addition, two positively charged side-chains in the vicinity, K175 and K334, form salt-links to the phosphate group. However, both these side-chains are involved in additional ionic interactions with side-chains in the active site; K175 with E204 and K334 with E60 in the second L subunit in the dimer. Furthermore, K334 interacts with the carboxyl group at C-2 of the reaction-intermediate analogue.

Whereas phosphate binding site I involves several residues from both large subunits in the dimer, the second site (site II) is formed almost exclusively by residues in one large subunit (Fig. 27(c)). Most of these residues are in loops 5 and 6 at the C-terminal end of the barrel. There are ionic interactions between the phosphate group and the positively charged side-chains of R295 and H327. In addition, two histidine side-chains from loop 5, H294 and H298, are found in the vicinity. One of the phosphate oxygen atoms may also form a hydrogen bond to the carbonyl group of S379. Furthermore, there are van der Waals' interactions between the side-chain of L335 and one of the phosphate oxygen atoms. One residue from the second subunit in the


dimer is close to this phosphate group; the side-chain of N123 is in van der Waals' contact with O-5, although too far away to form a hydrogen bond.

The metal-binding site is in the middle of the active site, close to the bottom where the activator CO2 at the end of the side-chain of carbamylated K201 forms one of the protein ligands to the magnesium ion (Fig. 27(d)). The side-chain of K201 is buried in the interior of the barrel, where it interacts extensively with the tyrosine ring of Y239 as well as with the side-chains of T173, M266 and N401. Two additional residues in loop 2 are involved in binding the metal ion; D203 and E204. These three residues provide the only direct ligands from the protein to the metal. However, the side-chains of H294 as well as N123 in the N-terminal domain of the second subunit in the dimer are also found in the vicinity of the magnesium ion (Fig. 27(a)). A third acidic residue in loop 2, D202, which is conserved in L8S8 Rubisco, is not part of the active site. Instead, the side-chain of D202 points away from the active site, forming a conserved salt-link to R217 in the same large subunit. Both these residues are buried in the circular core of the α/β-barrel (Table 10).

The two metal ligands D203 and E204 also form ion-pair interactions with two lysine residues at the active site, K175 and K177 (Fig. 27(a)). The side-chain of K177 is located approximately midway between D203 and E204, and forms salt-links to both these residues. Furthermore, there is a salt-link between K175 and D203. The protein ligands in the quaternary complex thus provide one of the negative charges needed to compensate for the two positive charges of the divalent metal ion. The second negative charge is provided by substrate CO2 as simulated by the carboxyl group of CABP, which interacts directly with the metal ion.

The magnesium ion is also co-ordinated to two oxygen atoms from the hydroxyl groups of CABP. One of these is the hydroxyl group at C-2. Thus, both the carboxyl group and the hydroxyl group attached to C-2 of CABP form ligands to the metal.


Figure 27. The active site of spinach Rubisco. (a) Overview showing the reaction-intermediate analogue CABP bound at the active site. The alternative where the hydroxyl groups at C-2 and C-3 of CABP are in trans conformation is shown. All atoms within 12 Å of C-2 of CABP are included. (b) Detailed view of phosphate-binding site I. (c) Detailed view of phosphate-binding site II. (d) Surroundings of the active site magnesium ion. With CABP in the conformation shown here, O-4 of CABP is a ligand to the metal, whereas if O-3 is cis to O-2, O-3 would form a ligand to Mg2+ instead of O-4.

The second ligand is either the hydroxyl group at C-3 or at C-4. At the present resolution, we cannot with confidence deduce from the electron density map if the C-3 hydroxyl is cis or trans to the C-2 hydroxyl group (Fig. 28). If the C-3 hydroxyl group is in cis position, it is a ligand to Mg2+, whereas the C-4 hydroxyl group points in the other direction. If, on the other hand, the C-3 hydroxyl group is in trans position, it points away from the magnesium ion and instead the C-4 hydroxyl group occupies roughly the same position as the C-3 hydroxyl group in the previous alternative, and is consequently liganded to the metal. This ambiguity in interpreting the electron density of CABP also has implications for deductions about possible side-chains that could function as a base in one of the early steps of catalysis; abstraction of a proton from C-3 of RuBP, as will be discussed later.
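The two interpretations of the CABP density can be summarised as a small lookup. This is only a restatement of the text in code form, with a hypothetical function name; it does not model the electron density itself.

```python
def metal_ligand_hydroxyls(c3_relative_to_c2):
    """Return the CABP hydroxyl oxygens liganded to Mg2+.

    The C-2 hydroxyl is a ligand in both alternatives. The second
    hydroxyl ligand depends on whether the C-3 hydroxyl is cis or
    trans to the C-2 hydroxyl.
    """
    if c3_relative_to_c2 == "cis":
        return ("O-2", "O-3")
    if c3_relative_to_c2 == "trans":
        return ("O-2", "O-4")
    raise ValueError("expected 'cis' or 'trans'")


cis_ligands = metal_ligand_hydroxyls("cis")
trans_ligands = metal_ligand_hydroxyls("trans")
```

In both alternatives the carboxyl group at C-2 remains an additional ligand to the metal, as described above.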

The carboxyl group at C-2 of CABP interacts with three positively charged side-chains in the active site; K175, K177 and K334, in addition to being a ligand to the magnesium ion. Furthermore, one of the carboxyl oxygen atoms makes a hydrogen bond to the side-chain of N123 in the second large subunit of the dimer. As has been mentioned above, all these three conserved lysine residues participate in several additional electrostatic interactions in the active site; K175 with D203 and the P1 phosphate group; K177 with D203 and E204; and K334 with E60 in the second L subunit in the


Table 24
Polar (d < 3.5 Å) and ionic (d < 4.0 Å) interactions between the transition state analogue CABP and protein atoms at the active site of spinach Rubisco

CABP                          CABP
oxygen  Side-chain  Atom      oxygen  Side-chain  Atom

O-1P    Thr65A      OG1       O-3     Ser379B     OG
        Lys334B     NZ                            O
        Gly381B     N                 Gly380B     N
O-2P    Thr65A      OG1       O-4     Asn123A     ND2
        Lys175B     NZ                His294B     NE2
        Gly404B     N                 Cab201B     OF1
O-3P    Gly403B     N         O-5     None
O-1     Lys175B     NZ
O-4P    Arg295B     NE        O-2     Thr173B     OG1
                    NH2               Cab201B     OF1
O-5P    Arg295B     NH2       O-6     Lys334B     NZ
O-6P    His327B     ND1       O-7     Asn123A     ND2
        Ser379B     O                 Lys175B     NZ
                                      Lys177B     NZ

The interactions involving oxygen atoms O-3 and O-4 have been computed assuming that O-2 and O-3 in CABP are in trans position (see the text). If these 2 oxygen atoms are instead in cis position relative to each other the positions of O-3 and O-4 are interchanged.

dimer as well as with the P1 phosphate group. All three lysine residues are also involved in a number of contacts at the dimer interface, as has been described above.

4. Discussion

(a) Quaternary structure

The L2 dimer is the minimal functional unit needed to form an active Rubisco molecule, since the active site is located at the interface between the two L subunits. Accordingly, such dimeric Rubisco molecules are found in the photosynthetic bacterium R. rubrum. The only other L subunit arrangement that has been unequivocally shown to exist is an L8 aggregate (Andrews & Lorimer, 1987) which in spinach (Andersson et al., 1989) and tobacco (Chapman et al., 1987, 1988) is a tetramer of L2 dimers. Such molecules are found in all eukaryotic as well as in most prokaryotic photosynthetic organisms. All Rubisco molecules of this type require S subunits to form a functional molecule of the form (L2)4(S4)2. However, higher-order oligomers could, in principle, be formed from L2 dimers in a number of ways, although this would require different contact surfaces between the dimers. For example, formation of an (L2)2 dimer of dimers with 222 symmetry requires that the 2-fold axes of the two dimers coincide, and that the two dimers pack against each other with homologous surfaces. Any other type of packing would lead to surfaces identical with those buried in the dimer-dimer interface being exposed on the surface of the molecule, with further oligomerization as a possible consequence. Obviously, the relative orientation of the two L2 dimers, and hence the contact surfaces, would be very different in such a tentative (L2)2 molecule as compared to the observed (L2)4(S4)2 molecule. Furthermore, such a packing would completely change a possible binding site for the small subunits compared to the (L2)4(S4)2 molecule. In view of the high degree of conservation within both L and S subunits from various species, it seems highly unlikely that subunit arrangements other than those described here exist in vivo.
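The subunit counts implied by the two point symmetries mentioned here can be verified by generating the rotation groups from their generators. The sketch below is illustrative only: the axis choice (4-fold along z, 2-fold along x) and the function names are assumptions, and the matrices describe ideal point symmetry rather than the refined coordinates.

```python
def matmul(a, b):
    """Multiply two 3x3 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """Close a set of rotation matrices under multiplication."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = matmul(g, h)
            if product not in group:
                group.add(product)
                frontier.append(product)
    return group


C4_Z = ((0, -1, 0), (1, 0, 0), (0, 0, 1))    # 90 degree rotation about z
C2_Z = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))   # 180 degree rotation about z
C2_X = ((1, 0, 0), (0, -1, 0), (0, 0, -1))   # 180 degree rotation about x

order_422 = len(generate_group([C4_Z, C2_X]))
order_222 = len(generate_group([C2_Z, C2_X]))
```

Point group 422 has eight operations, matching the eight L and eight S subunits of the (L2)4(S4)2 molecule, whereas 222 has four, as expected for a hypothetical (L2)2 dimer of dimers.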

The overall shape of the L8S8 molecule and the arrangement of the subunits found in this study are in close agreement with the results for the non-activated Rubisco from tobacco (Chapman et al., 1987, 1988). However, as has already been pointed out by Chapman et al. (1987), these models are completely incompatible with the notion of a sliding-layer conformational change as proposed by Holzenburg et al. (1987) for the Alcaligenes eutrophus enzyme.

Electron micrographs of Rubisco from various sources (Andrews et al., 1981; Bowien et al., 1976;


Figure 28. Stereo diagram illustrating that CABP with O-2 and O-3 in trans conformation (thick line) as well as in cis conformation (thin line) are both compatible with the electron density. The final Fo - Fc map calculated without the contribution of CABP is shown at the level of 1 standard deviation of the map.


Takabe et al., 1984b) frequently show two types of image. In one type, where the molecules are viewed down the 4-fold axis, a square or four-lobed structure with a central hole representing the central solvent channel is clearly seen. The second type of image, which is much less frequent, shows a two-layered structure; these images apparently show the molecule from the side with the two layers representing two adjacent L2 dimers. A number of models for the quaternary structure of L8S8 Rubisco have been proposed, based on electron microscopy and low-resolution X-ray crystallography. In all these models, the eight large subunits are arranged as two layers of four subunits each in a square array (Andrews et al., 1981; Baker et al., 1975, 1977a,b; Bowien et al., 1976, 1980). However, we now know, from the present study and from that by Chapman et al. (1987), that this model is incorrect. The two large subunits in the L2 dimer are highly eclipsed (Fig. 13) so that each large subunit stretches almost from one end of the molecule to the other.

There has been considerable controversy as to the position of the small subunits in the L8S8 molecule. Baker et al. (1975, 1977a,b) positioned the small subunits on the periphery of the molecule, whereas in the model proposed by Bowien & Mayer (1978) the small subunits were placed on the two faces of the L8 core that are perpendicular to the 4-fold axis. Obviously, none of these models is very accurate, which reflects the limitations of the resolution obtainable with electron microscopy of single particles.

In addition to these studies, small-angle scattering techniques have been used to study the overall shape and size of L8S8 Rubisco. The model that agreed best with the small-angle neutron scattering data for the spinach enzyme in H2O (Donnelly et al., 1984) was a hollow sphere with an outer radius of 56.4 Å and an inner radius of 14.3 Å. In 2H2O, the data were best modelled by a hollow cylinder with an axial ratio of 1.06 and outer and inner radii of 46.7 and 12.9 Å, respectively. This technique thus captures the gross features of the molecule, although the details are not correct.
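To connect the hollow-sphere model to the quantity that a scattering experiment reports most directly, the radius of gyration of such a shape can be computed from its two radii. The sketch below is only an illustration, assuming a shell of uniform scattering density; the function name is hypothetical and the calculation is not taken from Donnelly et al. (1984).

```python
def hollow_sphere_rg(outer_radius, inner_radius):
    """Radius of gyration of a uniform-density hollow sphere.

    Uses Rg^2 = (3/5) * (Ro^5 - Ri^5) / (Ro^3 - Ri^3); the result has the
    same length unit as the input radii.
    """
    if not 0 <= inner_radius < outer_radius:
        raise ValueError("require 0 <= inner_radius < outer_radius")
    numerator = outer_radius**5 - inner_radius**5
    denominator = outer_radius**3 - inner_radius**3
    return (0.6 * numerator / denominator) ** 0.5


# Radii (in angstroms) quoted in the text for the spinach enzyme in H2O.
rg_model = hollow_sphere_rg(56.4, 14.3)
print(f"Rg of the hollow-sphere model: {rg_model:.1f} A")
```

Setting the inner radius to zero recovers the familiar solid-sphere result, Rg = sqrt(3/5) R, which is a convenient sanity check on the formula.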

Meisenberger et al. (1984) used small-angle X-ray scattering to determine the radius of gyration and maximum dimension of activated and non-activated Rubisco from A. eutrophus. The maximum dimension was calculated to be 135 and 157 Å for the activated and non-activated enzyme, respectively. This was taken as an indication that a rearrangement of the subunits occurred upon activation of the enzyme. Similarly, drastic changes in the hydrodynamic properties of Rubisco from this organism (Bowien & Gottschalk, 1982) were found to be associated with activation and taken as an indication of a major conformational change, in sharp contrast to the spinach enzyme (Donnelly et al., 1984). However, most of the residues involved in interactions between the two L subunits in the dimer, in particular those at the active site, are conserved in the Alcaligenes enzyme. It is thus

highly unlikely that the structure of Rubisco from Alcaligenes differs significantly from the higher plant structures. In accordance with this, a recent re-examination of the hydrodynamic properties of the Alcaligenes enzyme failed to detect any large conformational changes associated with activation (Choe et al., 1989).

(b) Subunit interactions in relation to amino acid sequence variations in different species

Amino acid sequences of the small and large subunits of Rubisco from all species where both sequences are known are shown in Figures 10 and 15, respectively. For comparison, the sequence of the homodimeric R. rubrum subunit is also shown in Figure 15.

The overall identity among L subunits from the L8S8 Rubisco molecules listed in Figure 15 is around 45%. If the L subunit from the L2 Rubisco of R. rubrum is included in the comparison, only 18% of the residues remain identical in all the sequences. However, some regions of the sequence are much more conserved. Most of these regions are found in the C-terminal loops that form both the active site and some of the subunit interactions in the L2 dimer. Within these loops, 41% of the residues are identical in all the L sequences. This is also reflected in the higher proportion of conserved residues within the C-terminal domain (20%) as compared to the N-terminal domain (15%). In general, the helices are much less conserved than both loop regions and β-strands. For example, within the α/β-barrel, only 11% of the residues found in helical regions are conserved, while in loop regions 32% and in β-strands 27% of the residues are identical in all the sequences. Within L sequences from the L8S8 Rubisco molecules, the corresponding figures are 39% for helices, 54% for strands and 52% for loop regions. As expected, the helices involved in subunit interactions with the small subunit are the most highly conserved with 62% identical residues in helix α1 and 60% in α8. Helix α5, which is involved in domain-domain interactions within the L subunit, is also more conserved with 54% identical residues in L8S8 Rubisco and 27% in all the L sequences.
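The identity figures quoted above are of the kind "fraction of aligned positions at which every sequence carries the same residue". A minimal sketch of that bookkeeping is given below. The function name and the short toy alignment are hypothetical and are not taken from Figure 15; they serve only to show how adding one divergent sequence lowers the strict-identity percentage, as happens when the R. rubrum subunit is included.

```python
def percent_identical_columns(aligned):
    """Percentage of alignment columns with the same residue in every sequence.

    Columns containing a gap ('-') in any sequence count as non-identical.
    """
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to one common length")
    n_columns = lengths.pop()
    identical = sum(
        1
        for column in zip(*aligned)
        if column[0] != "-" and all(res == column[0] for res in column)
    )
    return 100.0 * identical / n_columns


# Hypothetical toy alignment (not real Rubisco sequences).
close_relatives = ["GGKLDFTKDDE", "GGKLDYTKDDE", "GAKLDFTKDDE"]
with_distant = close_relatives + ["GAKMDYSKDNE"]

print(f"{percent_identical_columns(close_relatives):.0f}% identical columns")
print(f"{percent_identical_columns(with_distant):.0f}% after adding a distant sequence")
```

Restricting the input to the columns of one structural class (helices, strands or loops) would give per-class figures analogous to those in the text, provided the secondary-structure assignment of each column is known.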

Within the L-chains from L8S8 Rubisco molecules, seven regions stand out as being especially highly conserved, each containing at least five consecutive strictly conserved residues (Fig. 15). Five of these peptide regions are in the active site loops at the C-terminal side of the α/β-barrel. The longest of these five regions comprises residues 195 to 205. Three residues at the end of this region, K201, D203 and E204, form the magnesium-binding site in the activated enzyme. The residues at the beginning of this region are, on the other hand, at the bottom of the barrel where the two glycine residues G195 and G196 interact with the side-chain of Y17 in the small subunit.
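The criterion used here, at least five consecutive strictly conserved residues, amounts to a simple run-length scan over the alignment columns. The sketch below shows one way to do this; the function name and the toy alignment are hypothetical, and the reported positions are alignment columns rather than spinach residue numbers.

```python
def conserved_runs(aligned, min_len=5):
    """Find stretches of consecutive strictly conserved alignment columns.

    Returns a list of (start, end) column ranges, 1-based and inclusive,
    where at least min_len consecutive columns are identical (and ungapped)
    in every sequence.
    """
    flags = [col[0] != "-" and len(set(col)) == 1 for col in zip(*aligned)]
    runs = []
    start = None
    # A trailing False sentinel closes a run that reaches the last column.
    for i, conserved in enumerate(flags + [False]):
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                runs.append((start + 1, i))
            start = None
    return runs


# Hypothetical toy alignment (not real Rubisco sequences).
toy = ["AGGKLDFTKDDENV", "SGGKLDFTKDDENI", "AGGKLDFTRDDENV"]
print(conserved_runs(toy))             # runs of at least five columns
print(conserved_runs(toy, min_len=4))  # a looser threshold finds more runs
```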

The two remaining highly conserved regions of the L-chains are not close to the active site. The


first of these regions, residues 107 to 112, is involved in subunit-subunit interactions in the L2 dimer. Most of the interactions are formed with residues 207 to 213 at the end of loop 2, amongst these a salt-bridge between E110 and R213. The second conserved region outside the active site comprises residues 419 to 423 in helix α8. These residues interact extensively with residues in the N-terminal arm of the small subunit, specifically W4, E13, L15, S16, Y17 and L18.

The small subunits are much less conserved than the large subunits, with only nine of the 123 residues in the spinach S subunit being identical in all known sequences, although many more residues are conservatively substituted. Furthermore, there is considerable variation in the number of residues in the S subunit polypeptide chains. For example, in small subunits from cyanobacteria, residues 52 to 63 in the hairpin loop are deleted, whereas in some organisms (Euglena and Chlamydomonas) the loop is considerably longer. Model building has shown that the additional residues in the hairpin loop of plant S subunits can be deleted without disrupting the rest of the subunit structure. Although the residues in this part of the loop are quite variable, a few of the interactions are conserved in all higher plants, e.g. the interaction between P59 at the tip of the loop with residues G261, V262 and L289 in the large subunit.

There are three regions of the S subunit chain where at least four adjacent residues are strictly conserved in eukaryotic species (Fig. 10). In addition, a fourth conserved region is found toward the carboxy terminus of the small subunit although identities are fewer here. Most of the residues involved in subunit interactions with the large subunits in the L8S8 molecule are found in these four regions. One example of such a conserved interaction that may also be of functional importance is the interaction between residues P73, F75 and F104 in the small subunit and a conserved 3₁₀ helix, residues 70 to 74, in a loop region in the N-terminal domain of the large subunit. It is interesting to note that the second residue in the 3₁₀ helix, T71, forms a hydrogen bond to the carbonyl oxygen of the active site residue K175 in the second L subunit of the same dimer. The residue following K175 is P176, which has cis conformation and interacts extensively with the 3₁₀ helix. The conformation of these two residues at the active site, K175 and P176, may thus be influenced by interactions between the small and the large subunits in this area. Consequently, mutations of P73, F75 or F104 in the small subunit might influence catalytic activity through indirect effects, provided proper assembly can occur.

Amino acid residues that are conserved in the L subunits from higher plants but variable or different in cyanobacteria are clustered in two main areas on the surface of the large subunit. Both these areas are involved in interactions with small subunits in the L8S8 molecule. Not surprisingly, most of the residues involved interact with the hairpin loop in the plant molecules, e.g. E223, Y226 and G261. The

second area, which is conserved in higher plants but not in prokaryotic L subunits, interacts with residues at the N terminus of the small subunit.

The N terminus of the large subunit of spinach Rubisco contains two tryptic cleavage sites; one at K8 and one at K14. Removal of the region from A9 to K14 from the large subunit of spinach (Houtz et al., 1989) and wheat (Gutteridge et al., 1986) Rubisco has been shown to reduce drastically catalytic activity, whereas removal of the first eight residues has no effect. In our electron density map there is no density for the first eight residues of the L subunit. It is thus possible that these residues have been removed by proteolysis at K8, although it cannot be excluded that the N terminus is disordered or deviates significantly from the 4-fold local symmetry. There is, however, density for the penultimate tryptic peptide with well-defined density starting at F13. This part of the N terminus interacts extensively with the 3₁₀ helix in the N-terminal domain of the same L subunit, close to the S-L interface. The side-chain of F13 is buried between the small and the large subunit and interacts extensively with the N terminus of the 3₁₀ helix as well as with L72 from the small subunit. The dramatic effect on activity following removal of residues 8 to 14 in the large subunit again indicates that the interactions in this area are important for catalysis.

(c) Association/dissociation of subunits

The subunit interactions in the L8S8 molecule are extensive and cover almost half of the accessible surface area of the isolated subunits (Table 13). As is usually found, the largest part of the interface surfaces is hydrophobic. However, there is considerable variation in the hydropathy of the subunit surfaces involved in these interfaces. The accessible surface of the small subunit, even those parts that are accessible to solvent in the L8S8 molecule, is rather hydrophobic (Table 13). In contrast, the accessible surface area of the L2 dimer is somewhat less hydrophobic than would be expected for a molecular entity involved in subunit interactions. As a matter of fact, the surface of the L2 dimer is more like that of a monomer or a fully assembled oligomer (Janin et al., 1988), perhaps indicating an intermediate role for the dimer in the assembly process. This is also true for the (L2)4 core of large subunits. It might therefore seem somewhat surprising that it has not been possible to isolate S and L subunits from plant Rubisco without concomitant denaturation and/or precipitation of the large subunits (Jordan & Chollet, 1985; Kawashima & Wildman, 1971; Rutner & Lane, 1967; Voordouw et al., 1984). However, release of the small subunits would expose some highly hydrophobic patches on the surface of the large subunits, e.g. the area buried behind the hairpin loop of the small subunit. The residues that form this patch are highly conserved in plant Rubisco L-chains. In contrast, in Rubisco from cyanobacteria, where the subunits can be separated easily to give isolated S subunits and


stable L8 cores (Andrews & Abel, 1981; Andrews & Ballment, 1983; Asami et al., 1983; Incharoensakdi et al., 1985a,b; Jordan & Chollet, 1985; Takabe et al., 1984a), several of these conserved hydrophobic residues are changed to more polar or even charged side-chains. For example, Y226, which interacts extensively with the hairpin loop, is conserved in the plant species, whereas in L-chains from cyanobacteria the equivalent residue may be T, H, N or D. Further examples are given by G261, which is K261 in Anabaena, and I235, which is R235 in Chromatium vinosum.

The conditions that promote dissociation of L8S8 Rubisco into its constituent subunits vary considerably among different species. For example, the small subunits from the hypersaline cyanobacterium Aphanothece halophytica can be almost completely removed by sedimentation at low ionic strength (Asami et al., 1983; Incharoensakdi et al., 1985a; Takabe et al., 1984a). In contrast, the enzyme from Synechococcus can be stripped of small subunits using high ionic strength and mildly acidic pH (Andrews & Abel, 1981; Andrews & Ballment, 1983), whereas C. vinosum Rubisco releases its small subunits from the L8 core at mildly basic pH (Incharoensakdi et al., 1985b; Jordan & Chollet, 1985). A definite basis for the different behaviours cannot be established at this time. However, since most of the interactions between the L and the S subunits are conserved, it seems as if even small differences in subunit interactions can provoke large differences in the strength and nature of the association between the subunits.

In spite of being largely hydrophobic, the interfaces between L and S subunits in the L8S8 molecule contain a relatively large number of hydrogen bonds (Tables 19 and 21). Most of these hydrogen bonds are formed from charged or polar side-chains to main-chain atoms. This observation is not completely surprising, since the most obvious way to form a tight interface between two subunits is by interdigitating side-chains from one subunit with side-chains from the second subunit. The side-chains will then pack side-to-side with the head groups of residues in one subunit approaching the backbone of the second subunit in the subunit-subunit interface.

(d) Assembly mutants

A number of mutations in the small subunit have been made by site-directed mutagenesis to allow residues that are essential for association with large subunits and catalysis to be identified and analysed (Fitchen et al., 1990; McFadden & Small, 1988; Voordouw et al., 1987). Many of the mutant S subunits fail to assemble with L subunits, while others, although being able to assemble, do so to a lesser extent than wild-type subunits. Since the small subunit must exert its effect on catalysis through interactions with the L subunits, it seems unlikely that changes in the S subunit affecting catalysis will be found that do not also change association between the subunits.

Fitchen et al. (1990) have described a number of mutations in the small subunit from Anabaena that prevent assembly of homologous L and S subunits into functional L8S8 holoenzyme in an E. coli co-expression system. Two of these mutations, E13V and P73H, affect residues directly involved in S-L interactions. E13 is part of an intricate network of ion-pair interactions at the S1-B interface involving three additional negatively charged as well as four positively charged residues from the large subunit (Fig. 20). Removal of the negative charge of E13 would lead to an unbalanced positive charge being buried at the S1-B interface in case of assembly. As for the second mutation, introduction of a polar histidine side-chain in a region of conserved hydrophobic interactions between the S and L subunit should be highly unfavourable and it is thus not surprising that this mutant is unable to form stable holoenzyme.

(e) Possible roles for the small subunit

Several plausible roles for the small subunit may be invoked, although none of these roles has been firmly established. One possible role for the small subunit, at any rate in plants, is suggested by the obvious lack of solubility of the large subunit L8 core. The present study clearly shows that hydrophobic interactions play an important role in holding the L8S8 molecule together. As discussed above, removal of the small subunits would make hydrophobic patches on the surface of the L subunit accessible to solvent, thus decreasing solubility. However, this cannot be the main role of the small subunit, since the (L2)4 cores from cyanobacterial Rubisco do not precipitate on removal of the small subunits.

A second possible role is suggested by the relative sizes of the interface surfaces in the L8S8 molecule. The interface area between the L2 dimers is relatively small compared to that buried between L and S subunits (Table 13) and the S subunits may thus function as a "glue" to hold the L2 dimers in the L8S8 molecule together in the correct orientation.

Although the function of the small subunit has long been unclear, there is accumulating evidence that binding of small subunits to the L8 core somehow influences the conformation of residues in the active site (Andrews, 1988; Schneider et al., 1989). Andrews (1988) showed that the L8 core retains approximately 1% of the native enzyme activity, thus excluding direct involvement of any residues from the small subunit in catalysis. This conclusion has also been confirmed by the crystallographic studies of Rubisco from spinach (Andersson et al., 1989; this paper) and tobacco (Chapman et al., 1987). Thus, the effect of the small subunits on the catalytic activity of the enzyme must be exerted by indirect means. In this context, the highly conserved interactions between the small subunit and helix α8 in the large subunit may be relevant. A comparison between the structures of Rubisco from spinach and R. rubrum (Schneider et al., 1989)


showed that the position of helix α8 with respect to the rest of the α/β-barrel is different in the two structures. These structural differences extend to loop 8, which forms one of the phosphate-binding sites in the active site, and might be the basis for the different catalytic properties of L8S8 enzymes as compared to R. rubrum Rubisco.

Although nothing is known about the effect of the small subunits on the negative co-operativity observed in inhibitor binding (Belknap & Portis, 1986; Johal et al., 1985; Jordan et al., 1983; Parry et al., 1985; Vater et al., 1983), it is conceivable that these effects are mediated through the small subunits. Obviously, the small subunits are strategically located between the L2 dimers with ample opportunities to transmit conformational changes from one dimer to another. In relation to this, it is noteworthy that, out of the nine amino acid residues in the small subunit that are conserved in all known sequences, three (W67, E43 and R100) are found in the same region, close to the interface between two L2 dimers. Two of these residues, E43 and R100, form part of a hydrogen-bond network and ion-pair network in the dimer-dimer interface.

(f) The role of the active site metal ion

Several of the peptide regions that form the loops at the active site have been identified by chemical affinity labelling (Hartman et al., 1987a). In addition, biochemical and mutagenesis data have pinpointed some of the residues in the active site that are important for substrate binding and catalysis. One of these residues is K201, the lysine residue that becomes carbamylated in the activation process (Donnelly et al., 1983; Lorimer, 1981; Lorimer & Miziorko, 1980; Lorimer et al., 1976). This is the last residue in strand β2 and is part of a highly conserved region that, in addition, contains a number of acidic groups (Fig. 15). The carboxyl group introduced by carbamylation of the Nζ atom of K201 forms one of the ligands to the magnesium ion in the active site and completes the metal-binding site. However, carbamylated K201 is not absolutely required for metal binding in this region, since one of the major heavy-atom binding sites in the non-activated R. rubrum enzyme is also in this region (Schneider et al., 1990).

There are two further protein ligands to the magnesium ion in the spinach quaternary complex: D203 and E204, both from the same conserved loop as K201. Furthermore, the carboxyl group as well as two hydroxyl groups of CABP are directly co-ordinated to the magnesium ion. Thus, in agreement with the results of electron paramagnetic resonance (e.p.r.) spectra obtained from the Cu2+-activated spinach enzyme (Brändén et al., 1984a,b), only oxygen atoms are directly co-ordinated to the metal ion in this quaternary complex. The broadening of the nitrogen hyperfine structure seen in the e.p.r. spectra of the Cu2+-activated enzyme in the absence of CABP suggests that there should be at least one nitrogen ligand to the metal in the ternary complex


Figure 29. The distorted octahedral co-ordination sphere around the active site magnesium ion in activated spinach Rubisco with bound CABP.

(Brändén et al., 1984a). In our model, there are two nitrogen-containing amino acid side-chains in the vicinity of the magnesium ion: H294 and N123 from the N-terminal domain of the second subunit in the dimer (Fig. 27). The positions of these two side-chains indicate that one of them, or both, could be co-ordinated to a cupric ion in the ternary complex.

Although the fine details of the co-ordination geometry cannot be determined at the present resolution, the metal ligands in our structure occupy positions consistent with a distorted octahedral arrangement (Figs 27 and 29). In agreement with this, a highly distorted co-ordination sphere around the metal is also indicated by the e.p.r. spectra of the quaternary complexes of both L2 and L8S8 Rubisco with CO2, CABP and various divalent metal ions (Brändén et al., 1984a; Gutteridge et al., 1984; Miziorko & Sealy, 1980, 1984; Nilsson et al., 1984; Styring & Brändén, 1985a,b).

One obvious important function of the divalent metal ion would be to polarize the carbonyl group at C-2 of the substrate, RuBP (Andrews & Lorimer, 1987), thus facilitating the deprotonation of C-3 to form the 2,3-enediol intermediate. None of the known mutants involving the residues that bind the magnesium ion is able to catalyse this initial reaction (Lorimer et al., 1987). This is true even for relatively benign mutations such as replacing one of the acidic side-chains by the corresponding amide. It thus seems as if formation of the 2,3-enediol is very sensitive to the position and environment of the active-site magnesium ion. Since the metal ion co-ordinates the carboxyl group of CABP, which is an analogue of one of the reaction intermediates


after CO2 addition, it seems highly probable that magnesium is also involved in stabilization of the transition states for CO2 addition as well as for cleavage of the C-2-C-3 bond.

(g) Residues that might be involved in acid-base catalysis

The two lysine residues K175 and K177 in loop 1 are part of a strictly conserved hexapeptide (Fig. 15) that has also been labelled by active-site affinity reagents (Hartman et al., 1987a). Lorimer & Hartman (1988) have suggested that one of the lysine residues, K175, is the base that initiates both the carboxylation and the oxygenation reaction by abstracting a proton from C-3 of the substrate, RuBP. Chemical modification studies (Hartman et al., 1985) revealed an unusually low pKa of 7.9 as well as enhanced nucleophilicity for the amino group of K175. Moreover, site-directed mutations of the residue in R. rubrum corresponding to K175 (Hartman et al., 1987b; Lorimer & Hartman, 1988) resulted in mutant enzymes that did not catalyse the exchange of the C-3 proton of RuBP with solvent, while still being able to form a quaternary complex with CO2, Mg2+ and CABP, leading to the conclusion that K175 is the essential base responsible for catalysing the enolization reaction.
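Why a pKa of 7.9 is "unusually low" for a lysine amino group can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of the group present as the free base at a given pH. The sketch below is illustrative only: the reference pKa of 10.5 for an unperturbed lysine side-chain and the pH of 8.0 are assumptions chosen for the comparison, not values from this paper.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an ionizable group in its deprotonated (basic) form.

    Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 8.0                    # assumed pH, for illustration only
PKA_K175 = 7.9              # value reported by Hartman et al. (1985)
PKA_TYPICAL_LYSINE = 10.5   # assumed textbook value for a free lysine side-chain

print(f"K175-like group: {fraction_deprotonated(PKA_K175, PH):.2f} deprotonated")
print(f"typical lysine:  {fraction_deprotonated(PKA_TYPICAL_LYSINE, PH):.4f} deprotonated")
```

Under these assumptions, a group with pKa 7.9 is roughly half deprotonated near pH 8, whereas an unperturbed lysine is almost entirely protonated, which is consistent with the enhanced nucleophilicity noted in the text.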

From the present structure of the CABP complex, which reflects the binding of a six-carbon reaction intermediate, it is difficult to deduce which groups are involved in proton abstraction from C-3 of the five-carbon substrate, mainly because we do not know if the C-3 hydroxyl group is in cis or trans position relative to the C-2 hydroxyl group in RuBP. This uncertainty is independent of the relative orientation of these hydroxyl groups in CABP, since introduction of the carboxyl group at C-2 might change this orientation. We will therefore discuss both alternatives and assume that the other regions of RuBP bind essentially in the same way as CABP.

Independent of the relative orientation of the C-2 and C-3 hydroxyl groups, K175 does not, on structural grounds, seem to be a good candidate for proton abstraction from C-3. If O-3 is in trans position, the distance from the side-chain N atom to C-3 of CABP is 5.9 Å. Furthermore, the side-chain of K175 is approximately colinear with the carbon backbone of CABP, such that the C-3-H-3 bond is perpendicular to the direction between C-3 and the amino group of K175. The distance from Nζ to the H atom to be abstracted is therefore about 6 Å. In order to make this distance sufficiently short for proton transfer, while avoiding collision with carbon atoms of the substrate, either the mode of substrate binding or the position of Cα of K175 must be different from what we observe by model building of RuBP into our structure. This model is based on the assumption that the C-2 carbonyl binds to Mg2+ and that the positions of the phosphate groups are the same as in our quaternary complex with CABP.
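The step from an N to C-3 distance of 5.9 Å to an N to H distance of about 6 Å follows from the stated perpendicular geometry. The short check below treats the arrangement as an idealized right angle and uses a C-H bond length of 1.09 Å; both the exact right angle and the bond length are assumptions made for this illustration, and the function name is hypothetical.

```python
import math


def distance_to_perpendicular_atom(base_distance, bond_length):
    """Distance from a point to an atom bonded at right angles to the base vector.

    With the bond perpendicular to the line joining the point and the bonded
    carbon, the three atoms form a right triangle, so Pythagoras applies.
    """
    return math.hypot(base_distance, bond_length)


N_TO_C3 = 5.9        # angstroms, from the text
C_H_BOND = 1.09      # angstroms, assumed typical C-H bond length

n_to_h = distance_to_perpendicular_atom(N_TO_C3, C_H_BOND)
print(f"N to H-3 distance: {n_to_h:.1f} A")
```

Because the hydrogen points sideways rather than towards the amino group, the bond contributes almost nothing to closing the gap, which is why the distance stays near 6 Å.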

If C-3 is in cis position, the situation is even worse

for K175 to abstract the proton at C-3. The hydrogen atom then points away from the Nζ atom and is hidden below C-3. Clearly, drastic differences are then required for proton transfer to K175. It thus seems unlikely that the Nζ atom of K175 provides the acid-base group responsible for deprotonation of C-3 in the enolization reaction. However, it is perfectly situated for protonation of the carbanion of PGA that is formed in a subsequent reaction by hydrolysis of the six-carbon reaction intermediate. In fact, K175 is the only residue that is suitably positioned to provide an acid-base group for this step. The Nζ atom of K175 is only 4.3 Å from C-2 of CABP in our model and can easily be brought within 3.7 Å simply by dihedral rotations of the side-chain.

If K175 does not catalyse the enolization reaction, then what other possible candidates does our present structure suggest? If O-3 is in cis position to O-2, we find Nε of H327, Nδ of H294, OH of S379 and one of the oxygen atoms of the carbamate group in such positions that they might serve to accept the C-3 proton. If, on the other hand, O-3 is in trans position, corresponding groups are Nζ of K177, Nζ of K334 and the side-chain of N123. Even though K334 is essential in later stages of the catalytic mechanism, as will be discussed later, this side-chain cannot be involved in deprotonation of C-3, since even drastic mutations such as K334G do not abolish the capacity to catalyse the enolization reaction (Hartman & Lee, 1989). Of the two remaining candidates, K177, although further away than N123, seems the most likely.

An interesting hypothesis for deprotonation of C-3 coupled to protonation of the aci-acid form of PGA may then be formulated, based on the assumption that K177 functions in the enolization reaction and K175 is involved in the final stereospecific protonation of the C-2 carbanion of PGA that completes product formation. Initially, K175 is deprotonated and K177 is protonated before substrate binding has occurred. Binding of RuBP places the P1 phosphate group in the vicinity of K175. The negative charge of this phosphate group should enhance the tendency of the amino group of K175 to accept a proton by stabilization of the positive charge thus formed. Instead of abstracting a proton from C-3, which in our model is too far away, we suggest that K175 abstracts a proton from the second lysine residue, K177, which is nearby. The distance between the two amino groups is 4.9 Å in our structure but can easily be reduced to below 3.0 Å by small movements of the side-chains of K175 and K177. Abstraction by K177 of the C-3 proton from RuBP to create the 2,3-enediol would then regenerate the positive charge on this lysine residue. The two adjacent positive charges thus formed would be stabilized by their interactions with the metal ligands D203 and E204 in addition to the phosphate group. If, after hydration and carbon-carbon cleavage, K175 were to donate the proton initially abstracted from K177 to the carbanion of phosphoglycerate, the cycle would be


completed. While being more compatible with the structure, this scheme also seems to accommodate all the existing biochemical and mutagenesis data. That K175 is vital for the enolization reaction has already been clearly demonstrated (Lorimer & Hartman, 1988), whereas no information pertaining to the importance of K177 in this reaction has been published. The possible role of K175 in the final, stereospecific, protonation step could easily be tested by determining if hydrolysis of the six-carbon reaction intermediate using position 175 mutant enzymes gives 3-D-PGA as the only product or an epimeric mixture of D and L-PGA (Siegel & Lane, 1973).

(h) Possible roles of a flexible loop in the active site during catalysis

The sixth loop at the C-terminal end of the barrel contains a number of strictly conserved residues (Fig. 15). One of these residues, K334, has attracted special attention due to its low pKa and enhanced nucleophilicity (Hartman et al., 1985). A number of mutations at this position have been described (Hartman & Lee, 1989; Soper et al., 1988). None of the mutant enzymes exhibits any detectable carboxylation activity, nor are they able to form a stable quaternary complex with CABP in the presence of CO2 and Mg2+. However, these mutant enzymes still retain 2 to 6% of the wild-type activity in the enolization reaction, showing that the metal-binding site is not greatly distorted by these mutations.

In the structure of the non-activated R. rubrum enzyme (Schneider et al., 1990), loop 6 is flexible, whereas in the spinach quaternary complex this loop has a well-defined conformation. K334 is engaged in ionic interactions with both the P1 phosphate group and the carboxyl group at C-2 of CABP (Fig. 27). L335, the residue following K334, forms one of the few apolar interactions between the protein and the reaction-intermediate analogue.

It is well established that binding of the reaction-intermediate analogue CABP to the active site of Rubisco is a two-step process. Initial reversible binding is followed by a slower, almost irreversible, step which is accompanied by conformational changes that extend to the surface of the protein (Johal et al., 1985; Pierce et al., 1980). We suggest that the initial binding of CABP induces a fixed conformation of loop 6 by interactions between loop 6 and the regions of CABP that are accessible when the loop is flexible. After this conformational change, CABP is trapped and buried, which is reflected in the extremely tight binding of CABP (Pierce et al., 1980). In our structure, loop 6 forms a "flap", which closes the entrance to the active site and buries most of CABP. Solvent accessibility calculations indicate that CABP is completely shielded from solution in the quaternary complex. In contrast, when the residues in loop 6 are not included in the calculation, both the carboxyl group at C-2 and the P2 phosphate group become accessible to solvent.

In addition to the interactions with CABP, several residues in loop 6 are involved in extensive interactions across the subunit-subunit interface in the L2 dimer. K334A, which is at the tip of the loop in the A subunit, packs in a pocket defined by the side-chains of L335A, F467A, W66B and F127B as well as the main-chain residues 380-381A and also forms a salt-link to E60B in the N-terminal domain of the second subunit in the L2 dimer (Fig. 30). Furthermore, the C terminus of the large subunit, where the final 13 residues form an extended chain, packs on top of loop 6 with numerous hydrogen bonds and ion-pair interactions keeping it fixed. The C terminus thus acts as a "bolt" that locks the initially flexible loop 6 in position. Presumably, the ionic interactions between K334 at the tip of loop 6 and CABP as well as E60 would also be important for holding loop 6 closed during the later stages of the carboxylation reaction, as is also indicated by the inability of position 334 mutant enzymes to

Figure 30. Stereo diagram illustrating the surroundings of K334 in loop 6 of the α/β-barrel domain of the large subunit in the quaternary complex of activated spinach Rubisco and CABP.


form a stable, exchange-inert, quaternary complex with CO2, Mg2+ and CABP (Soper et al., 1988). This property is also lacking in a R. rubrum mutant enzyme where the residue corresponding to E60 has been changed to glutamine (Hartman & Lee, 1989).

The notion that loop 6 is flexible in the initial stages of the carboxylation reaction and then folds over to close the active site is further strengthened by the observation that binding of CABP to activated spinach Rubisco changes the tryptic pattern of the L subunit so that proteolysis at K466 no longer occurs (Mulligan et al., 1988). The region around K466 in our model is at the surface of the L subunit, with the side-chain of K466 pointing into solution. The side-chain of F467 packs against the backbone of loop 6, residues G333 and K334, and also makes contact with the side-chain of K334 in the active site (Fig. 30). Movements of loop 6 would be transmitted to the C terminus of the L-chain, which accounts for the change in tryptic susceptibility seen on binding of CABP.

Since all K334 mutants catalyse the enolization of RuBP to some extent, while being inactive in the overall carboxylation reaction, the catalytic role of this residue, if any, must be at some later step in the reaction mechanism but still before carbon-carbon cleavage. The position of K334 in our structure, where there is a salt-link between the amino group of this residue and the carboxyl group of CABP, indicates that K334 may play a role in facilitating the addition of the gaseous substrates to the enediol, either by stabilizing the enediol or by direct interaction with CO2 and O2. This residue might also, in concert with the metal ion, stabilize the transition states for CO2 and O2 addition. E.p.r. spectra indicate that one of the oxygen atoms in the hydroperoxide intermediate formed by attack of O2 on the enediol (Lorimer et al., 1973) is directly co-ordinated to the active site metal ion (Brändén et al., 1984b; Styring & Brändén, 1985a). Inspection of the model suggests that the bridge oxygen could occupy approximately the same position as the carboxyl carbon atom of CABP, leaving the second oxygen free to interact with K334. Thus, the position of K334 is consistent with an active role for this residue in the addition of the gaseous substrates to the 2,3-enediol. The apparent lack of addition of CO2 or O2 to the enediol generated by the position 334 mutant enzymes is also indicative of such a role (Hartman & Lee, 1989).

(i) Suppressor mutants affect the flexible loop

Chen & Spreitzer (1989) have reported a mutant in the flexible loop region in the green alga Chlamydomonas reinhardtii, the catalytic effect of which may be partly suppressed by a second mutation. The first mutation, which causes V331 to be replaced by alanine, reduces the relative specificity of the enzyme by 37%. A second mutation, whereby T342 is replaced by isoleucine, increases the relative specificity of the initial mutant enzyme by 33%. V331 is the fourth residue in loop 6 and interacts with T342 at the N terminus of helix α6 at the top of the circular core in the α/β-barrel. The second mutation is thus strictly compensating, the space made available in the hydrophobic core by changing V331 to A331 being partially filled by the δ-methyl group of I342. The observation that mutations in this loop have an effect on the relative specificity is interesting and further strengthens the hypothesis that K334 is involved in stabilizing the 2,3-enediol of RuBP and/or addition of the gaseous substrates. The fact that cyanobacterial large subunits are quite variable and different from the plant L chains at the N terminus of helix α6 may also have some bearing on this subject. A number of residues at the beginning of loop 6 are involved in contacts with these residues and these contacts may be important in determining both the flexibility and the exact conformation of loop 6. These interactions will be different in cyanobacteria and plants, and thus the relatively low specific activity of the cyanobacterial Rubisco molecules as compared to the plant enzymes may be a consequence of these different interactions.
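The two percentages quoted from Chen & Spreitzer (1989) can be combined into a rough estimate of the double mutant's relative specificity. The sketch below is only an illustration: it assumes that the 33% increase is measured relative to the V331A single mutant (as the wording above suggests) and that the two changes compound multiplicatively. The function name `compound_relative_change` is hypothetical and is not taken from any of the cited work.

```python
def compound_relative_change(*fractional_changes):
    """Multiply successive fractional changes into one factor relative to the start.

    Each argument is a signed fraction, e.g. -0.37 for a 37% decrease.
    Assumption (not stated in the source): changes compound multiplicatively.
    """
    factor = 1.0
    for change in fractional_changes:
        factor *= 1.0 + change
    return factor


# Relative specificity as a fraction of wild type (wild type = 1.0).
single_mutant = compound_relative_change(-0.37)        # V331A alone
double_mutant = compound_relative_change(-0.37, 0.33)  # V331A followed by T342I

print(f"V331A:       {single_mutant:.2f} of wild type")
print(f"V331A/T342I: {double_mutant:.2f} of wild type")
```

Under these assumptions the double mutant recovers to roughly 0.84 of the wild-type relative specificity, compared with 0.63 for the single mutant, which is in line with the description of the suppression as partial.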

This work was supported by grants from the Swedish research councils NFR and SJFR as well as from E. I. du Pont de Nemours and Co., U.S.A. We thank the staff at the Daresbury synchrotron radiation facility, U.K., for assistance with the data collection.

References

Aldrich, J., Cherney, B., Merlin, E. & Palmer, J. (1986). Nucl. Acids Res. 14, 9534.

Anderson, K. & Caton, J. (1987). J. Bacteriol. 169, 4547-4558.

Andersson, I. & Brändén, C.-I. (1984). J. Mol. Biol. 172, 363-366.

Andersson, I., Tjäder, A.-C., Cedergren-Zeppezauer, E. & Brändén, C.-I. (1983). J. Biol. Chem. 258, 14088-14090.

Andersson, I., Knight, S., Schneider, G., Lindqvist, Y., Lundqvist, T., Brändén, C.-I. & Lorimer, G. H. (1989). Nature (London), 337, 229-234.

Andrews, T. J. (1988). J. Biol. Chem. 263, 12213-12219.

Andrews, T. J. & Abel, K. M. (1981). J. Biol. Chem. 256, 8445-8451.

Andrews, T. J. & Ballment, B. (1983). J. Biol. Chem. 258, 7514-7518.

Andrews, T. J. & Lorimer, G. H. (1987). In The Biochemistry of Plants (Hatch, M. D., ed.), vol. 10, pp. 131-218, Academic Press, Orlando, FL.

Andrews, T. J., Lorimer, G. H. & Tolbert, N. E. (1973). Biochemistry, 12, 11-18.

Andrews, T. J., Abel, K. M., Menzel, D. & Badger, M. R. (1981). Arch. Microbiol. 130, 344-348.

Asami, S., Takabe, T., Akazawa, T. & Codd, G. A. (1983). Arch. Biochem. Biophys. 225, 713-721.

Baker, T. S., Eisenberg, D., Eiserling, F. A. & Weisman, L. (1975). J. Mol. Biol. 91, 391-399.

Baker, T. S., Eisenberg, D. & Eiserling, F. (1977a). Science, 196, 293-295.

Baker, T. S., Suh, S. W. & Eisenberg, D. (1977b). Proc. Nat. Acad. Sci., U.S.A. 74, 1037-1041.

Barcena, J. A., Pickersgill, R. W., Adams, M. J., Phillips, D. C. & Whatley, F. R. (1983). EMBO J. 2, 2363-2367.


Belknap, W. R. & Portis, A. R. (1986). Biochemistry, 25, 1864-1869.

Bowes, G., Ogren, W. L. & Hageman, R. H. (1971). Biochem. Biophys. Res. Commun. 45, 716-722.

Bowien, B. & Gottschalk, E.-M. (1982). J. Biol. Chem. 257, 11845-11847.

Bowien, B. & Mayer, F. (1978). Eur. J. Biochem. 88, 97-107.

Bowien, B., Mayer, F., Codd, G. A. & Schlegel, H. G. (1976). Arch. Microbiol. 110, 157-166.

Bowien, B., Mayer, F., Spiess, E., Pühler, A., Englisch, U. & Saenger, W. (1980). Eur. J. Biochem. 106, 405-410.

Brändén, R., Nilsson, T. & Styring, S. (1984a). Biochemistry, 23, 4373-4378.

Brändén, R., Nilsson, T. & Styring, S. (1984b). Biochemistry, 23, 4378-4382.

Bricogne, G. (1976). Acta Crystallogr. sect. A, 32, 832-847.

Brünger, T. A. (1988). X-PLOR (Version 1.5) Manual.

Brünger, T. A., Kuriyan, J. & Karplus, M. (1987). Science, 235, 458-460.

Chapman, M. S., Se Won Suh, Cascio, D., Smith, W. W. & Eisenberg, D. (1987). Nature (London), 329, 354-356.

Chapman, M. S., Se Won Suh, Curmi, P. M. G., Cascio, D., Smith, W. W. & Eisenberg, D. S. (1988). Science, 241, 71-74.

Chen, Z. & Spreitzer, R. J. (1989). J. Biol. Chem. 264, 3051-3053.

Choe, H.-W., Jakob, R., Hahn, U. & Pal, G. P. (1985). J. Mol. Biol. 185, 781-783.

Choe, H.-W., Georgalis, Y. & Saenger, W. (1989). J. Mol. Biol. 207, 621-623.

Curtis, S. E. & Haselkorn, R. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 1835-1839.

Donnelly, M. I., Stringer, C. D. & Hartman, F. C. (1983). Biochemistry, 21, 4346-4352.

Donnelly, M. I., Hartman, F. C. & Ramakrishnan, V. (1984). J. Biol. Chem. 259, 406-411.

Dron, M., Rahire, M. & Rochaix, J.-D. (1982). J. Mol. Biol. 162, 775-793.

Fitchen, J. H., Knight, S., Andersson, I., Brändén, C.-I. & McIntosh, L. (1990). Proc. Nat. Acad. Sci., U.S.A. In the press.

Fluhr, R., Moses, P., Morelli, G., Coruzzi, G. & Chua, N.-H. (1986). EMBO J. 5, 2063-2071.

Gingrich, J. C. & Hallick, R. B. (1985). J. Biol. Chem. 260, 16162-16168.

Goldschmidt-Clermont, M. & Rahire, M. (1986). J. Mol. Biol. 191, 421-432.

Gutteridge, S., Sigal, I., Thomas, B., Arentzen, R., Cordova, A. & Lorimer, G. H. (1984). EMBO J. 3, 2737-2743.

Gutteridge, S., Millard, B. N. & Parry, M. A. J. (1986). FEBS Letters, 196, 263-268.

Hartman, F. C. & Lee, E. H. (1989). J. Biol. Chem. 264, 11784-11789.

Hartman, F. C., Stringer, C. D. & Lee, E. H. (1984). Arch. Biochem. Biophys. 232, 280-295.

Hartman, F. C., Milanez, S. & Lee, E. H. (1985). J. Biol. Chem. 260, 13968-13975.

Hartman, F. C., Foote, R. S., Larimer, F. W., Lee, E. H., Machanoff, R., Milanez, S., Mitra, S., Mural, R. J., Niyogi, S. K., Smith, H. B., Soper, T. S. & Stringer, C. D. (1987a). In Plant Molecular Biology (von Wettstein, D. & Chua, N.-H., eds), pp. 9-20, Plenum Press, New York.

Hartman, F. C., Soper, T. S., Niyogi, S. K., Mural, R. J., Foote, R. S., Mitra, S., Lee, E. H., Machanoff, R. & Larimer, F. (1987b). J. Biol. Chem. 262, 3496-3501.

Hendrickson, W. A. & Lattman, E. E. (1970). Acta Crystallogr. sect. B, 26, 136-143.

Hol, W. G. J., van Duijnen, P. T. & Berendsen, H. J. C. (1978). Nature (London), 273, 443-446.

Holzenburg, A., Mayer, F., Harauz, G., van Heel, M., Tokuoka, R., Ishida, T., Harata, K., Pal, G. P. & Saenger, W. (1987). Nature (London), 325, 730-732.

Houtz, R. L., Stults, J. T., Mulligan, R. M. & Tolbert, N. E. (1989). Proc. Nat. Acad. Sci., U.S.A. 86, 1855-1859.

Igarashi, Y., McFadden, B. A. & El-Gul, T. (1985). Biochemistry, 24, 3957-3962.

Incharoensakdi, A., Takabe, T., Takabe, T. & Akazawa, T. (1985a). Arch. Biochem. Biophys. 237, 445-453.

Incharoensakdi, A., Takabe, T., Takabe, T. & Akazawa, T. (1985b). Biochem. Biophys. Res. Commun. 126, 698-704.

Janin, J., Miller, S. & Chothia, C. (1988). J. Mol. Biol. 204, 155-164.

Janson, C. A., Smith, W. W., Eisenberg, D. & Hartman, F. C. (1984). J. Biol. Chem. 259, 11594-11596.

Johal, S., Partridge, B. E. & Chollet, R. (1985). J. Biol. Chem. 260, 9894-9904.

Jones, T. A. (1978). J. Appl. Crystallogr. 11, 268-272.

Jones, T. A. & Thirup, S. (1986). EMBO J. 5, 819-822.

Jordan, D. B. & Chollet, R. (1985). Arch. Biochem. Biophys. 236, 487-496.

Jordan, D. B., Chollet, R. & Ogren, W. L. (1983). Biochemistry, 22, 3410-3418.

Kabsch, W. & Sander, C. (1983). Biopolymers, 22, 2577-2637.

Kawashima, N. & Wildman, S. G. (1971). Biochim. Biophys. Acta, 229, 749-760.

Knight, S., Andersson, I. & Brändén, C.-I. (1989). Science, 244, 702-705.

Lee, B. & Richards, F. M. (1971). J. Mol. Biol. 55, 379-400.

Lesk, A. M., Brändén, C.-I. & Chothia, C. (1989). Proteins, 5, 139-148.

Lindqvist, Y. & Brändén, C.-I. (1985). Proc. Nat. Acad. Sci., U.S.A. 82, 6855-6859.

Lorimer, G. (1981). Biochemistry, 20, 1236-1240.

Lorimer, G. H. & Hartman, F. C. (1988). J. Biol. Chem. 263, 6468-6471.

Lorimer, G. H. & Miziorko, H. M. (1980). Biochemistry, 19, 5321-5328.

Lorimer, G. H., Andrews, T. J. & Tolbert, N. E. (1973). Biochemistry, 12, 18-23.

Lorimer, G. H., Badger, M. R. & Andrews, T. J. (1976). Biochemistry, 15, 529-536.

Lorimer, G. H., Gutteridge, S. & Madden, M. W. (1987). In Plant Molecular Biology (von Wettstein, D. & Chua, N.-H., eds), pp. 21-31, Plenum Press, New York.

Lundqvist, T. & Schneider, G. (1988). J. Biol. Chem. 263, 3643-3646.

Lundqvist, T. & Schneider, G. (1989). J. Biol. Chem. 264, 7078-7083.

Martin, P. G. (1979). Aust. J. Plant Physiol. 6, 401-408.

Matsuoka, M., Kano-Murakami, Y., Tanaka, Y., Ozeki, Y. & Yamamoto, N. (1987). J. Biochem. 102, 673-676.

Matthews, B. M. (1968). J. Mol. Biol. 33, 491-497.

Mazur, B. J. & Chui, C.-F. (1985). Nucl. Acids Res. 13, 2373-2386.

McFadden, B. A. & Small, C. L. (1988). Photosynth. Res. 18, 245-260.

McIntosh, L., Poulsen, C. & Bogorad, L. (1980). Nature (London), 288, 556-560.


Meisenberger, O., Pilz, I., Bowien, B., Pal, G. P. & Saenger, W. (1984). J. Biol. Chem. 259, 4463-4465.

Miller, S., Janin, J., Lesk, A. M. & Chothia, C. (1987). J. Mol. Biol. 196, 641-656.

Miziorko, H. M. & Lorimer, G. H. (1983). Annu. Rev. Biochem. 52, 507-535.

Miziorko, H. M. & Sealy, R. C. (1980). Biochemistry, 19, 1167-1171.

Miziorko, H. M. & Sealy, R. C. (1984). Biochemistry, 23, 479-485.

Mulligan, R. M., Houtz, R. L. & Tolbert, N. E. (1988). Proc. Nat. Acad. Sci., U.S.A. 85, 1513-1517.

Nakagawa, H., Sugimoto, M., Kai, Y., Harada, S., Miki, K., Kasai, N., Saeki, K., Kakuno, T. & Horio, T. (1986). J. Mol. Biol. 191, 577-578.

Nargang, F., McIntosh, L. & Somerville, C. (1984). Mol. Gen. Genet. 193, 220-224.

Nierzwicki-Bauer, S. A., Curtis, S. E. & Haselkorn, R. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 5961-5965.

Nilsson, T., Brändén, R. & Styring, S. (1984). Biochim. Biophys. Acta, 788, 274-280.

North, A. C. T., Phillips, D. C. & Mathews, F. S. (1968). Acta Crystallogr. sect. A, 24, 351-359.

Pal, G. P., Jakob, R., Hahn, U., Bowien, B. & Saenger, W. (1985). J. Biol. Chem. 260, 10768-10770.

Parry, M. A. J., Schmidt, C. N. G., Cornelius, M. J., Keys, A. J., Millard, B. N. & Gutteridge, S. (1985). J. Expt. Bot. 36, 1396-1404.

Phillips, D. C., Sternberg, M. J. E., Thornton, J. M. & Wilson, I. A. (1978). J. Mol. Biol. 119, 329-351.

Pierce, J. & Reddy, G. S. (1986). Arch. Biochem. Biophys. 245, 483-493.

Pierce, J., Tolbert, N. E. & Barker, R. (1980). Biochemistry, 19, 934-942.

Pierce, J., Andrews, J. & Lorimer, G. H. (1986). J. Biol. Chem. 261, 10248-10256.

Priestle, J. P. (1988). J. Appl. Crystallogr. 21, 572-576.

Quayle, J. R., Fuller, R. C., Benson, A. A. & Calvin, M. (1954). J. Amer. Chem. Soc. 76, 3610-3611.

Rao, S. T. & Rossmann, M. G. (1973). J. Mol. Biol. 76, 241-256.

Remington, S., Wiegand, G. & Huber, R. (1982). J. Mol. Biol. 158, 111-152.

Rossmann, M. G. (1979). J. Appl. Crystallogr. 12, 225-238.

Rutner, A. C. & Lane, M. D. (1967). Biochem. Biophys. Res. Commun. 28, 531-537.

Sailland, A., Amiri, I. & Freyssinet, G. (1986). Plant Mol. Biol. 7, 213-218.

Schmid, M. F., Weaver, L. H., Holmes, M. A., Gruetter, M. G., Ohlendorf, D. H., Reynolds, R. H., Remington, S. J. & Matthews, B. W. (1981). Acta Crystallogr. sect. A, 37, 701-710.

Schneider, G., Brändén, C.-I. & Lorimer, G. (1986a). J. Mol. Biol. 187, 141-143.

Schneider, G., Lindqvist, Y., Brändén, C.-I. & Lorimer, G. (1986b). EMBO J. 5, 3409-3415.

Schneider, G., Knight, S., Andersson, I., Brändén, C.-I., Lindqvist, Y. & Lundqvist, T. (1990). EMBO J. 2045-2050.

Schneider, G., Lindqvist, Y. & Lundqvist, T. (1990). J. Mol. Biol. 211, 989-1008.

Shinozaki, K. & Sugiura, M. (1982). Gene, 20, 91-102.

Shinozaki, K. & Sugiura, M. (1983). Nucl. Acids Res. 11, 6957-6964.

Shinozaki, K., Yamada, C., Takahata, N. & Sugiura, M. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 4050-4054.

Siegel, M. I. & Lane, M. D. (1973). J. Biol. Chem. 248, 5486-5498.

Sim, G. A. (1959). Acta Crystallogr. 12, 813.

Soper, T. S., Mural, R. J., Larimer, F. W., Lee, E. H., Machanoff, R. & Hartman, F. C. (1988). Protein Eng. 2, 39-44.

Steigemann, W. (1974). Dissertation, Technische Universität, München.

Styring, S. & Brändén, R. (1985a). Biochemistry, 24, 6011-6019.

Styring, S. & Brändén, R. (1985b). Biochim. Biophys. Acta, 832, 113-118.

Takabe, T., Incharoensakdi, A. & Akazawa, T. (1984a). Biochem. Biophys. Res. Commun. 122, 763-769.

Takabe, T., Rai, A. K. & Akazawa, T. (1984b). Arch. Biochem. Biophys. 229, 202-211.

Turner, N. E., Clark, W. G., Tabor, G. J., Hironaka, C. M., Fraley, R. T. & Shah, D. M. (1986). Nucl. Acids Res. 14, 3325-3342.

Vater, J., Gaudszun, T., Lange, B., Erdin, N. & Salnikow, J. (1983). Z. Naturforsch., C: Biosci. 383,418-427.

Viale, A. M., Kobayashi, H. & Akazawa, T. (1989). J. Bacteriol. 171, 2391-2400.

Voordouw, G., Van der Vies, S. M. & Bouwmeister, P. P. (1984). Eur. J. Biochem. 141, 313-318.

Voordouw, G., de Vries, P. A., Van den Berg, W. A. M. & de Clerck, E. P. J. (1987). Eur. J. Biochem. 163, 591-598.

Wang, B.-C. (1985). In Methods in Enzymology (Wyckhoff, H. W., Hirs, C. H. W. & Timasheff, S. N., eds), vol. 15, pp. 90-112, Academic Press, London.

Weissbach, A., Smyrniotis, P. Z. & Horecker, B. L. (1954). J. Amer. Chem. Soc. 76, 3611-3612.

Zurawski, G., Perrot, B., Bottomley, W. & Whitfeld, P. R. (1981). Nucl. Acids Res. 9, 3251-3270.

Zurawski, G., Whitfeld, P. R. & Bottomley, W. (1986). Nucl. Acids Res. 14, 3975.

Edited by R. Huber