The differences between the structural repertoires of V H germ-line gene segments of mice and...


Molecular Immunology, Vol. 34, No. 16-17, pp. 1199-1214, 1997

Pergamon. PII: S0161-5890(97)00118-1

© 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain

0161-5890/97 $17.00 + 0.00

THE DIFFERENCES BETWEEN THE STRUCTURAL REPERTOIRES OF VH GERM-LINE GENE SEGMENTS OF MICE AND HUMANS: IMPLICATION FOR THE MOLECULAR MECHANISM OF THE IMMUNE RESPONSE

JUAN CARLOS ALMAGRO,*‡ ISMAEL HERNANDEZ,* MARIA DEL CARMEN RAMIREZ* AND ENRIQUE VARGAS-MADRAZO†

* Instituto de Biotecnología, Universidad Nacional Autónoma de México, Apdo Postal 04510-3, Cuernavaca, Morelos 62250, Mexico; † Instituto de Investigaciones Biológicas, Universidad Veracruzana, Araucarias 280 Col. Animas, Xalapa, Ver. 91190, Mexico.

(First received 2 June 1997; accepted in revised form 4 September 1997)

Abstract: Although human and murine antibodies are similar in their diversification strategies, they differ in the proportion in which κ and λ type chains are present in their respective VL repertoires. It has been shown that this difference implies a divergence in the structural repertoire of the κ and λ genes of these species. Nonetheless, the differences in VH have not been systematically studied. In this paper a systematic characterization of the VH structural repertoire of mice is made, so that a comparison with the VH structural repertoire of humans, described in detail elsewhere, could be properly accomplished. Our study shows the structural repertoire of mice to be dominated by canonical structure class 1-2 (~60%), while in humans the dominant one is class 1-3 (~40%). Analysis of the evolutionary relationships between humans and mice suggests that this divergence may have a functional meaning. The implications of these findings are discussed. © 1997 Elsevier Science Ltd. All rights reserved.

Key words: immunoglobulin, Ig, canonical structures, VH repertoire, structural repertoire.

INTRODUCTION

The antigen-binding site of antibodies consists of six hypervariable loops: three from VH and three from VL, denoted H1, H2, H3 and L1, L2, L3, respectively (Wu and Kabat, 1970; Kabat and Wu, 1971; Poljak et al., 1973). Although there is great sequence variability in these regions (Wu and Kabat, 1970; Kabat and Wu, 1971), it has been shown that, except for H3, the remaining five hypervariable loops adopt one of a small set of main-chain conformations, or canonical structures (Chothia and Lesk, 1987; Chothia et al., 1989). On that basis, it has been found that, of the total number of possible combinations of canonical structures, only a few occur in the known antibody sequences; this set is termed the structural repertoire (Chothia et al., 1992; Tomlinson et al., 1995; Vargas-Madrazo et al., 1995a; Vargas-Madrazo et al., 1995b; Almagro et al., 1996). Furthermore, it has been suggested that the antigen-binding site shapes allowed by the structural repertoire correlate with the kind of antigen the antibody interacts with (Vargas-Madrazo et al., 1995a; Lara-Ochoa et al., 1996). Taken together, these findings provide evidence of structural restrictions at work in the process of antigen recognition.

‡ Author to whom all correspondence should be addressed. Tel.: (52) (73) 291605; Fax: (52) (73) 172388; e-mail: almagro@ibt.unam.mx

Abbreviations: VH, variable heavy domain; VL, variable light domain.

Genetically, the structural repertoires of human and murine antibodies are generated in a similar fashion (Weill and Reynaud, 1996): L1, L2 and most of L3 are encoded in the VL gene segments (κ and λ type), while H1 and H2 are encoded in the VH gene segments (Tonegawa, 1983). In spite of this similarity, it has been noticed that the corresponding repertoires of humans and mice differ in the relative proportion in which κ and λ type chains are present in VL. In humans, roughly 60% of the VL repertoire is κ type [40 functional Vκ germ-line genes versus 30 functional Vλ germ-line genes (Klein et al., 1993; Tomlinson et al., 1995; Williams et al., 1996)]. In mice, the κ type predominates, accounting for as much as 95% (Hood et al., 1967). Such divergence implies differences in the structural repertoire of human and murine Vκ and Vλ germ-line genes (Williams et al., 1996; Almagro et al., 1998) and, consequently, differences in the initial structural restrictions operating to recognize different types of antigens.

Although differences in VH are less evident, recent studies we made on the rearranged VH sequences of mice indicate that the combination of canonical structures most frequently used is the 1-2 class (combination of canonical structures in H1 and H2) (Vargas-Madrazo et al., 1995a; Lara-Ochoa et al., 1996). In contrast, human VH germ-line genes, which have been thoroughly characterized (Cook and Tomlinson, 1995), have been shown to encode predominantly canonical structure class 1-3 (Chothia et al., 1992; Vargas-Madrazo et al., 1995b). We have also found the same difference at the pseudogene level (Vargas-Madrazo et al., 1995b). This suggests that VH germ-line gene segments of mice and humans may encode different structural repertoires in VH too.

Such a difference, however, has not been properly characterized, partly because the structural repertoire of the mouse VH germ-line genes has not been systematically studied. A proper characterization could provide insight into the theories addressing the origin, organization, complexity and use of VH genes. Furthermore, if such differences in VH do exist, then, taken together with the structural divergence in the repertoire of VL germ-line genes, they could shed light on the different structural constraints at work when antigen recognition takes place in human and mouse (Vargas-Madrazo et al., 1995a). In addition, such a characterization might prove useful as a criterion for choosing human VH genes for the humanization of murine antibodies (Poul and Lefranc, 1995).

In this paper we compiled the published information on VH germ-line gene segments of mice in order to characterize their structural repertoire. Comparison with the human counterpart corroborates the differences found in rearranged sequences and pseudogenes. The implications of these findings for the molecular mechanism of the immune response are discussed.

MATERIAL AND METHODS

The germ-line VH gene segments of mice

We compiled all of the Mus musculus VH gene segments reported as germ-line genes or pseudogene sequences in GenBank and LIGM, as well as in the literature up to April 1996. We found a total of 295 VH gene segments and immediately discarded 42 of them because they were duplicates (different accession numbers but identical entries) or lacked one or both hypervariable loops (see web site http://www.ibt.unam.mx/~almagro for a full description of the sequences).

Of the remaining 253 VH gene segments, some were identical at the nucleotide level. We considered these to be the same VH gene segment, because currently available information does not allow us to distinguish whether such sequences are recent copies of a particular VH gene segment in the mouse genome or have simply been sequenced more than once.

There were also pairs of sequences with one or two nucleotide differences (99.6% and 99.2% identity, respectively). Sequences differing only by silent mutations (100% identical at the amino acid level) were also considered to be the same gene segment, because they might be alleles in different individuals or in different strains of mouse. Sequences in which the nucleotide difference resulted in replacements (different amino acid sequences) were considered distinct VH gene segments. Although this might seem very conservative, we relied on it because there is no established criterion for defining alleles based only on the analysis of nucleotide identities. Thus, we preferred to include in the analysis all sequences differing by at least one amino acid, in order to avoid underestimating the available information.
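The collapsing rule described above (identical nucleotide sequences, or sequences differing only by silent mutations, count as one gene segment; any amino acid replacement defines a distinct segment) can be sketched as follows. This is a minimal illustration, not the procedure actually used for the compilation: it assumes in-frame coding sequences, uses the standard genetic code, and groups by translated sequence regardless of how many silent differences there are. The function names and the toy sequences are hypothetical.

```python
from collections import OrderedDict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    seq = nucleotides.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def collapse_segments(records):
    """Group (name, nucleotide sequence) pairs by their translated sequence.

    Entries that are identical, or that differ only by silent mutations,
    end up in the same group; any replacement starts a new group.
    """
    groups = OrderedDict()
    for name, nucleotides in records:
        groups.setdefault(translate(nucleotides), []).append(name)
    return groups


# Toy example (made-up 9-nucleotide fragments, not real VH sequences).
records = [
    ("seg_a", "CAGGTGCAG"),  # Gln Val Gln
    ("seg_b", "CAGGTGCAG"),  # identical to seg_a
    ("seg_c", "CAGGTCCAG"),  # silent change GTG -> GTC (both Val)
    ("seg_d", "CAGATGCAG"),  # replacement Val -> Met
]
for protein, names in collapse_segments(records).items():
    print(protein, names)
```

In this toy input the first three entries fall into one group and the fourth forms its own group, mirroring the distinction between silent and replacement differences.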

A single exception was made for the genes belonging to the S107 (VH7) family, which has been well characterized in two strains of mice: BALB/c (Crews et al., 1981) and C57BL/10 (Perlmutter et al., 1985). In this family we have taken into account only the alleles of BALB/c (the most represented strain within the compilation; see below), even though those from C57BL/10 differ by more than one amino acid from the BALB/c sequences. In this way, we finally gathered 185 sequences as representative of the mouse VH locus.

Classification of the known VH gene segments in gene families

VH gene segments in mice have been classified into 15 families based on Southern blot hybridization and sequencing (Brodeur and Riblet, 1984; Kofler et al., 1992; Mainville et al., 1996). Each family is represented by a prototype member that gives the family its name (Kofler et al., 1992; Mainville et al., 1996). VH sequences within a family share an identity of at least 80%, whereas sequences belonging to different families are at most 75% identical (Brodeur and Riblet, 1984). Following these criteria, we clustered the 185 sequences into the 15 established VH families. In the case of the VH14 family, in which some members are more than 80% identical to sequences belonging to the VH1 family, the assignment was made following the criteria established by Tutter and coworkers (1991).
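The family criterion above (at least 80% identity within a family, at most 75% between families) can be illustrated with a short sketch. It assumes the sequences are already aligned and of equal length, and it assigns a sequence to the family of its most similar prototype. The function names, the decision to return no family when the best identity is below the threshold, and the toy sequences are assumptions made for illustration only. The special case of the VH14 family, which was assigned following Tutter and coworkers (1991), is not modeled.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def assign_family(sequence, prototypes, within_family=80.0):
    """Return (family, identity) for the closest prototype.

    The family is None when even the closest prototype falls below the
    within-family threshold (an assumption of this sketch).
    """
    best_family, best_identity = None, -1.0
    for family, prototype in prototypes.items():
        identity = percent_identity(sequence, prototype)
        if identity > best_identity:
            best_family, best_identity = family, identity
    if best_identity < within_family:
        return None, best_identity
    return best_family, best_identity


# Toy prototypes (made-up 10-residue strings, not real VH prototypes).
prototypes = {"fam_A": "ACGTACGTAC", "fam_B": "TTTTTGGGGG"}
print(assign_family("ACGTACGTAA", prototypes))  # 9 of 10 positions match fam_A
print(assign_family("ACGTATTTTT", prototypes))  # no prototype reaches 80%
```

With real data the identities would be computed over the aligned VH coding region, and borderline cases between 75% and 80% would need a separate rule, as the text notes for VH14.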

The resulting sequence alignment, organized by families, is given in Fig. 1 and can be retrieved from the web site http://www.ibt.unam.mx/~almagro. Within VH families, the sequences are sorted in decreasing order of similarity to the prototype member.

Determination of functional VH gene segments

Of the 185 VH gene segments depicted in Fig. 1, 47 were reported as pseudogenes in databases or in the literature (see status column of Fig. 1). We therefore assumed they had serious genetic defects and did not take them into account in determining the functional repertoire of mouse VH gene segments. The remaining 138 VH gene segments, reported as germ-line genes and potentially functional, were examined to determine whether they are expressed in vivo. VH gene segments not expressed in vivo might have defects within the coding region hindering the formation of a stable three-dimensional VH domain. Otherwise, they may have minor genetic defects outside the coding region, for example in splicing sites, regulatory

[Fig. 1(a), 1(b) and continuation. Sequence alignment of the mouse VH gene segments, VH1 (J558) family. Column headings: Ig-fold, Position, Family, Name, Rearranged gene, Status.]

G....

...TS

......

B .H

B.

....S

.P...

......

.I..M

..

..S.S

D--.Y

..'...S

H.KS

.....Y

.N.--

.N.C

.S..Q

...G.

......

TS...

...B

.H

B....

.S.P

......

....IT

....D

.S..G

--.I.N

....S

H.KS

.....B

.N.--

YN...

S..Q

...G.

......

TS...

...B.

H....

...L

......

...

T.

....I.

......

S..G

--.Y

....

..SH.

KS...

..Y.S

C--Y

N.A.

S..Q

...G.

..F...

TS...

....F

N ...

......

.....

......

S.

...M

......

.I....

T..K

.S.--

.NIB

......

BQ...

...E.

L.--G

.DY.

Y.I..

..G...

F.A.

TS.N

......

.G

......

......

. B

...

..S.P

...

......

. I..

M...

LS.S

D--.Y

..*...

SH.K

S....

.Y.N

.--.N

.C.S

..Q...

G....

...TS

......

B .H

B.

H ..

.SLP

KV..A

.P...

I.....

..S..G

--.Y.

.....S

H.KI

.QR.

BYVN

.--YN

...G.

.....D

.....A

..SF.

.....P

......

..L

......

E.

..K...

TVV.

......

.I..Q

....S

..G--.

Y....

..SHB

KS.*

...

L.I.-

-YN.

N.SN

.Q...

G....

....S

....N

.B

.C

QVQL

KBSG

ffiLV

APSQ

SLSI

TCTV

SGPS

LTG-

-YGV

NWVR

QPPG

KGLB

WLG

TIW

---GN

GSTD

YNST

LKSR

LTIT

KNSK

SQVF

L~NS

LQTD

DTAV

...

......

......

......

......

.. ..-

-.....

......

......

.M..-

--.D.

......

A ...

.. S.

S ...

......

......

......

...

..R

......

......

......

......

......

S-

-...H

......

......

..V..-

--AG.

..N...

A.M

...S.

S....

......

......

.....M

..

..R

......

......

......

......

...

S--..

.H...

......

.....V

..---S

D...N

.I.A.

....S

.S...

......

......

......

M

.. ..R

...

...

T....

......

......

...I..

.S--.

..H...

......

....W

..---S

D...N

...A.

....S

.S...

......

......

......

M

.. ..R

...

.. Q.

.....*

...

......

......

.. S-

-...H

....S

......

...V.

.---S

G....

...AP

I...S

.S...

......

F....

..A...

.M

.. ..K

...

.. Q.

.....Q

......

......

.....S

--...H

....S

......

...V.

.---S

G....

..AAF

I ...

S.

S....

.....F

......

A....

I ..

..R

.....

Q....

..Q...

......

......

..S--.

..H...

.S...

......

V..--

-SG.

.....r

JLFI

...S.

S....

.....F

......

~...I

...

.. ...

.. Q.

......

.....F

.....Y

.....S

--.EI

......

......

...V.

.---T

G...N

...A.

I...S

.S...

...L.

......

......

.I .

..VR

163.

72

1410

B.lO

e AC

38

205.

12

3-l-3

mAb

A4

1 50

12-6

91A3

CR

I-

Al2

D1.3

DB

l-453

.2

PS

PS

PS

NF

(0

) F NF

NF

PS

NF

NF

SD

(9)

F (1

) F NF

PS

(2

) F N

FSD

PS

PS

(6

) F

(1)

F PS

PS

PS

(0)

F NF

(1)

F PS

PS

PS

NF

NF

PS

PS

PS

(2)

F (0

) F PS

PS

PS

PS

PS

PS

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

Fig.

I(c

)

Ig-fo

ld'

BB

B T

B B

B B

1111

1111

1 B

BIBI

T

I IB

B 22

2222

22

TTB

BBB

T T

B BI

B1

Posit

iona

1

10

80

Fam

ily'

Nam

e' ( .

( .

. .

I....,

....

;: . .

. .

. .

. . ;

:a,

. ..(.

...

;: .

..(.

. . .

.Y..,

..,...

.;j~.

..,...

.;P...

,....,

...b,

..,...

.4"..

.,

VHZ

Q52

(con

tinue

d)

QVQr

XsSG

PGLV

APSQ

SLSI

TCTV

SGFS

LTG-

-YGV

NWVR

QPPG

KGLE

WLG

TIW

---GN

GSTD

YNST

LKSR

LTIT

KDNS

KSQV

F~NS

LQTD

DTAR

V v(

ox2)

g b

VHlO

l vH

3 36

-60

36-6

0 VH

-36-

60

d

VH-3

;-60b

SB

32

VH3A

l* VH

4 X-

24

V-H

441/

V441

b VH

55b

vH5

L 71

83

...

..Q...

...Q

......

......

.....

S--.

..H...

.S...

......

V..--

-SG.

.....A

API..

.S.S

......

...F.

.....A

N...I

..

..R

...

..Q...

...Q

......

......

.....

S--..

.H...

.S...

......

V..--

-SG.

.....~

I...S

.S...

......

F....

..SN.

..I

.. ..R

BV

QLQ

BSG

PSLV

KPSQ

TLSL

TCSV

TODS

ITS-

-DYW

NWIR

KFPG

hlKL

BYM

GYI

S---Y

SGST

YYNP

S~RI

SI~D

TS~Q

YY~~

S~SB

~ATY

Y~SL

...

......

......

......

......

.. ..-

-.....

......

......

....--

- ...

......

......

......

......

......

......

....

. ...

......

......

......

......

.. ..-

-.....

......

......

....--

-.....

......

......

......

......

......

...~

....

A ...

......

......

......

......

.. ..-

- G

......

......

......

..---

......

......

......

......

......

.. ..T

...P

.. ..A

D.

.....X

.X...

...S.

....T

...Y.

...D-

YA...

..Q...

....W

.....-

--....

.S...

......

......

....F

F ...

D.

...

.. ..G

......

.V...

.T...

I...T

GNYR

.S...

Q....

...W

I...Y

---...

TIT.

....T

..TI..

......

.FF.

BM..L

.A

.....

BVKV

IBSG

GO

LVQ

POO

SLKL

SCAA

SGFD

FSR-

-YW

MSW

VRQ

APG

KGLB

WIG

BINP

--DSS

TINY

TPSL

I(DKP

IISRD

NAKN

TLY~

S~SB

~DTA

tYYC

ARL

61-lP

p SE

-3Gb

76

-lBGb

/Wb7

183.

9D

vH71

83

.i' VH

lO-1

9 Vh

7183

WIi6

9.1)

b vH

7183

.14b

VH

283

VH37

.1b

VHB4

-psib

VH

7183

.llb

vH5.

0.

lb

VH71

83.

lob

57-lM

b/VH

7183

.12b

68

-S&

VHEI

Xb

VH6

5606

VH

22.1

b VH

7 To

7 Vl

b pB

V132

b

b Vl

l /p

BVlS

BQ

V13b

v3

b

VHE

3609

.7

CB17

H-3a

CB

17H-

la

CB17

H-lO

a CB

17H-

Ea

CB17

H-6a

CB

17H-

9=

VH36

09

...

LL...

......

......

......

.....-

-.....

......

......

.....-

- ...

......

......

......

......

......

......

......

...

LL

......

......

.N...

......

...--.

....A

......

.Q...

.....-

-G

......

......

......

......

......

......

......

.. DV

QLVs

SGGG

LVQP

GoSR

KLS~

GFTF

SS--F

~IIW

VRQA

PBKG

LBW

ISS-

-GSS

TLHY

ADTV

KORF

TISR

DNP~

LF~M

TSLR

SBDT

...

......

......

.....

..--..

......

......

......

..--

....

IY .

......

......

......

......

......

......

.. ...

.....

--YA.

S....

S...R

.....B

...--.

G.~.

P...T

......

...A.

...Y.

B.S

......

......

. . .

.K...

.L

......

......

. --Y

A.S.

...T.

..R...

..T...

--.G.

YTY.

P.S.

......

....A

....Y

...S

......

......

.. .K

....L

......

......

.--rP

.S...

.T...

R....

....N

--.OG

STY.

P....

......

...A.

...Y.

..S..K

...

......

.. ..K

......

...K.

...L.

......

......

--rP.

S....

T...R

.....T

...--.

G.Ym

.P.S

......

.....A

....Y

...S.

.K...

......

T .

B.K.

......

..K...

.L...

......

....--

rP.S

....S

...R.

....T

...--.

G.YT

Y.P.

S....

......

.A...

.Y...

S..K

......

...T

. .K

..

..L

......

. ..A

...--Y

D.S.

...T.

..R...

..T...

--.G.

YTY.

P.S.

......

....A

R...Y

...S.

......

.L

.....

B.M

......

...K.

...L.

......

......

--rP.

S....

T...R

.....T

...--.

00~.

P.S.

......

....A

..N.Y

...S.

......

.L

.....

B.K.

......

..K...

.L...

..T...

....--

Y..S

....T

...R.

....T

..G--.

G.m

.P.S

......

.....A

..N.Y

...S.

......

.L

.....

BL...

......

......

......

......

..--Y

A.S.

...T.

..R...

..A..T

--DG.

PIY*

P....

......

...A.

......

.S...

Y....

....L

.

.K...

......

......

..D--Y

..A...

...G.

.P...

.F...

--LAY

SIY.

....T

......

.B.A

....Y

.B.S

...

......

.....

B.K.

......

......

.L...

..T...

...D-

-YY.

Y....

T...R

......

..N--.

CGST

Y.P.

......

......

A....

Y...S

R.K

......

.....

.K .

. ..L

......

......

.--YY

.S...

.T...

R..L

..A.N

.--NG

DSTY

.P...

......

....A

....Y

...S.

.K...

..L

.....

...

K....

L....

......

...--Y

A.S.

...T.

..R...

..S..-

--SOG

STY.

P.S.

......

....A

R.I.Y

...S

......

......

.. ...

.....

L....

......

...--Y

..S

.. ..T

.D.R

..L..T

.N.--

NGGS

TY.P

.S...

......

..A...

.Y...

S..K

...

......

.. B.

......

......

gB.L

...

. BS

NEYB

.P.--

HD.S

...KT

..

.R..L

..A.N

.--DO

OSTY

.P..M

BR..I

.....T

.K..Y

...S.

......

.L

.....

BVKL

BBSG

GG

LVQ

PGG

SMKL

SCVA

SGFT

PSN-

-YW

MNW

VRQ

SPBK

GLB

WVA

BIRL

KSNN

YATH

YABS

VKO

RFTI

SRDD

SKSS

VYLQ

n NN

LRAB

DTGI

YYCT

TG

......

......

......

......

.....

..--

...

S....

......

....Q

...

.. D

......

......

......

......

......

......

......

. G

BVKL

VSSG

GGLV

QPGG

SLRL

SCAT

SGFT

FSD-

-FYN

BWVR

QPPG

KRLB

WIA

RN

KAND

YTTB

YSAS

VKG

RFIV

SRDT

SQSI

LYLQ

M

NALP

ABDT

AIYY

CARD

...

......

......

......

......

.. ..-

- ...

......

......

......

......

......

......

......

......

......

......

......

. ...

......

......

......

......

.. T.

--Y..S

......

..A...

LOFI

.....G

......

......

.TI..

.N...

......

.T...

..S.T

...

...

....

M...

......

.A...

...BA

...

. .T

.--Y.

.S...

.L.R

.SP.

.L.L

I.....

G....

......

...TI

...N.

.N...

....T

....A

S.T.

...K

. ...

......

....

..A...

....S

...

. .T

.--Y.

.N..H

R....

P...L

.LI..

...G.

I.....

.M...

.TI..

.N...

......

.T.S

T..S

.T

......

QV

TLKs

SGPG

ILKP

SQTL

SLTC

SFSG

FSLS

TSOM

OVGW

IRQP

SGKG

LBW

~W---

WDD

DKYY

NPSL

KSQL

TISK

TSRN

QVF~

ITSV

DTAD

TAV

......

....

..Q

......

......

......

F.

..I...

......

......

...---

......

...A.

..R...

.....N

......

......

....T

.

......

....

..QS

......

......

......

....

s ...

......

......

.Y

---...

..R...

....R

......

......

......

......

.T.

......

....

..Q

......

......

......

. N.

.I....

......

......

..---.

N....

......

.R...

.....N

......

..T...

....T

.

......

......

QS

......

......

...N.

.....S

......

......

....Y

---...

..R...

....R

......

......

......

......

.T

......

......

QS

......

.....V

.....P

....S

......

......

....Y

---..B

..H.K

.....R

......

..N...

......

......

.T

. ...

......

. ..Q

...

......

...

V...N

.F...

.S...

......

......

.Y---

..B..H

.K...

..R...

.....N

......

..T...

....T

.

....

F....

T....

....Y

..M.S

.MC.

......

V...L

..---C

NN..G

...F.

R....

......

N....

......

.P...

.T

....

Rear

rang

ed

gene

' st

atus

+

D23

Pab

419

LB8

NBO

C72-

3Al

XRPC

44

XRPC

24

RF-3

PA

N H3

7-40

H3

7-45

AS

WA2

H3

5-C6

H37-

60

MRK

lC

B5Fv

B1

3 AN

10

B112

79

68.2

DE

NQ10

.3.8

H2

20-7

ASW

Bl

B6.2

(1)

(4) (2)

(0)

(0)

(0)

(1)

(0)

(3)

(5)

(0)

(3)

(1)

(0)

(1)

(3)

(4)

(4)

(0)

(0)

(1)

PS

F 89

F 90

NF

SD

91

NFSD

92

F

93

F 94

F F F F F F F NF

F F NP

PS

NF

F F F F NFSD

F F F NFSD

PS

NF

F

95

96

97

98

99

100

101

I3

102

103

.n

104

c 10

5

106

: 10

7 10

8 8

109

2

110

e

111

112

113

114

115

116

117

118

119

120

121

(10)

F NF

NF

NF

PS

Fig.

I(d

)

Ig-fo

ld"

Posit

icmP

Fam

ily'

Nam

e"

wia

3609

.7

(con

tinue

d)

V31h

/vN

tJ-3.

1 vH

9 GA

M?-

8 VF

Ml;b

/VG

Kl$'

w4

sg

/VGK

TA~/

~S~~

16

1 VG

KCj

VNS;

b/VG

K4j

264

VFM

lb/2

81b/

VGK7

j VM

Sl:/1

41b/

VGK3

j VG

IC53

VG

K2'

VHlO

M

RL-D

NA4

MRL

-RP2

4BGk

/M

2146

9 VH

ll CP

3 vH

12

CH27

16

-A

vH13

vh

3609

N m

i14

vhem

7-13

vh

em7-

13

1

li2b-

;b/V

H2b-

3b

37A4

VH

10~/

H10b

/M33

391-

7' 17

c1;

14c3

vn

4a-3

b/H4

a-3

b

vH15

Vh

lSA

BB

B T

B B

B B

1111

1111

1 B

BIB1

T

I IB

B 22

2222

22

TT

B BB

B T

T B

BIB1

1 ,..

.,....

;~...

,....~

~...,

....;~

,b...

,....f

~...,

....;~

,ab,

..,.,.

.~~,

..,...

.;~...

,....f

~.ab

c..,.

...~~

...,

QVTL

KBSG

PGIL

KPSQ

TLSL

TCSF

SGPS

LSTS

GMGV

GWIR

QPSG

KGLB

WLA

HIW

---W

DDDK

YYNP

SLKS

QLTI

SKDT

SRNQ

VFLK

ITSV

DTAD

TASY

YCAR

V . V

......

Q....

.G.A

.T...

I.....

...LS

.L.K

.Q.R

-.....

S..--

--NN.

N....

....R

.....E

..N...

...L.

......

S~...

.~

QIQL

vQSG

pBLK

KpGE

TVKI

SCKA

SGYT

FTN-

-YGL

NWVK

QAPG

KGLK

WM

GWIN

T--Y

TGKS

TYAD

DFKG

RFAP

SLBT

SAIT

AYLQ

INNL

KNBD

MAT

YF~R

S ...

......

......

......

......

.. ..-

- ..

M...

......

......

....--

...BP

......

......

......

S....

......

......

...A

......

......

......

......

.....

..--

.. M

......

......

......

.--

.. .B

P....

......

......

..S...

......

....T

.....A

. ...

......

......

......

......

.. ..-

-..M

......

......

......

.--...

BP...

......

......

.C.S

......

.....Q

.T

.. ...

......

......

......

......

.. ..-

-..M

......

......

......

.--B.

.BP.

......

......

.....S

......

......

.T

.....

......

......

......

......

.....

..--

.. M

.....

......

......

..-

-N..B

P...B

B....

......

...S.

......

......

T....

.A.

......

......

......

......

....

..T--.

.MS.

......

......

.....-

-.S.V

P....

......

......

..S...

......

....T

.....A

.

......

......

......

......

....

..D--.

SMH.

......

......

.....-

-B..B

P....

......

......

..S...

......

....T

.....A

.

......

......

......

......

.....

..--.A

MH.

...

......

.....

.KY.

--N..B

P..G

......

......

...S.

......

......

......

A.

......

......

......

......

.. ..T

--A.M

Q..Q

KM...

....I.

....--

HS.V

PK..B

......

......

..S

......

......

......

. ...

......

....

..R...

......

..T--A

.MQ.

.QKM

......

.I....

.--HS

.VPK

..B...

......

.....S

......

S....

..T

.....

BVQ

LVBT

OO

GLV

QPK

GSL

KLSC

PASG

FSPN

T--N

AMNW

VRQ

APG

KGLB

WVA

RIRS

KSNN

YATY

YADS

VKDR

PTIS

RDDS

QSM

LYLQ

MNN

LKTB

DYYC

...

...

VWW

RM...

......

..A...

.T...

--Y...

......

......

......

..s

......

......

......

......

......

......

......

.. BV

QLLB

TGGO

LVQP

OOSR

GLSC

BGSG

FTFS

G--F

WM

SWVR

QTPG

KTLB

WIG

DINS

--~AI

NYAP

SI~R

FTIF

RDND

KSTL

YLQM

SNVR

SBDT

A~F~

RY

KPXQ

XW[T

CSIT

XFPI

TSG-

YYW

IWIR

QSPG

KPLB

~GYI

T---H

SGBT

FYNP

SLQS

PISI

TRBT

SKNQ

FFLQ

LNS~

BDT~

~~GD

GA

VQBS

GPoL

V.NS

.S.F

Ln...

.G...

...-..

......

......

......

.---.W

BNPL

QPIP

SRA.

S....

......

......

......

......

A ..

QVQL

VBTG

GGLV

RPGN

SLKL

SCVT

SOPT

BSN-

-YRM

HWLR

QPPG

KRLB

WIA

VI~D

~~~S

~GRF

ACSR

G BV

QLM

)SG

ABW

-PG

ASVK

LSCT

ASG

FNIK

D--D

YMHW

AKQ

RP~L

BWIG

RIDP

--AID

DTDY

APKF

QDK

ATM

ITSS

NIAY

LQSS

SSLT

SB~A

~YCP

Y ...

......

. ..-

......

......

......

--....

......

......

......

--....

......

......

......

......

......

......

...~

.. ...

......

. L.

RS...

......

......

..--Y

....V

....B

......

.W...

--BNG

..B...

...G.

...TA

.....T

....L

...

......

.....

......

....

..K...

......

......

...--S

....V

....B

......

.....-

-

.NGN

.K.D

....G

...IT

A....

.T.H

..L.R

...

....

......

.. ..L

.K...

......

......

...--T

....V

....B

......

.....-

-.NGN

.K.D

....G

...IT

A....

.T...

.L

......

......

.. ...

.....

..L.K

......

......

......

--T...

.V...

.B...

....V

...--.

NGIP

I.D...

.....I

TA...

..T

......

......

......

. ...

.....

..L.K

......

......

......

--T...

.V...

.B

......

.V

...--.

NGFP

N.D.

...G.

..ITA

.....T

...

......

......

. ..A

R ...

...

..L.R

...L.

....K

......

..--Y

....V

....B

......

.W

.. .--

BNGN

.I.D.

...G.

.SIT

A....

.T...

.L

......

.....

..AR

~VHL

QQ~G

~~LR

S~GS

~~LS

~FDS

BVF~

I-A~N

~WVR

QKPG

HGFB

WIG

DILP

--SIG

RTIY

GBKF

BDKA

TLDA

DTVS

NTAY

LBLN

SLTS

BDSA

IY~~

D

Rear

rang

ed

gene

= sta

tus+

PS

L6'

(2)

F 12

2 RF

T2

(2)

F 12

3 NF

12

4

L69

(7)

F 12

5 2B

7 (3

) F

126

TB32

(1

) F

127

C55-

7B3

(3)

F 12

8 NF

12

9 NF

13

0 AN

08

(5)

F 13

1

PS

PS

NFSD

13

2 M

RL-H

iston

e (7

) P

133

NF

134

87.9

2.6

(0)

F 13

5 NF

13

6 13

7 20

8 (1

5)

r 13

8

Fig.

l(e

) Fi

g.

1. M

ultipl

e am

ino

acid

sequ

ence

s ali

gnm

ent

of m

ice V

,, ge

rm-lin

e ge

ne s

egm

ents

. (r)

Po

sitio

ns

prim

arily

re

spon

sible

for

the

varia

ble

imm

unog

lobuli

n fo

ld (V

-lg-fo

ld)

cons

erve

d fe

atur

es

(Cho

thia

rr ul

., 19

88)

and

hype

rvar

iable

loop

defin

ition

(Cho

thia

and

Lesk

. 19

87).

With

in

this.

B s

tand

s fo

r re

sidue

s bu

ried

with

in

the

prot

ein;

T: r

esidu

es

in tu

rns;

1:

Inte

r-dom

ain

resid

ues;

V:

res

idues

be

twee

n B

and

C do

main

s (C

hoth

ia er

r nl.,

198

8);

1: H

l an

d 2:

H2

defin

ition

(Cho

thia

and

Lesk

, 19

87).

(8)

Resid

ue

num

berin

g as

in C

hoth

ia an

d Le

sk (

1987

). (x

) Vu

fam

ily

and

prot

otyp

e se

quen

ces.

(6

) Na

me,

clo

ne

or s

eque

nce

acce

ss n

umbe

r in

Genb

ank.

or n

ame

of t

he s

eque

nce

in th

e lite

ratu

re.

Supe

rscr

ipts

in

the

nam

e of

the

seq

uenc

e ind

icate

th

e st

rain

of

the

or

igin

of e

ach

of t

he s

eque

nces

as

fol

lows

: a:

C57

BL/6

; b:

BAL

B/c;

c:

C57

BL/6

J:

d: A

/J;

e: M

RL/M

pJ-L

PR/L

PR;

f: M

RL-L

PR/L

PR;

g: B

ALBi

cJ;

h: N

FS/N

; i:

BALB

/b;

j: BA

LB.K

; k:

M

RL/M

P-lp

r/lpr

; 1:

MRL

llpr;

m:

C57B

L/6

x BA

LB/c

. On

ly re

sidue

s wh

ich

diver

ge

with

re

spec

t to

the

pro

toty

pe

sequ

ence

s of

the

fam

ily

are

repr

esen

ted.

(8

) Nam

e (in

the

Kab

at’s

Data

base

) of

the

clos

est

V,,

rear

rang

ed

gene

and

nu

mbe

r of

am

ino

acid

diffe

renc

es

betw

een

this

and

the

germ

-line

gene

. (4

) F

stan

ds

for

sequ

ence

s wi

th

a re

arra

nged

co

unte

rpar

t (fu

nctio

nal);

NF

: No

n-fu

nctio

nal

sequ

ence

du

e to

not

hav

ing

a re

arra

nged

co

unte

rpar

t. Su

pers

cript

“S

D.”

mea

ns s

truct

ural

de

fect

s,

this

unde

rlined

in

the

sequ

ence

; PS

: Ps

eudo

gene

. In

serti

ons

or d

eletio

ns

that

pro

duce

fra

me

shift

cha

nges

in

the

amino

ac

id se

quen

ce

were

elim

inate

d to

obt

ain

the

mos

t co

rrect

im

mun

oglob

ulin-

like

sequ

ence

s.

Aest

hetic

s wi

thin

th

e se

quen

ces

mea

ns

a st

op c

odon

. Nu

mbe

rs

at t

he r

ight

mos

t pa

rt re

pres

ent

the

code

of

eac

h se

quen

ce

in Fi

g.

3. T

he

mult

iple

sequ

ence

s ali

gnm

ent

and

all t

he c

alcu

latio

ns

ther

ein

pres

ente

d we

re

mad

e by

usin

g th

e VI

R pa

ckag

e (A

lmag

ro

c’r ul

., 19

95).


elements or recombination signals (Tomlinson et al., 1992).

Table 1. Classification and repertoire of the mice VH gene segments

In vivo expression of the VH gene segments was assessed by assigning their amino acid sequences to the closest rearranged functional VH sequence in a database of 627 VH amino acid sequences compiled from the Kabat Database on-line service (Kabat et al., 1991; see web site http://immuno.bme.nwu.edu). We chose only rearranged VH sequences with a reported specificity, in order to avoid non-productive rearrangements and so ensure that only functional VH gene segments were assigned. The database of 627 VH amino acid sequences is available from the authors on request.
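The assignment step described above can be sketched as a nearest-sequence search. This is only an illustration under stated assumptions: the sequences are taken to be already aligned to the same length, the distance is a plain count of mismatched positions, and the function names and toy fragments below are hypothetical (they are not entries of the 627-sequence database, and the text does not state a cut-off for accepting a match).

```python
from typing import Dict, Tuple


def count_differences(a: str, b: str) -> int:
    """Count mismatched positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_rearranged(germline: str, rearranged: Dict[str, str]) -> Tuple[str, int]:
    """Return (name, differences) for the rearranged sequence nearest to the germ-line one."""
    return min(
        ((name, count_differences(germline, seq)) for name, seq in rearranged.items()),
        key=lambda pair: pair[1],
    )


# Toy, made-up fragments used only to exercise the functions.
toy_db = {"toyA": "QVQLQQSG", "toyB": "EVKLVESG"}
name, diffs = closest_rearranged("QVQLKQSG", toy_db)
print(name, diffs)
```

The pair returned corresponds to the "rearranged gene" name and the number of amino acid differences reported for each germ-line segment in Fig. 1.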

VH gene family (a)    Prototype member (b)    Number of VH gene segments: Estimated (c), Found (d)

VH1     J558
VH2     Q52
VH3     36-60
VH4     X-24
VH5     7183
VH6     J606
VH7     S107
VH8     3609
VH9     GAM3-8
VH10    MRL-DNA4
VH11    CP3
VH12    CH27
VH13    3609N
VH14    SM7
VH15    VH15a
Total

60-1000 15

5-8

120 10

2 16

It is worth mentioning that choosing rearranged VH sequences on the basis of a reported specificity may bias the set of rearranged sequences towards researchers' interests. However, inspection of the 627 sequences shows that they cover 137 different specificities. Moreover, many of the sequences reported as having the same specificity probably correspond to antibodies elicited against different epitopes, particularly in the case of large antigens such as proteins, which increases the actual number of different specificities. Therefore, the database of rearranged VH sequences should be sufficiently heterogeneous to detect most of the functional VH gene segments of mice.

4 8

IO

34 2-3

123-1073

7

185

To determine structural defects, we analyzed the residues mainly responsible for the conserved structural features of antibody V domains (Amzel and Poljak, 1979; Chothia and Lesk, 1987; Chothia et al., 1988). These residues were originally derived from the analysis of the VL and VH domains of the seven antibodies of known three-dimensional structure (Chothia and Lesk, 1987; Chothia et al., 1988). However, the pattern depends to some extent on the number of structures analyzed. Atomic structures of more than 50 antibodies with different amino acid sequences are now available, which allows the pattern to be updated. To improve the update further, we also included the 627 VH amino acid sequences compiled from the Kabat Database, on the assumption that these sequences, having a reported specificity, are functional and should have no structural defects. The pattern is summarized in Table 1.

a VH gene families defined for mice: VH1 to VH14 (Kofler et al., 1992) and VH15 (Mainville et al., 1996).

b Name of the prototype sequence of each family.

c Number of sequences estimated by Southern blot hybridization and sequencing: VH families 1-14 (Kofler et al., 1992) and VH15 (Mainville et al., 1996).

d Number of VH germ-line genes and VH pseudogenes found in our compilation (see Fig. 1).

Determination of the canonical structures in H1 and H2

In structural terms, H1 has been defined as the hypervariable loop beginning at position 26 and finishing at position 32 (see head of Fig. 1). Three different sizes have been identified for this loop: canonical structures type 1 (seven residues), type 2 (eight residues) and type 3 (nine residues) (Chothia and Lesk, 1987; Chothia et al., 1989; Chothia et al., 1991).

On the other hand, H2 is defined from a structural point of view as the hypervariable loop running from position 52 to position 56 (Chothia and Lesk, 1987; Chothia et al., 1989). Currently, five different sizes have been found (Chothia et al., 1992; Tramontano et al., 1990). Early works assigned canonical structure type 1 to the shortest loop (5 residues) and the next length (6 residues) to types 2 and 3 (these types share the same length but differ in their conformation), while type 4 was identified with the longest loop (8 residues). Recently, two other sizes for H2 have been distinguished in the functional VH gene segments of humans: one having 7 residues (between the size of types 2/3 and type 4), named type 5 (Chothia et al., 1992), and another, shorter than type 1 (4 residues), named type 6 [I.M. Tomlinson, personal communication].

The patterns of residues determining the different canonical structures for H1 and H2 have been described in detail by Chothia et al. (1992). Starting from this pattern, we defined a new one (Fig. 2). This new pattern includes the recent analysis of shark VH sequences by Barré et al. (1994), as well as our own analysis of recently solved VH X-ray structures (underlined amino acids in Fig. 2). For example, in H2, Valine (V) was added at position 71 in the pattern of type 2 because Fab 8F5 (Tormo et al., 1992) has this residue. This residue was not previously considered in the patterns (Chothia and Lesk, 1987; Chothia et al., 1989; Chothia et al., 1992) and does not modify the H2 conformation [the rms deviation of H2 in 8F5 when compared with NC41, a prototype of H2 type 2 (Chothia et al., 1989), is 0.36 Å].
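The loop sizes quoted above already narrow down the possible canonical structure types before any key residue is inspected. The following is a minimal sketch of that first filtering step only: the length tables are taken from the text, whereas the function names and the 6-residue H2 string are hypothetical, and the residue patterns of Fig. 2 (for instance, the residue at position 71 that separates H2 types 2 and 3) are not modelled.

```python
from typing import Dict, List, Tuple

# Loop length -> candidate canonical structure types, as stated in the text.
H1_TYPES_BY_LENGTH: Dict[int, Tuple[int, ...]] = {7: (1,), 8: (2,), 9: (3,)}
H2_TYPES_BY_LENGTH: Dict[int, Tuple[int, ...]] = {
    4: (6,), 5: (1,), 6: (2, 3), 7: (5,), 8: (4,),
}


def candidate_types(loop: str, table: Dict[int, Tuple[int, ...]]) -> Tuple[int, ...]:
    """Return the canonical types compatible with the loop length (gaps '-' ignored)."""
    residues = loop.replace("-", "")
    return table.get(len(residues), ())


def candidate_classes(h1_loop: str, h2_loop: str) -> List[str]:
    """Combine H1 and H2 candidates into class labels of the form 'H1-H2'."""
    return [
        f"{h1}-{h2}"
        for h1 in candidate_types(h1_loop, H1_TYPES_BY_LENGTH)
        for h2 in candidate_types(h2_loop, H2_TYPES_BY_LENGTH)
    ]


# A gapped seven-residue H1 loop and a hypothetical six-residue H2 loop.
print(candidate_classes("GYTFTS--Y", "PGNGGT"))
```

A six-residue H2 loop leaves both class 1-2 and class 1-3 open, which is why the residue patterns are needed to make the final assignment.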


Fig. 2. Amino acid pattern for the canonical structure classes, defined as the simultaneous combination of canonical structures in a given sequence (Chothia et al., 1992). The amino acid residues are shown in one-letter code. X means any residue. Underlined residues are those differing with respect to the original pattern (see Material and Methods for details).

RESULTS

The known VH germ-line gene segments of mice

Although the exact number of gene segments in the entire mice VH germ-line gene repertoire is currently unknown, the complexity of most individual VH families has been established within a narrow range for several mouse strains (Kofler et al., 1992). Only the size of the largest family (VH1) is controversial, with estimates varying from 60 (Brodeur and Riblet, 1984) to ~1000 members (Livant et al., 1986). Several lines of evidence suggest, however, that the size of the VH1 family is closer to 60 than to 1000 (Kofler et al., 1992).

Based on the estimated complexity of the individual VH families of mice, we first established how representative our compilation of mice VH gene segments really was (Table 2). In most VH families the estimated number of genes and the number we found are in good agreement. We compiled 120 VH gene segments in the VH1 family (Fig. 1), supporting the proposition that the size of this family is indeed closer to 60 members than to 1000 (Kofler et al., 1992). In 9 other VH families (VH2, VH3, VH4, VH5, VH7, VH8, VH9, VH10 and VH12) the established numbers of VH gene segments are also similar to those we found (see Table 2), suggesting that these 9 VH families are well represented in our compilation.

Four VH families (VH6, VH11, VH13 and VH15) showed discrepancies when the estimated and found complexities were compared (see Table 2). In the VH6 family, fewer segments than expected were assigned. For the VH11, VH13 and VH15 families, no VH gene segments were found. Nonetheless, these families have one or only a few members (Table 2) and therefore their contribution to the whole mice VH germ-line gene repertoire should be minimal.

The functional VH germ-line gene segments of mice

Analysis of the in vivo expression of the 138 VH gene segments reported as germ-line genes and potentially functional suggests that only 72 of them are functional (Fig. 3). Of the 66 VH gene segments not expressed in vivo, and therefore defined as non-functional, 13 present structural defects when the residues responsible for the conserved structural features of VH domains are analyzed (see Table 1 and the status column of Fig. 1). For example, 3 sequences within the VH1 family (VH145/Clegc, C19c and C15c; see Fig. 1) possess Serine (S) instead of Cysteine (C) at position 22. These sequences are unable to establish the disulfide bridge that stabilizes the standard fold of VH domains (Amzel and Poljak, 1979; Chothia and Lesk, 1987).
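The status labels and counts in this paragraph can be illustrated with a small sketch. It checks only the one defect named in the text (a residue other than Cysteine at position 22), whereas the full analysis uses the whole residue pattern; the function names and the dictionary representation of a sequence (position label mapped to residue) are assumptions made for illustration.

```python
from typing import Dict


def has_cys22_defect(residues: Dict[str, str]) -> bool:
    """True when position 22 is not Cysteine (a missing position also counts)."""
    return residues.get("22") != "C"


def status(has_rearranged_counterpart: bool, residues: Dict[str, str]) -> str:
    """Return F, NF or NFSD, following the status labels used in Fig. 1."""
    if has_rearranged_counterpart:
        return "F"
    return "NFSD" if has_cys22_defect(residues) else "NF"


# Bookkeeping of the counts reported in the text.
total, functional, with_defects = 138, 72, 13
non_functional = total - functional            # 66 segments with no counterpart
unexplained = non_functional - with_defects    # 53 segments without structural defects

print(status(False, {"22": "S"}), non_functional, unexplained)
```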

For the remaining 53 sequences, which show no structural defects, it was difficult to establish why they had no counterpart among the rearranged VH sequences. We can only infer that they have minor genetic defects outside the coding region. This hypothesis, however, could not be properly examined because information outside the coding region is not reported for many sequences. Hence, we cannot rule out the possibility that some of these VH genes are actually functional even though no rearranged counterpart was found. This is because the database of rearranged sequences was built by choosing sequences with a reported specificity in order to avoid non-productive rearrangements, even though this could introduce some bias due to researchers' interests. However, the sample of rearranged sequences should be sufficiently heterogeneous (see Material and Methods section) to support the conclusion that, if some of the VH genes defined as non-functional are indeed functional, they should be exceptional.

Table 2. Pattern of residues determining the structural features of the V-Ig-fold (a)

Intra-domain positions

Position Residues buried between the β-sheets

4 6

12 18 20 22 24 34 36 38 48 49 69 78 80 82 88 90 92

G A L V L I C G A V L W R K L V G A G A A L L M L I G A Y F C

15 G S 42 G D 66 R K 67 V F 82c V L 86 D E

37 V I 39 Q H 45 L F 47 W Y 91 F Y 93 A L

VPFSH NPR VMTL IEKC A IMRKQ M V

STVPFD I IM A FW

V IMNQST IW RM F S SDTV VIMFLS FYVTIGS FISTV

MVFS STV

HN

Residues in turns KEN EHAKVQRSTW AEHQT TSL IAG

MAPIT S

Inter-domain positions Between variable domains FM A L EKLPR RPQ FIHCLS

HIS VTDKSGHMN

Between VH and CH1 domains 11 L V ISFPT

a Residues differing with respect to the original pattern described by Chothia et al. (1988) are underlined. Residues identified in the 627 rearranged sequences are in italics.

The structural repertoire of functional VH germ-line gene segments

Fig. 4 shows the canonical structure classes implicit in the 72 VH germ-line genes of mice defined as functional. Seventy-one of them present patterns compatible with some canonical structure in H1. In H2, three sequences do not have a pattern that fits any of the known canonical structures.

Analysis of the structural repertoire indicates that mice encode 6 canonical structure classes. Class 1-2 is the most frequent (64%), followed by class 1-3 (17%) and class 1-1 (7%). Classes 1-4, 3-1 and 2-1 are very poorly represented in the sequences (3%, 3% and 1%, respectively).
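As a rough consistency check, the reported percentages can be converted back into approximate sequence counts out of the 72 functional gene segments. This is a back-calculation from rounded percentages, not data taken from Fig. 4, so the per-class counts should be read as estimates.

```python
# Percentages of the 72 functional VH gene segments, as reported in the text.
reported_percent = {"1-2": 64, "1-3": 17, "1-1": 7, "1-4": 3, "3-1": 3, "2-1": 1}
n_functional = 72

# Share of sequences assigned to some class, and the remainder left unassigned.
assigned_percent = sum(reported_percent.values())
unassigned_percent = 100 - assigned_percent

# Approximate counts recovered from the rounded percentages.
approx_counts = {
    cls: round(n_functional * pct / 100) for cls, pct in reported_percent.items()
}
approx_unassigned = n_functional - sum(approx_counts.values())

print(assigned_percent, unassigned_percent, approx_counts, approx_unassigned)
```

If the one sequence without an H1 pattern and the three without an H2 pattern are all different sequences, the handful of segments left unassigned by this estimate would agree with them.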

Interestingly, the structural repertoire of mice is not randomly distributed among the VH families. Almost all sequences within a family encode the same canonical structure class (Fig. 4). Therefore, the structural repertoire is family-specific, suggesting that it has been preserved despite the diversification of the VH gene segments.

Comparison between the structural repertoires of mice and humans

To compare the structural repertoires of mice and humans, the canonical structure classes implicit in the 51 functional VH germ-line genes of humans are depicted in Fig. 5. Unlike mice, humans encode 8 canonical structure classes (Fig. 6). Canonical structure classes 3-5 and 1-6, implicit in human sequences, were not found in the functional VH germ-line genes of mice.

Canonical structure class 3-5 is encoded by germ-line gene 6-01/DP74, the only gene segment defining the human VH6 family (see Fig. 5). In mice, neither the sequences nor the pseudogenes compiled in Fig. 1 possess the proper size to fit canonical structure 5 in H2. Inspection of the 627 functional rearranged VH mice sequences indicates that this size is not present there either. Therefore, it is unlikely that mice germ-line genes possess this class.

In the case of canonical structure class 1-6, one doubly sequenced pseudogene of mice (V31/VMU-3.1; Fig. 1) has 4 residues at the H2 loop, which is the size corresponding to canonical structure type 6. Because this size is found in 5 functional rearranged VH sequences [PY54, PY2 (Ruff-Jamison et al., 1991); 8H3 (Mukherjee et al., 1993); 246B.4g, 245F.6g (Limpanasithikul et al., 1995)], it would be reasonable to expect that this pseudogene has its functional counterpart in some mouse or in certain strains of mouse. Alternatively, the pseudogene encoding this loop size might have given the segment comprising H2 to some functional gene segment by somatic gene conversion (Weill and Reynaud, 1996), so generating the rearranged sequences presenting this canonical structure class.

Differences between the structural repertoires of humans and mice are also found in the proportion by which these species encode classes 2-1 and 3-1 (Fig. 6). Class 2-1 is encoded by only one gene segment, belonging to the mice VH3 family, whilst class 3-1 is encoded by two sequences: those belonging to the VH8 family (see Fig. 4).

Fig. 3. Usage of VH germ-line gene segments of mice. [Bar chart not reproduced; horizontal axis: gene segment code (see status column of Fig. 1).]

Fig. 4. Structural repertoire of the functional VH germ-line gene segments of mice. [Sequence table not reproduced; columns: Family, Name, H1 (positions 24 to 34), H2 (positions 52 to 55, with residue 71) and VH CSC.] (a) VH CSC: canonical structure classes of VH. ?: the loop does not fit the canonical structure pattern. Those residues responsible for the mismatch are underlined.


Fig. 5. Structural repertoire of the functional VH germ-line gene segments of humans. [Sequence table not reproduced; columns: Family, Name, H1 (positions 24 to 34), H2 (from position 52, with residue 71) and VH CSC.] VH CSC: canonical structure classes of VH. ?: the loop does not fit the canonical structure pattern. Those residues responsible for the mismatch are underlined.

Fig. 6. Comparison of the VH structural repertoire in mice and humans. [Bar chart not reproduced; horizontal axis: canonical structure classes 1-1, 1-2, 1-3, 1-4, 1-6, 2-1, 3-1, 3-5, ?-1 and 1-?; separate bars for mice and humans.]

In humans, canonical structure class 2-1 is encoded by three gene segments, while class 3-1 is implicit in 9 sequences: 6 from the VH4 family (half the sequences belonging to this VH family) and 3 from the VH2 family (see Fig. 5).

Besides the differences described above, classes 1-2 and 1-3 have inverted proportions in humans and mice (Fig. 6). The most common class in mice (1-2; ~64%) has a lower frequency in humans (~14%). Conversely, in humans the most frequent one is 1-3 (~39%), which has a relatively low frequency in mice (~17%). Among those found, this contrast is the most noticeable because it involves roughly half the structural repertoire of mice and humans. Since the human VH locus has been completely determined (Cook and Tomlinson, 1995), the scope of this astounding difference depends on how complete and precise our compilation of the functional VH gene segments of mice turns out to be. Nonetheless, several observations support the validity of the difference found.
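
The inverted proportions can be checked side by side with a short script. This is a minimal sketch, assuming per-class gene counts tallied from the VH CSC columns of Fig. 4 (mice) and Fig. 5 (humans); the dictionaries and function names are illustrative and do not come from the original study.

```python
# Per-class gene counts tallied from Fig. 4 (mice, 72 genes) and Fig. 5
# (humans, 51 genes). These tallies are an assumption of this sketch.
MOUSE = {"1-1": 5, "1-2": 46, "1-3": 12, "1-4": 2,
         "2-1": 1, "3-1": 2, "1-?": 3, "?-1": 1}
HUMAN = {"1-1": 5, "1-2": 7, "1-3": 20, "1-4": 2, "1-6": 1,
         "2-1": 3, "3-1": 9, "3-5": 1, "1-?": 3}


def to_percent(counts):
    """Convert raw counts to percentages of the species total."""
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}


def compare(mouse, human):
    """Return rows (class, mouse %, human %) for every class seen in either species."""
    m, h = to_percent(mouse), to_percent(human)
    return [(cls, round(m.get(cls, 0.0), 1), round(h.get(cls, 0.0), 1))
            for cls in sorted(set(m) | set(h))]


rows = compare(MOUSE, HUMAN)
dominant_mouse = max(MOUSE, key=MOUSE.get)    # most frequent class in mice
dominant_human = max(HUMAN, key=HUMAN.get)    # most frequent class in humans
human_only = sorted(set(HUMAN) - set(MOUSE))  # classes absent from mice

for cls, mouse_pct, human_pct in rows:
    print(f"{cls:>4}  mice {mouse_pct:5.1f}%  humans {human_pct:5.1f}%")
```

With these tallies the dominant class is 1-2 in mice and 1-3 in humans, and the two classes found only in humans are 1-6 and 3-5, in line with the text.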

First, as previously stated, the structural repertoire of mice is family-specific. Because the largest family (VH1) encodes canonical structure class 1-2 (see Fig. 4), the structural repertoire of mice should remain dominated by this class even if we have overestimated the number of functional gene segments in this family. Second, the numbers of expected and found sequences in those VH families encoding class 1-3 are similar (see Table 2). Therefore, the estimate of the contribution of class 1-3 to the structural repertoire of mice should be correct. Third, among the families where no gene segments were found (VH11, VH13 and VH15), only the representative sequence of the VH11 family encodes class 1-3 (see Fig. 1), and this family has from one to six members (see Table 2). Thus, the contribution of this family to the proportion of class 1-3 in the structural repertoire of mice should be marginal. Finally, the representative members of the other two families in which no gene segments were found encode classes 1-4 (VH13) and 2-2 (VH15); therefore, they do not contribute to the totals of classes 1-2 and 1-3. Altogether, these observations indicate that, when knowledge of the mice VH repertoire is completed, the difference between humans and mice regarding classes 1-2 and 1-3 might change quantitatively but not qualitatively.

DISCUSSION

In the preceding sections we have shown that humans and mice encode inverted proportions of canonical structure classes 1-2 and 1-3 in their VH germ-line genes. From a structural point of view, canonical structure classes 1-2 and 1-3 differ in the canonical structure of H2. Canonical structures 2 and 3 are the only two hypervariable loop structures that, having the same size (Fig. 2), display different conformations (Chothia and Lesk, 1987; Chothia et al., 1989). However, this change contributes little to the variation of the antigen-binding site shape (Vargas-Madrazo et al., 1995a). Thus, the difference found may be fortuitous, i.e., irrelevant for the mechanism of the immune response or, alternatively, such structural divergence may have a functional meaning.

From an evolutionary perspective, VH gene segments of mice and humans have been classified in three main groups or clans (Schroeder et al., 1990; Tutter et al., 1991; Kirkham et al., 1992). These clans represent three progenitor elements whose descendants have coexisted in the vertebrate genome for 200 million years (Anderson and Matsunaga, 1995) or more (Ota and Nei, 1994) before the divergence of humans and mice took place ~70 million years ago. Expansion and divergence from those three clans have generated the currently known 15 VH families of mice and the 7 VH families of humans (Schroeder et al., 1990; Kirkham et al., 1992). Clans and families have preserved distinctive structural features, such as the framework 1 (FR1) and framework 3 (FR3) structures, throughout evolution. Structural preservation of these portions has been explained in terms of the essential roles they play in antibody function (Schroeder et al., 1990; Kirkham et al., 1992).

In contrast with the structurally conserved FR1 and FR3, it has been proposed that the hypervariable loops, being directly involved in the specific recognition of a wide variety of antigens, have been the target of strong environmental diversifying pressures in the course of evolution (Perlmutter et al., 1985; Schroeder et al., 1990; Kirkham et al., 1992; Sims et al., 1992; Litman et al., 1993). However, as already mentioned, the structural repertoire of mice is family-specific (Fig. 4), which implies restrictions on the random diversification of the hypervariable loop conformations (canonical structures) and their combinations within the same VH segment (canonical structure classes). Although less prominently, the human repertoire follows this same family-specific feature (Fig. 5). Moreover, inspection of the structural repertoires of humans and mice, as classified by clans, shows that canonical structures are also clan-specific (Table 3). Therefore, preservation of the structural repertoire, even across species, strongly suggests restrictions operating to counteract the random diversification of the hypervariable loop structure.

A more detailed analysis of the evolutionary relationships of the VH repertoires of mice and humans reveals that the largest family in mice (VH1) belongs to clan I while the largest one in humans (VH3) belongs to clan III (Schroeder et al., 1990; Kirkham et al., 1992). The fact that the largest families in their respective species have developed from different ancestral elements suggests that the VH gene segments of humans and mice have followed different evolutionary pathways. Interestingly, this divergence correlates well with the difference found in the structural repertoire. That is, the VH1 family of mice encodes canonical structure class 1-2 (see Fig. 4) while the VH3 family of humans mainly encodes class 1-3 (see Fig. 5). Therefore, this correlation, together with the suggestion that some mechanism preserves the structural repertoire, supports the proposition that the differences found have a functional meaning.
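
The clan-level view of the human repertoire (Table 3) can be re-derived by pooling families into clans. This is a minimal sketch, assuming per-family class counts tallied from Fig. 5 and the family-to-clan assignment given in the footnote of Table 3. Placing the single VH7 gene in clan I is an additional assumption made here, because it is needed to reproduce the clan I percentages; all names are illustrative.

```python
from collections import Counter

# Human VH family -> clan. Families 1, 5 (clan I), 2, 4, 6 (clan II) and 3
# (clan III) follow the Table 3 footnote; family 7 in clan I is an assumption.
CLAN_OF_FAMILY = {1: "I", 5: "I", 7: "I", 2: "II", 4: "II", 6: "II", 3: "III"}

# Per-family class counts tallied from the VH CSC column of Fig. 5
# (an assumption of this sketch; 51 genes in total).
FAMILY_CLASS_COUNTS = {
    1: {"1-3": 6, "1-2": 4, "1-?": 1},
    2: {"3-1": 3},
    3: {"1-3": 14, "1-1": 3, "1-4": 2, "1-?": 2, "1-6": 1},
    4: {"3-1": 6, "2-1": 3, "1-1": 2},
    5: {"1-2": 2},
    6: {"3-5": 1},
    7: {"1-2": 1},
}


def clan_frequencies(family_counts, clan_of_family):
    """Pool class counts by clan and return percentages within each clan."""
    per_clan = {}
    for family, counts in family_counts.items():
        per_clan.setdefault(clan_of_family[family], Counter()).update(counts)
    return {
        clan: {cls: round(100.0 * n / sum(pooled.values()), 1)
               for cls, n in pooled.items()}
        for clan, pooled in per_clan.items()
    }


freqs = clan_frequencies(FAMILY_CLASS_COUNTS, CLAN_OF_FAMILY)
# Most frequent class within each clan.
dominant = {clan: max(f, key=f.get) for clan, f in freqs.items()}
print(dominant)
```

Under these assumptions the clan I and clan II percentages match the human column of Table 3 exactly, and the clan III values agree to within about one percentage point.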

Table 3. Comparison of the VH structural repertoire between humans and mice, as classified by clans.

Clan(a)   VH CSC    Mice frequency (%)   Humans frequency (%)
I         1-2(b)    94.1                 50.0
I         1-?        3.9                  7.1
I         1-1        2.0                  -
I         1-3        -                   42.9
II        1-1       60.0                 13.3
II        3-1       20.0                 60.0
II        2-1       10.0                 20.0
II        ?-1       10.0                  -
II        3-5        -                    6.7
III       1-3       75.0                 64.0
III       1-4       12.5                  9.0
III       1-1        6.3                 14.0
III       1-?        6.3                  9.0
III       1-6        -                    4.0

(a) Clan I includes the human VH1 and VH5 families, and the mice VH1, VH9 and VH14 families. Clan II is defined by the human VH2, VH4 and VH6 families, and the mice VH2, VH3, VH8 and VH12 families. Clan III consists of the human VH3 family and the mice VH4, VH5, VH7 and VH10 families (Schroeder et al., 1990). It should be noted that the VH14 family of mice was described after the classification of Schroeder et al. (1990). However, its members are very similar to the VH1 family (>80%) and thus it is easy to assign them to clan I.
(b) The specific canonical structure classes for each clan are shown in bold.

One possible explanation for the different development of the structural repertoires of mice and humans relies on the indirect or direct interaction of classes 1-2 and 1-3 with bacterial or self-antigens named superantigens (Zouali, 1995). In humans, for example, protein A of Staphylococcus aureus is highly specific to the VH3 family (Sasso et al., 1989; Sasso et al., 1991). This specificity is probably due to a direct contact between the superantigen molecule and the VH3 family-conserved FR3 region of antibodies (Sasso et al., 1989; Sasso et al., 1991). In structural terms, residue 71 within the FR3 segment is the major determining factor of the conformation of canonical structures 2 and 3 in H2 (Tramontano et al., 1990). That implies a close relationship between the H2 conformation and the FR3 region, which indirectly may account for differences in classes 1-2 and 1-3. A more direct interaction of classes 1-2 and 1-3 with superantigens might also be conceived. Since H2 is adjacent to FR3 in the three-dimensional structure, these regions jointly form a continuous area exposed to solvent. Therefore, the shape of this area would change depending on the conformation of H2, which is in turn determined by position 71 in FR3. In that way, canonical structures 2 and 3 in H2, together with the FR3 structure, might be recognized directly by different superantigens. Since superantigens are family-specific and might be important within the immune response (Zouali, 1995), they would account for the different conservation and development of the specific structural repertoires of mice and humans.
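
The dependence of the H2 conformation on residue 71 can be illustrated with a small lookup. This is a minimal sketch, assuming that for H2 loops of the size shared by canonical structures 2 and 3 an arginine at position 71 goes with structure 3, while A, V, L or T (the residues seen at position 71 for class 1-2 genes in Figs 4 and 5) go with structure 2. The function name and residue sets are illustrative, and the sketch deliberately ignores the other positions that belong to the full canonical structure patterns.

```python
# Residue-71 sets assumed for this sketch, read from the residue-71 column of
# Figs 4 and 5. They apply only to H2 loops of the size shared by canonical
# structures 2 and 3, and they are a simplification of the full patterns.
TYPE2_RES71 = frozenset("AVLT")
TYPE3_RES71 = frozenset("R")


def h2_type_from_residue71(res71):
    """Return 2 or 3 for a same-size H2 loop, or None if residue 71 fits neither."""
    aa = res71.upper()
    if aa in TYPE3_RES71:
        return 3
    if aa in TYPE2_RES71:
        return 2
    return None  # corresponds to a "?" entry in the VH CSC column


for residue in ("R", "V", "E"):
    print(residue, h2_type_from_residue71(residue))
```

A residue outside both sets returns `None`, which mirrors the way the figures flag loops that do not fit any known canonical structure.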

A second explanation for the origin of the differences between the structural repertoires of mice and humans, and for their development and preservation once established, is that those genes having canonical structure classes 1-2 and 1-3 possess different regulatory roles in their respective species. In support of this, it is worth noting that the most frequently expressed sequence in the human repertoire is germ-line 3-23 (also called VH26, DP47, VH30P1 and VH18/2) (Stewart et al., 1992; Schwartz and Stollar, 1994). The 3-23 VH gene segment belongs to the VH3 family and possesses canonical structure class 1-3 (Fig. 5). Several lines of evidence suggest that over-expression of this gene segment and its idiotype (Id 16/6) is associated with important physiological roles (Stewart et al., 1992). In mice, frequent usage of the VH gene segment H10 (VH10 in Fig. 1) has been reported (Schiff et al., 1988), making it an equivalent of the human germ-line 3-23. The H10 gene has canonical structure class 1-2 (Fig. 4) and, although assigned to the VH14 family, it shares more than 80% nucleotide identity with sequences belonging to the VH1 family. This gene is used in response to different antigens (Schiff et al., 1988) and, in its germ-line configuration, it is used in anti-GAT antibodies as well as in the GAT idiotypic cascade (Schiff et al., 1988). That suggests a regulatory function for this gene segment within the immune response of mice, e.g., a role in the idiotypic network (Schiff et al., 1986). Thus, the development of the VH3 family in humans, particularly those members having canonical structure class 1-3, and the development of the VH1 family as well as the closely related VH14 family in mice (which encode class 1-2) would be associated with the regulatory roles these VH families (and classes) have had in the immune response of their respective species.

Finally, a third argument relates to the structural divergence of human and murine Vκ and Vλ germ-line genes on the one hand (Williams et al., 1996; Almagro et al., 1997) and the differences between the human and murine repertoires of D gene segments on the other (Wu et al., 1993). It has been suggested that different VL impose restrictions on the use of some VH gene segments or VH families (Yurovky and Kelsoe, 1993). That indicates additional pressures acting on the divergence of the VH repertoire in humans and mice. Furthermore, it has been shown that H3 is significantly longer in human than in murine antibodies (Wu et al., 1993), which has been related to the different lengths present in the repertoire of D gene segments (Wu et al., 1993). Since a long H3 interacts directly with H1 and H2 (Chothia et al., 1987), this difference may also have shaped the currently known repertoires of human and murine VH genes. Of course, these restrictions do not exclude either of the other two explanations, i.e., regulatory pressures and/or specific interaction with other molecules (like superantigens), which could well be complementary.

In summary, we have shown that the difference between the structural repertoires of VH germ-line genes of mice and humans may have a functional meaning. Although such difference does not strongly influence the antigen-binding site shape and, thus, cannot be directly related to the initial structural restrictions operating to recognize different types of antigens, it may indeed be a reflection of species-specific regulatory and/or structural restrictions at work to balance the random diversification of the structural repertoire of VH gene segments. Therefore, the difference described here could be very useful as a guide to choose the most human-compatible murine antibodies for human therapy.

Acknowledgements--We thank I. Tomlinson for kindly providing the sequences of the 51 functional VH gene segments of humans, and H. Ceceiia and B. Levin for revision of the submitted manuscript. E. V. was supported by S-CONACyT grant no. 1843. This work was partially supported by grant IN213796 from DGAPA-UNAM.

REFERENCES

Almagro, J. C., Vargas-Madrazo, E., Zenteno-Cuevas, R., Hernandez-Mendiola, V. and Lara-Ochoa, F. (1995) VIR: A computational tool for analysis of immunoglobulin sequences. BioSystems 35, 25-32.

Almagro, J. C., Dominguez-Martinez, V., Lara-Ochoa, F. and Vargas-Madrazo, E. (1996) Structural repertoire in human VL pseudogenes of immunoglobulins: comparison with functional germline genes and amino acid sequences. Immunogenetics 43, 92-96.

Almagro, J. C., Hernandez, I., Ramirez, M. C. and Vargas-Madrazo, E. (1998) Characterization of the differences between the structural repertoire of V, germ-line gene segments of mice and humans. Immunogenetics (in press).

Amzel, L. M. and Poljak, R. J. (1979) Three dimensional structure of immunoglobulins. Annu. Rev. Biochem. 48, 961-997.

Anderson, A. and Matsunaga, T. (1995) Evolution of immunoglobulin heavy chain variable region genes: a VH family can last for 150-200 million years or longer. Immunogenetics 41, 18-28.

Barré, S., Greenberg, A. S., Flajnik, M. and Chothia, C. (1994) Structural conservation of hypervariable regions in immunoglobulins evolution. Nature Structural Biology 1, 915-920.

Brodeur, P. H. and Riblet, R. (1984) The immunoglobulin heavy chain variable region (Igh-V) locus in the mouse. I. One hundred Igh-V genes comprise seven families of homologous genes. Eur. J. Immunol. 14, 922-930.

Chothia, C. and Lesk, A. M. (1987) Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol. 196, 901-917.

Chothia, C., Boswell, D. R. and Lesk, A. (1988) The outline structure of the T-cell α/β receptor. EMBO J. 7, 3745-3755.

Chothia, C., Lesk, A. M., Tramontano, A., Levitt, M., Smith-Gill, S. J., Air, G., Sheriff, S., Padlan, E. A., Davies, D., Tulip, W. R., Colman, P. M., Spinelli, S., Alzari, P. M. and Poljak, R. J. (1989) Conformations of immunoglobulin hypervariable regions. Nature 324, 877-883.

Chothia, C., Lesk, A. M., Gherardi, E., Tomlinson, I. M., Walter, G., Marks, J. D., Llewelyn, M. B. and Winter, G. (1992) Structural repertoire of the human VH segments. J. Mol. Biol. 227, 799-817.

Cook, G. P. and Tomlinson, I. M. (1995) The human immunoglobulin VH repertoire. Immunol. Today 16, 237-242.

Crews, S., Griffin, J., Huang, H., Calame, K. and Hood, L. (1981) A single VH gene segment encodes the immune response to phosphorylcholine: somatic mutation is correlated with the class of the antibody. Cell 25, 59-66.

Hood, L., Gray, W. R., Sanders, B. G. and Dreyer, W. J. (1967) Cold Spring Harbor Symp. Quant. Biol. 32, 133.

Kabat, E. A. and Wu, T. T. (1971) Attempts to locate complementarity determining residues in the variable positions of light and heavy chains. Ann. N.Y. Acad. Sci. 190, 382-383.

Kabat, E. A., Wu, T. T., Perry, H. M., Gottesmann, K. S. and Foeller, C. (1991) Sequences of proteins of immunological interest. 5th Edn., Public Health Service, N.I.H., Washington, D.C.

Kirkham, P. M., Mortari, F., Newton, J. A. and Schroeder, H. W. Jr. (1992) Immunoglobulin VH clan and family identity predicts variable domain structure and may influence antigen binding. EMBO J. 11, 603-609.

Klein, R., Jaenichen, R. and Zachau, H. G. (1993) Expressed human immunoglobulin κ genes and their hypermutation. Eur. J. Immunol. 23, 3248-3271.

Kofler, R., Geley, S., Kofler, H. and Helmberg, A. (1992) Mouse variable-region gene families: complexity, polymorphism and use in non-autoimmune responses. Immunol. Rev. 128, 5-21.

Lara-Ochoa, F., Almagro, J. C., Vargas-Madrazo, E. and Conrad, M. (1996) Antibody-antigen recognition: a canonical structure paradigm. J. Mol. Evol. 43, 678-684.

Limpanasithikul, W., Ray, S. and Diamond, B. (1995) Cross-reactive antibodies have both protective and pathogenic potential. J. Immunol. 155, 967-973.

Litman, G. W., Rast, J. P., Shamblott, M. J., Haire, R. N., Hulst, M., Roess, W., Lipman, R. T., Hinds-Frey, K. R., Zilch, A. and Amemiya, C. T. (1993) Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10, 60-72.

Livant, D., Blatt, C. and Hood, L. (1986) One heavy chain variable region gene segment subfamily in the BALB/c mouse contains 500-1000 or more members. Cell 47, 461-470.

Mainville, C. A., Sheehan, K. M., Klaman, L. D., Giorgetti, C. A., Press, J. L. and Brodeur, P. H. (1996) Deletional mapping of fifteen mouse VH gene families reveals a common organization for three Igh haplotypes. J. Immunol. 156, 1038-1046.

Mukherjee, J., Casadevall, A. and Scharff, M. D. (1993) Molecular characterization of the humoral responses to Cryptococcus neoformans infection and glucuronoxylomannan-tetanus toxoid conjugate immunization. J. Exp. Med. 17, 1105-1116.

Ota, T. and Nei, M. (1994) Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family. Mol. Biol. Evol. 11, 469-482.

Perlmutter, R. M., Berson, B., Griffin, J. A. and Hood, L. (1985) Diversity in the germline antibody repertoire. Molecular evolution of the T15 VH gene family. J. Exp. Med. 162, 1998-2016.

Poljak, R. J., Amzel, L. M., Avey, H. P., Chen, B. L., Phizackerley, R. P. and Saul, F. (1973) Three-dimensional structure of the Fab' fragment of a human immunoglobulin at 2.8 Å resolution. Proc. Natl. Acad. Sci. U.S.A. 70, 3305-3310.

Poul, M-A. and Lefranc, M-P. (1995) Structural correspondence between mouse and human immunoglobulin VH genes. Applications to the humanization of mouse monoclonal antibodies. Ann. N.Y. Acad. Sci. 764, 359-361.

Ruff-Jamison, S., Campos-Gonzalez, R. and Glenney, J. R., Jr. (1991) Heavy and light chain variable region sequences and antibody properties of anti-phosphotyrosine antibodies reveal both common and distinct features. J. Biol. Chem. 26, 6607-6613.

Sasso, E. H., Silverman, G. J. and Mannik, M. (1989) Human IgM molecules that bind staphylococcal protein A contain VHIII H chains. J. Immunol. 142, 2778-2783.

Sasso, E. H., Silverman, G. J. and Mannik, M. (1991) Human IgA and IgG F(ab')2 that bind to staphylococcal protein A belong to the VHIII subgroup. J. Immunol. 147, 1877-1883.

Schiff, C., Milili, M., Hue, I., Rudikoff, S. and Fougereau, M. (1986) Genetic basis for expression of the idiotypic network. One unique Ig VH germline gene accounts for the major family of Ab1 and Ab3 (Ab1') antibodies of the GAT system. J. Exp. Med. 163, 573-587.

Schiff, C., Corbet, S. and Fougereau, M. (1988) The Ig germline gene repertoire: economy or wastage? Immunol. Today 9, 10-14.

Schroeder, H. W. Jr., Hillson, J. L. and Perlmutter, R. M. (1990) Structure and evolution of mammalian VH families. Int. Immunol. 20, 41-50.

Schwartz, R. S. and Stollar, B. D. (1994) Heavy-chain directed B-cell maturation: continuous clonal selection beginning at the pre-B cell stage. Immunol. Today 15, 27-32.

Sims, M. J., Krawinkel, U. and Taussig, M. (1992) Characterization of germ-line genes of the VGAM3.8 VH family from BALB/c mice. J. Immunol. 149, 1642-1648.

Stewart, A. K., Huang, C., Long, A. A., Stollar, B. D. and Schwartz, R. S. (1992) VH-gene representation in autoantibodies reflects the normal human B-cell repertoire. Immunol. Rev. 128, 101-122.

Tomlinson, I. A., Walter, G., Marks, J. D., Llewelyn, M. B. and Winter, G. (1992) The repertoire of human germline VH sequences reveals about fifty groups of VH segments with different hypervariable loops. J. Mol. Biol. 227, 776-798.

Tomlinson, I. A., Cox, J. P., Gherardi, E., Lesk, A. M. and Chothia, C. (1995) The structural repertoire of the human V kappa domain. EMBO J. 14, 4628-4638.

Tonegawa, S. (1983) Somatic generation of antibody diversity. Nature 302, 575-581.

Tormo, J., Stadler, E., Skern, T., Auer, H., Kanzler, O., Betzel, C., Blaas, D. and Fita, I. (1992) Three-dimensional structure of the Fab fragment of a neutralizing antibody to human rhinovirus serotype 2. Protein Sci. 1, 1154-1161.

Tramontano, A., Chothia, C. and Lesk, A. M. (1990) Framework residue 71 is a major determinant of the position and conformation of the second hypervariable region in the VH domains of immunoglobulins. J. Mol. Biol. 215, 175-182.

Tutter, A. and Riblet, R. (1989) Conservation of an immunoglobulin variable-region gene family indicates a specific noncoding function. Proc. Natl. Acad. Sci. USA 86, 7460-7464.

Tutter, A., Brodeur, P., Shlomchik, M. and Riblet, R. (1991) Structure, map position, and evolution of two newly diverged mouse Ig VH gene families. J. Immunol. 147, 3215-3223.

Vargas-Madrazo, E., Lara-Ochoa, F. and Almagro, J. C. (1995a) Canonical structure repertoire of the antigen-binding site of immunoglobulins suggests strong geometrical restrictions associated to the mechanism of immune recognition. J. Mol. Biol. 254, 487-504.

Vargas-Madrazo, E., Almagro, J. C. and Lara-Ochoa, F. (1995b) Structural repertoire in VH pseudogenes of immunoglobulins: comparison with human germline genes and human amino acid sequences. J. Mol. Biol. 246, 74-81.

Weill, J-C. and Reynaud, C-A. (1996) Rearrangement/hypermutation/gene conversion: when, where and why? Immunol. Today 17, 92-97.

Williams, S. C., Frippiat, J-P., Tomlinson, I. A., Ignatovich, O., Lefranc, M-P. and Winter, G. (1996) Sequence and evolution of the human germline Vλ repertoire. J. Mol. Biol. 264, 220-232.

Wu, T. T. and Kabat, E. A. (1970) An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity. J. Exp. Med. 132, 211-250.

Wu, T. T., Johnson, G. and Kabat, E. A. (1993) Length distribution of CDRH3 in antibodies. Proteins 16, 1-7.

Yurovky, V. and Kelsoe, G. (1993) Pairing of VH gene families with the λ light chain: evidence for a non-stochastic association. Eur. J. Immunol. 23, 1975-1979.

Zouali, M. (1995) B-cell superantigens: implications for selection of the human antibody repertoire. Immunol. Today 16, 399-405.