Molecular Immunology, Vol. 34, No. 16-17, pp. 1199-1214, 1997
Pergamon PII: S0161-5890(97)00118-1
© 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0161-5890/97 $17.00 + 0.00
THE DIFFERENCES BETWEEN THE STRUCTURAL REPERTOIRES OF VH GERM-LINE GENE SEGMENTS OF
MICE AND HUMANS: IMPLICATION FOR THE MOLECULAR MECHANISM OF THE IMMUNE RESPONSE
JUAN CARLOS ALMAGRO,*‡ ISMAEL HERNANDEZ,* MARIA DEL CARMEN RAMIREZ* AND ENRIQUE VARGAS-MADRAZO†
* Instituto de Biotecnología, Universidad Nacional Autónoma de México, Apdo Postal 045 I O-3, Cuernavaca, Morelos 62250, Mexico; † Instituto de Investigaciones Biológicas, Universidad Veracruzana, Araucarias 280 Col. Animas, Xalapa, Ver. 91190, Mexico.
(First received 2 June 1997; accepted in revised form 4 September 1997)
Abstract-Although human and murine antibodies are similar in their diversification strategies, they differ in the proportions in which κ and λ type chains are present in their respective VL repertoires. It has been shown that this difference implies a divergence in the structural repertoire of the κ and λ genes of these species. Nonetheless, the differences in VH have not been systematically studied. In this paper a systematic characterization of the VH structural repertoire of mice is made, so that a comparison with the VH structural repertoire of humans, described in detail elsewhere, can be properly accomplished. Our study shows the structural repertoire of mice to be dominated by canonical structure class 1-2 (~60%), while in humans the dominant one is class 1-3 (~40%). Analysis of the evolutionary relationships between humans and mice suggests that this divergence may have a functional meaning. The implications of such findings are discussed. © 1997 Elsevier Science Ltd. All rights reserved.
Key words: immunoglobulin, Ig, canonical structures, VH repertoire, structural repertoire.
INTRODUCTION
The antigen-binding site of antibodies consists of six hypervariable loops: three from VH and three from VL, denoted H1, H2, H3 and L1, L2, L3 respectively (Wu and Kabat, 1970; Kabat and Wu, 1971; Poljak et al., 1973). Although there is great sequence variability in these regions (Wu and Kabat, 1970; Kabat and Wu, 1971), it has been shown that, except for H3, the remaining five hypervariable loops have one of a small set of main-chain conformations or canonical structures (Chothia and Lesk, 1987; Chothia et al., 1989). Based on that fact, it has been found that, of the total number of possible combinations of canonical structures, only a few occur in the known antibody sequences; this set is named the structural repertoire (Chothia et al., 1992; Tomlinson et al., 1995; Vargas-Madrazo et al., 1995a; Vargas-Madrazo et al., 1995b; Almagro et al., 1996). Furthermore, it has been suggested that the antigen-binding site shapes allowed by the structural repertoire correlate with the kind of antigen the antibody interacts with (Vargas-Madrazo et al., 1995a; Lara-Ochoa et al., 1996). Taken together, these findings provide evidence concerning structural restrictions at work in the process of antigen recognition.
‡ Author to whom all correspondence should be addressed. Tel.: (52) (73) 291605; Fax: (52) (73) 172388; e-mail: almagro@ibt.unam.mx
Abbreviations: VH, Variable heavy domain; VL, Variable light domain.
Genetically, the structural repertoires of human and murine antibodies are generated in a similar fashion (Weill and Reynaud, 1996): L1, L2 and most of L3 are encoded in the VL gene segments (κ and λ type), while H1 and H2 are encoded in the VH gene segments (Tonegawa, 1983). In spite of this similarity, it has been noticed that the corresponding repertoires of humans and mice differ in the relative proportions in which κ and λ type chains are present in VL. In humans, roughly 60% of the VL repertoire is κ type [40 functional Vκ germ-line genes versus 30 functional Vλ germ-line genes (Klein et al., 1993; Tomlinson et al., 1995; Williams et al., 1996)]. In mice, the κ type predominates, accounting for as much as 95% (Hood et al., 1967). Such divergence implies differences in the structural repertoire of human and murine Vκ and Vλ germ-line genes (Williams et al., 1996; Almagro et al., 1998) and, consequently, differences in the initial structural restrictions operating to recognize different types of antigens.
Although differences in VH are less evident, recent studies we made of the rearranged VH sequences of mice indicate that the combination of canonical structures most frequently used is the 1-2 class (combination of canonical structures in H1 and H2) (Vargas-Madrazo et al., 1995a; Lara-Ochoa et al., 1996). In contrast, human VH germ-line genes, which have been thoroughly characterized (Cook and Tomlinson, 1995), have been shown to encode predominantly canonical structure class 1-3 (Chothia et al., 1992; Vargas-Madrazo et al., 1995b). We have also found the same difference at the pseudogene level (Vargas-Madrazo et al., 1995b). This suggests that the VH germ-line gene segments of mice and humans may encode different structural repertoires in VH too.
This difference, however, has not been properly characterized, partly because the structural repertoire of the mouse VH germ-line genes has not been systematically studied. A proper characterization could provide insight and additional ideas for the theories addressing the origin, organization, complexity and use of VH genes. Furthermore, if such differences in VH do exist, then taken together with the structural divergence in the repertoire of VL germ-line genes, they could shed light on the different structural constraints at work when antigen recognition takes place in human and mouse (Vargas-Madrazo et al., 1995a). In addition, such a characterization might prove useful as a criterion for choosing human VH genes for humanization of murine antibodies (Poul and Lefranc, 1995).
In this paper we compiled the published information on the VH germ-line gene segments of mice to characterize their structural repertoire. Comparison with its human counterpart corroborates the differences found in rearranged sequences and pseudogenes. Implications of these findings for the molecular mechanism of the immune response are discussed.
MATERIAL AND METHODS
The germ-line VH gene segments of mice
We compiled all of the Mus musculus VH gene segments reported as germ-line genes or pseudogene sequences in GenBank and LIGM, as well as in the literature up to April 1996. We found a total of 295 VH gene segments and immediately discarded 42 of them because they were duplicates (different accession numbers but identical entries) or did not comprise one or both hypervariable loops (see web site http://www.ibt.unam.mx/~almagro for a full description of the sequences).
Of the remaining 253 VH gene segments, some were identical at the nucleotide level, so we considered them to be the same VH gene segment, because currently available information does not allow us to distinguish whether these sequences are recent copies of a particular VH gene segment in the mouse genome or whether they have been sequenced more than once.
There were also pairs of sequences with one or two nucleotide differences (99.6% and 99.2% identity, respectively). Those sequences having only silent mutations (100% identical at the amino acid level) were also considered to be the same gene segment, because they might be alleles in different individuals or in different strains of mouse. Sequences in which the nucleotide differences resulted in replacements (different amino acid sequences) were considered distinct VH gene segments. Although this might seem very conservative, we relied on it because there is no established criterion to define alleles based only on the analysis of nucleotide identities. Thus, we preferred to include in the analysis all those sequences differing by at least one amino acid in order to avoid underestimating the available information.
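The merging rule described above can be illustrated with a short sketch. It is not the procedure actually used in this work: it assumes in-frame nucleotide sequences and, as a simplification, groups every entry with an identical translation, whereas the text applies the silent-mutation rule to pairs differing by one or two nucleotides. The function names and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate an in-frame nucleotide sequence; trailing partial codons are ignored."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def collapse_silent_variants(sequences):
    """Group sequence names whose translations are identical.

    Identical nucleotide sequences and sequences differing only by
    silent mutations end up in the same group; any amino acid
    replacement places a sequence in a separate group.
    """
    groups = {}
    for name, nt in sequences.items():
        groups.setdefault(translate(nt), []).append(name)
    return list(groups.values())


# Toy data (hypothetical): "c" differs from "a" by a silent change,
# "d" by a change that replaces an amino acid.
toy = {
    "a": "CAGGTTCAG",
    "b": "CAGGTTCAG",
    "c": "CAGGTCCAG",
    "d": "CAGATTCAG",
}
merged = collapse_silent_variants(toy)
```

Grouping by translated sequence keeps every entry that differs by at least one amino acid, which matches the conservative choice stated in the text.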
A single exception was made for those genes belonging to the S107 (VH7) family, which has been well characterized in two strains of mice: BALB/c (Crews et al., 1981) and C57BL/10 (Perlmutter et al., 1985). In this family we have taken into account only the alleles of BALB/c (the most represented strain within the compilation; see below), even though those from C57BL/10 differ by more than one amino acid when compared with the BALB/c sequences. In this way, we finally gathered 185 sequences as representative of the mouse VH locus.
Classification of the known VH gene segments in gene families
VH gene segments in mice have been classified into 15 families based on Southern blot hybridization and sequencing (Brodeur and Riblet, 1984; Kofler et al., 1992; Mainville et al., 1996). Each family is represented by a prototype member defining the name of the family (Kofler et al., 1992; Mainville et al., 1996). VH sequences within families share an identity of at least 80%, whereas among those belonging to different families the identity is at most 75% (Brodeur and Riblet, 1984). Following these criteria we clustered the 185 sequences finally gathered into the 15 established VH families. In the case of the VH14 family, in which some members are more than 80% identical to sequences belonging to the VH1 family, the assignment was made following the criteria established by Tutter and coworkers (1991).
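As an illustration of the 80% identity criterion, the sketch below assigns a sequence to the family whose prototype it resembles most, provided the identity reaches the threshold. This is a simplified reading of the criterion, not the clustering actually performed: it assumes pre-aligned sequences of equal length with `-` as the gap character, and the function names, family labels and toy sequences are hypothetical. Special cases such as VH14 would need the separate criteria cited in the text.

```python
def percent_identity(a, b):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def assign_family(seq, prototypes, threshold=80.0):
    """Return (family, identity) for the closest prototype.

    The family is None when even the best identity falls below the
    threshold, i.e. the sequence fits no established family.
    """
    best_family, best_identity = max(
        ((family, percent_identity(seq, proto)) for family, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    if best_identity < threshold:
        return None, best_identity
    return best_family, best_identity


# Toy prototypes (hypothetical, far shorter than real VH gene segments).
prototypes = {"famA": "ACGTACGTAC", "famB": "TTTTGGGGCC"}
close_hit = assign_family("ACGTACGTAA", prototypes)
no_hit = assign_family("ACGTATTTTT", prototypes)
```

A sequence whose best identity lies between 75% and 80% is left unassigned here; how such borderline cases were handled is not stated in the text.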
The resultant sequence alignment, organized by families, is given in Fig. 1 and can be retrieved from web site http://www.ibt.unam.mx/~almagro. Within VH families, the sequences are sorted in decreasing order of similarity to the prototype members.
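The ordering within families can be sketched as a sort on similarity to the prototype. The example assumes pre-aligned amino acid sequences and uses the fraction of identical positions as the similarity measure, which is an assumption made for illustration; the function names and toy sequences are hypothetical.

```python
def fraction_identical(a, b):
    """Fraction of positions at which two aligned sequences carry the same residue."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def sort_by_similarity(prototype, members):
    """Return member names ordered from most to least similar to the prototype."""
    return sorted(
        members,
        key=lambda name: fraction_identical(prototype, members[name]),
        reverse=True,
    )


# Toy family (hypothetical sequences).
prototype = "QVQLQQ"
members = {"m1": "QVQLKE", "m2": "QVQLQQ", "m3": "QVQLQE"}
ordered = sort_by_similarity(prototype, members)
```

Python's `sorted` is stable, so members with equal similarity keep their input order.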
Determination of functional VH gene segments
Of the 185 VH gene segments depicted in Fig. 1, 47 were reported as pseudogenes in databases or in the literature (see status column of Fig. 1). We therefore assumed they had serious genetic defects, and they were not taken into account in determining the functional repertoire of mouse VH gene segments. The remaining 138 VH gene segments, reported as germ-line genes and potentially functional, were examined to determine whether they are expressed in vivo. VH gene segments not expressed in vivo might have defects within the coding region hindering the formation of a stable three-dimensional VH domain. Otherwise, they may have minor genetic defects outside the coding region, for example in splicing sites, regulatory
[Fig. 1(a). Sequence alignment of the mouse VH gene segments, organized by family: VH1 (J558) family. Column headings: Ig-fold, position, family, name, rearranged gene and status. The alignment body is not reproduced here.]
[Fig. 1(b). Sequence alignment of the mouse VH gene segments, organized by family: VH1 (J558) family, continued. Column headings: Ig-fold, position, family, name, rearranged gene and status. The alignment body is not reproduced here.]
[Fig. 1(c). Sequence alignment of the mouse VH gene segments, organized by family: VH1 (J558) family, continued, followed by the start of the VH2 (Q52) family. Column headings: Ig-fold, position, family, name, rearranged gene and status. The alignment body is not reproduced here.]
[Fig. 1 (continued). Sequence alignment of the mouse VH gene segments, organized by family: VH2 (Q52) family, continued, followed by further families including 36-60, X-24, 7183, J606 and 3609. Column headings: Ig-fold, position, family and name. The alignment body is not reproduced here.]
. N.
.I....
......
......
..---.
N....
......
.R...
.....N
......
..T...
....T
.
......
......
QS
......
......
...N.
.....S
......
......
....Y
---...
..R...
....R
......
......
......
......
.T
......
......
QS
......
.....V
.....P
....S
......
......
....Y
---..B
..H.K
.....R
......
..N...
......
......
.T
. ...
......
. ..Q
...
......
...
V...N
.F...
.S...
......
......
.Y---
..B..H
.K...
..R...
.....N
......
..T...
....T
.
....
F....
T....
....Y
..M.S
.MC.
......
V...L
..---C
NN..G
...F.
R....
......
N....
......
.P...
.T
....
Rear
rang
ed
gene
' st
atus
+
D23
Pab
419
LB8
NBO
C72-
3Al
XRPC
44
XRPC
24
RF-3
PA
N H3
7-40
H3
7-45
AS
WA2
H3
5-C6
H37-
60
MRK
lC
B5Fv
B1
3 AN
10
B112
79
68.2
DE
NQ10
.3.8
H2
20-7
ASW
Bl
B6.2
(1)
(4) (2)
(0)
(0)
(0)
(1)
(0)
(3)
(5)
(0)
(3)
(1)
(0)
(1)
(3)
(4)
(4)
(0)
(0)
(1)
PS
F 89
F 90
NF
SD
91
NFSD
92
F
93
F 94
F F F F F F F NF
F F NP
PS
NF
F F F F NFSD
F F F NFSD
PS
NF
F
95
96
97
98
99
100
101
I3
102
103
.n
104
c 10
5
106
: 10
7 10
8 8
109
2
110
e
111
112
113
114
115
116
117
118
119
120
121
(10)
F NF
NF
NF
PS
Fig.
I(d
)
Ig-fo
ld"
Posit
icmP
Fam
ily'
Nam
e"
wia
3609
.7
(con
tinue
d)
V31h
/vN
tJ-3.
1 vH
9 GA
M?-
8 VF
Ml;b
/VG
Kl$'
w4
sg
/VGK
TA~/
~S~~
16
1 VG
KCj
VNS;
b/VG
K4j
264
VFM
lb/2
81b/
VGK7
j VM
Sl:/1
41b/
VGK3
j VG
IC53
VG
K2'
VHlO
M
RL-D
NA4
MRL
-RP2
4BGk
/M
2146
9 VH
ll CP
3 vH
12
CH27
16
-A
vH13
vh
3609
N m
i14
vhem
7-13
vh
em7-
13
1
li2b-
;b/V
H2b-
3b
37A4
VH
10~/
H10b
/M33
391-
7' 17
c1;
14c3
vn
4a-3
b/H4
a-3
b
vH15
Vh
lSA
BB
B T
B B
B B
1111
1111
1 B
BIB1
T
I IB
B 22
2222
22
TT
B BB
B T
T B
BIB1
1 ,..
.,....
;~...
,....~
~...,
....;~
,b...
,....f
~...,
....;~
,ab,
..,.,.
.~~,
..,...
.;~...
,....f
~.ab
c..,.
...~~
...,
QVTL
KBSG
PGIL
KPSQ
TLSL
TCSF
SGPS
LSTS
GMGV
GWIR
QPSG
KGLB
WLA
HIW
---W
DDDK
YYNP
SLKS
QLTI
SKDT
SRNQ
VFLK
ITSV
DTAD
TASY
YCAR
V . V
......
Q....
.G.A
.T...
I.....
...LS
.L.K
.Q.R
-.....
S..--
--NN.
N....
....R
.....E
..N...
...L.
......
S~...
.~
QIQL
vQSG
pBLK
KpGE
TVKI
SCKA
SGYT
FTN-
-YGL
NWVK
QAPG
KGLK
WM
GWIN
T--Y
TGKS
TYAD
DFKG
RFAP
SLBT
SAIT
AYLQ
INNL
KNBD
MAT
YF~R
S ...
......
......
......
......
.. ..-
- ..
M...
......
......
....--
...BP
......
......
......
S....
......
......
...A
......
......
......
......
.....
..--
.. M
......
......
......
.--
.. .B
P....
......
......
..S...
......
....T
.....A
. ...
......
......
......
......
.. ..-
-..M
......
......
......
.--...
BP...
......
......
.C.S
......
.....Q
.T
.. ...
......
......
......
......
.. ..-
-..M
......
......
......
.--B.
.BP.
......
......
.....S
......
......
.T
.....
......
......
......
......
.....
..--
.. M
.....
......
......
..-
-N..B
P...B
B....
......
...S.
......
......
T....
.A.
......
......
......
......
....
..T--.
.MS.
......
......
.....-
-.S.V
P....
......
......
..S...
......
....T
.....A
.
......
......
......
......
....
..D--.
SMH.
......
......
.....-
-B..B
P....
......
......
..S...
......
....T
.....A
.
......
......
......
......
.....
..--.A
MH.
...
......
.....
.KY.
--N..B
P..G
......
......
...S.
......
......
......
A.
......
......
......
......
.. ..T
--A.M
Q..Q
KM...
....I.
....--
HS.V
PK..B
......
......
..S
......
......
......
. ...
......
....
..R...
......
..T--A
.MQ.
.QKM
......
.I....
.--HS
.VPK
..B...
......
.....S
......
S....
..T
.....
BVQ
LVBT
OO
GLV
QPK
GSL
KLSC
PASG
FSPN
T--N
AMNW
VRQ
APG
KGLB
WVA
RIRS
KSNN
YATY
YADS
VKDR
PTIS
RDDS
QSM
LYLQ
MNN
LKTB
DYYC
...
...
VWW
RM...
......
..A...
.T...
--Y...
......
......
......
..s
......
......
......
......
......
......
......
.. BV
QLLB
TGGO
LVQP
OOSR
GLSC
BGSG
FTFS
G--F
WM
SWVR
QTPG
KTLB
WIG
DINS
--~AI
NYAP
SI~R
FTIF
RDND
KSTL
YLQM
SNVR
SBDT
A~F~
RY
KPXQ
XW[T
CSIT
XFPI
TSG-
YYW
IWIR
QSPG
KPLB
~GYI
T---H
SGBT
FYNP
SLQS
PISI
TRBT
SKNQ
FFLQ
LNS~
BDT~
~~GD
GA
VQBS
GPoL
V.NS
.S.F
Ln...
.G...
...-..
......
......
......
.---.W
BNPL
QPIP
SRA.
S....
......
......
......
......
A ..
QVQL
VBTG
GGLV
RPGN
SLKL
SCVT
SOPT
BSN-
-YRM
HWLR
QPPG
KRLB
WIA
VI~D
~~~S
~GRF
ACSR
G BV
QLM
)SG
ABW
-PG
ASVK
LSCT
ASG
FNIK
D--D
YMHW
AKQ
RP~L
BWIG
RIDP
--AID
DTDY
APKF
QDK
ATM
ITSS
NIAY
LQSS
SSLT
SB~A
~YCP
Y ...
......
. ..-
......
......
......
--....
......
......
......
--....
......
......
......
......
......
......
...~
.. ...
......
. L.
RS...
......
......
..--Y
....V
....B
......
.W...
--BNG
..B...
...G.
...TA
.....T
....L
...
......
.....
......
....
..K...
......
......
...--S
....V
....B
......
.....-
-
.NGN
.K.D
....G
...IT
A....
.T.H
..L.R
...
....
......
.. ..L
.K...
......
......
...--T
....V
....B
......
.....-
-.NGN
.K.D
....G
...IT
A....
.T...
.L
......
......
.. ...
.....
..L.K
......
......
......
--T...
.V...
.B...
....V
...--.
NGIP
I.D...
.....I
TA...
..T
......
......
......
. ...
.....
..L.K
......
......
......
--T...
.V...
.B
......
.V
...--.
NGFP
N.D.
...G.
..ITA
.....T
...
......
......
. ..A
R ...
...
..L.R
...L.
....K
......
..--Y
....V
....B
......
.W
.. .--
BNGN
.I.D.
...G.
.SIT
A....
.T...
.L
......
.....
..AR
~VHL
QQ~G
~~LR
S~GS
~~LS
~FDS
BVF~
I-A~N
~WVR
QKPG
HGFB
WIG
DILP
--SIG
RTIY
GBKF
BDKA
TLDA
DTVS
NTAY
LBLN
SLTS
BDSA
IY~~
D
Rear
rang
ed
gene
= sta
tus+
PS
L6'
(2)
F 12
2 RF
T2
(2)
F 12
3 NF
12
4
L69
(7)
F 12
5 2B
7 (3
) F
126
TB32
(1
) F
127
C55-
7B3
(3)
F 12
8 NF
12
9 NF
13
0 AN
08
(5)
F 13
1
PS
PS
NFSD
13
2 M
RL-H
iston
e (7
) P
133
NF
134
87.9
2.6
(0)
F 13
5 NF
13
6 13
7 20
8 (1
5)
r 13
8
Fig.
l(e
) Fi
g.
1. M
ultipl
e am
ino
acid
sequ
ence
s ali
gnm
ent
of m
ice V
,, ge
rm-lin
e ge
ne s
egm
ents
. (r)
Po
sitio
ns
prim
arily
re
spon
sible
for
the
varia
ble
imm
unog
lobuli
n fo
ld (V
-lg-fo
ld)
cons
erve
d fe
atur
es
(Cho
thia
rr ul
., 19
88)
and
hype
rvar
iable
loop
defin
ition
(Cho
thia
and
Lesk
. 19
87).
With
in
this.
B s
tand
s fo
r re
sidue
s bu
ried
with
in
the
prot
ein;
T: r
esidu
es
in tu
rns;
1:
Inte
r-dom
ain
resid
ues;
V:
res
idues
be
twee
n B
and
C do
main
s (C
hoth
ia er
r nl.,
198
8);
1: H
l an
d 2:
H2
defin
ition
(Cho
thia
and
Lesk
, 19
87).
(8)
Resid
ue
num
berin
g as
in C
hoth
ia an
d Le
sk (
1987
). (x
) Vu
fam
ily
and
prot
otyp
e se
quen
ces.
(6
) Na
me,
clo
ne
or s
eque
nce
acce
ss n
umbe
r in
Genb
ank.
or n
ame
of t
he s
eque
nce
in th
e lite
ratu
re.
Supe
rscr
ipts
in
the
nam
e of
the
seq
uenc
e ind
icate
th
e st
rain
of
the
or
igin
of e
ach
of t
he s
eque
nces
as
fol
lows
: a:
C57
BL/6
; b:
BAL
B/c;
c:
C57
BL/6
J:
d: A
/J;
e: M
RL/M
pJ-L
PR/L
PR;
f: M
RL-L
PR/L
PR;
g: B
ALBi
cJ;
h: N
FS/N
; i:
BALB
/b;
j: BA
LB.K
; k:
M
RL/M
P-lp
r/lpr
; 1:
MRL
llpr;
m:
C57B
L/6
x BA
LB/c
. On
ly re
sidue
s wh
ich
diver
ge
with
re
spec
t to
the
pro
toty
pe
sequ
ence
s of
the
fam
ily
are
repr
esen
ted.
(8
) Nam
e (in
the
Kab
at’s
Data
base
) of
the
clos
est
V,,
rear
rang
ed
gene
and
nu
mbe
r of
am
ino
acid
diffe
renc
es
betw
een
this
and
the
germ
-line
gene
. (4
) F
stan
ds
for
sequ
ence
s wi
th
a re
arra
nged
co
unte
rpar
t (fu
nctio
nal);
NF
: No
n-fu
nctio
nal
sequ
ence
du
e to
not
hav
ing
a re
arra
nged
co
unte
rpar
t. Su
pers
cript
“S
D.”
mea
ns s
truct
ural
de
fect
s,
this
unde
rlined
in
the
sequ
ence
; PS
: Ps
eudo
gene
. In
serti
ons
or d
eletio
ns
that
pro
duce
fra
me
shift
cha
nges
in
the
amino
ac
id se
quen
ce
were
elim
inate
d to
obt
ain
the
mos
t co
rrect
im
mun
oglob
ulin-
like
sequ
ence
s.
Aest
hetic
s wi
thin
th
e se
quen
ces
mea
ns
a st
op c
odon
. Nu
mbe
rs
at t
he r
ight
mos
t pa
rt re
pres
ent
the
code
of
eac
h se
quen
ce
in Fi
g.
3. T
he
mult
iple
sequ
ence
s ali
gnm
ent
and
all t
he c
alcu
latio
ns
ther
ein
pres
ente
d we
re
mad
e by
usin
g th
e VI
R pa
ckag
e (A
lmag
ro
c’r ul
., 19
95).
elements or recombination signals (Tomlinson et al., 1992).

In vivo expression of the VH gene segments was assessed by assigning their amino acid sequences to the closest rearranged functional VH sequence in a database of 627 VH amino acid sequences compiled from the Kabat's Database on-line service (Kabat et al., 1991; see web site http://immuno.bme.nwu.edu). We chose the VH rearranged sequences having a reported specificity, in order to avoid non-productive rearrangements, thereby guaranteeing assignment of functional VH gene segments only. The database with the 627 VH amino acid sequences is available on request from the authors.

It is worth mentioning that the criterion for choosing the VH rearranged sequences, namely those having a reported specificity, may bias the set of rearranged sequences toward researchers' interests. However, inspection of the 627 sequences indicates that 137 different specificities are included. Moreover, many of the sequences reported as possessing the same specificity probably correspond to antibodies elicited against different epitopes, particularly in the case of large antigens like proteins. This increases the actual number of different specificities. Therefore, the database of VH rearranged sequences would be sufficiently heterogeneous to detect most of the functional VH gene segments of mice.

To determine structural defects, we analyzed those residues mainly responsible for the structurally conserved features of antibody V domains (Amzel and Poljak, 1979; Chothia and Lesk, 1987; Chothia et al., 1988). Such residues were originally derived from the analysis of the VL and VH domains of the seven antibodies of known three-dimensional structure (Chothia and Lesk, 1987; Chothia et al., 1988). However, the pattern depends to some extent on the number of structures analyzed. Currently, there exist atomic structures of more than 50 antibodies with different amino acid sequences, which allows the pattern to be updated. In addition, to further improve the update, we added the 627 VH amino acid sequences compiled from the Kabat's Database. This was done on the assumption that these sequences, having a reported specificity, are functional and should have no structural defects. The pattern is summarized in Table 1.

Table 1. Classification and repertoire of the mice VH gene segments

VH family(a)    Prototype member(b)
VH1             J558
VH2             Q52
VH3             36-60
VH4             X-24
VH5             7183
VH6             J606
VH7             S107
VH8             3609
VH9             GAM3-8
VH10            MRL-DNA4
VH11            CP3
VH12            CH27
VH13            3609N
VH14            SM7
VH15            VH15a

Number of VH gene segments, Estimated(c) and Found(d), including the Total row (values as extracted; the assignment of values to rows is not recoverable from this extraction): 60-1000, 15, 5-8, 120, 10, 2, 16, 4, 8, 10, 34, 2-3, 123-1073, 7, 185.

(a) VH gene families defined for mice: VH1 to VH14 (Kofler et al., 1992) and VH15 (Mainville et al., 1996).
(b) Name of the prototype sequence of each family.
(c) Number of sequences estimated by Southern blot hybridization and sequencing: VH families 1-14 (Kofler et al., 1992) and VH15 (Mainville et al., 1996).
(d) Number of VH germ-line genes and VH pseudogenes found in our compilation (see Fig. 1).
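The assignment of each germ-line segment to its closest rearranged sequence, described in the Material and Methods text, can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and counts amino acid differences position by position; the function names and the toy sequences are hypothetical and are not taken from the paper or from the VIR package.

```python
from typing import Dict, Tuple


def count_differences(a: str, b: str) -> int:
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_rearranged(germline: str, database: Dict[str, str]) -> Tuple[str, int]:
    """Return (name, differences) for the database entry nearest to `germline`."""
    if not database:
        raise ValueError("database is empty")
    name = min(database, key=lambda n: count_differences(germline, database[n]))
    return name, count_differences(germline, database[name])


# Hypothetical toy data for illustration only; not sequences from the paper.
toy_database = {"rearr_A": "EVKLVESGGG", "rearr_B": "QVQLQQSGAE"}
result = closest_rearranged("EVQLVESGGG", toy_database)
print(result)
```

The pair returned here corresponds to the two items reported in the "Rearranged gene" column of Fig. 1: the name of the nearest rearranged sequence and the number of amino acid differences.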
Determination of the canonical structures in H1 and H2

In structural terms, H1 has been defined as the hypervariable loop beginning at position 26 and finishing at position 32 (see head of Fig. 1). Three different sizes have been identified for this loop: canonical structures type 1 (seven residues), type 2 (eight residues) and type 3 (nine residues) (Chothia and Lesk, 1987; Chothia et al., 1989; Chothia et al., 1991).

On the other hand, H2 is defined from a structural point of view as the hypervariable loop running from position 52 to position 56 (Chothia and Lesk, 1987; Chothia et al., 1989). Currently, five different sizes have been found (Chothia et al., 1992; Tramontano et al., 1990). Early works assigned canonical structure type 1 to the shortest loop (5 residues), the next length (6 residues) to types 2 and 3 (these types share the same length but differ in their conformation), and type 4 was identified with the longest loop (8 residues). Recently, two other sizes for H2 have been distinguished in the functional VH gene segments of humans: one having 7 residues (between the size of types 2/3 and type 4) and named type 5 (Chothia et al., 1992), and another one shorter than type 1 (4 residues) named type 6 [I.M. Tomlinson, personal communication].
The patterns of residues determining the different canonical structures for H1 and H2 have been described in detail by Chothia et al. (1992). Starting from this pattern, we defined a new one (Fig. 2). This new pattern includes the recent analysis of Barré et al. (1994) in shark VH sequences, as well as our own analysis of recently solved VH X-ray structures (underlined amino acids in Fig. 2). For example, in H2, Valine (V) was added at position 71 in the pattern of type 2 because Fab 8F5 (Tormo et al., 1992) has this residue. This residue was not previously considered in the patterns (Chothia and Lesk, 1987; Chothia et al., 1989; Chothia et al., 1992) and does not modify the H2 conformation [the rms deviation of 8F5 in H2 when compared with NC41, a prototype of H2 type 2 (Chothia et al., 1989), is 0.36 Å].
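The loop-size rules stated in this section for H1 and H2 can be sketched in Python as a first filter. This is only a sketch under stated assumptions: it uses loop length alone, takes aligned loop strings in which "-" marks a gap, and returns candidate types. The names `H1_TYPES_BY_LENGTH`, `H2_TYPES_BY_LENGTH`, `loop_length` and `candidate_types` are hypothetical, and the residue patterns of Fig. 2 are not modelled.

```python
from typing import Dict, List

# Loop size -> canonical structure type(s), as stated in the text.
H1_TYPES_BY_LENGTH: Dict[int, List[int]] = {7: [1], 8: [2], 9: [3]}
H2_TYPES_BY_LENGTH: Dict[int, List[int]] = {4: [6], 5: [1], 6: [2, 3], 7: [5], 8: [4]}


def loop_length(loop: str) -> int:
    """Count residues in an aligned loop string, ignoring '-' gap characters."""
    return sum(1 for c in loop if c != "-")


def candidate_types(loop: str, table: Dict[int, List[int]]) -> List[int]:
    """Return the canonical structure types compatible with the loop size.

    An empty list means the size matches none of the sizes listed in the text.
    """
    return list(table.get(loop_length(loop), []))


# Example loop strings written in the gapped style used in the alignments.
print(candidate_types("GYTFTS--Y", H1_TYPES_BY_LENGTH))  # seven residues
print(candidate_types("YP--GSGS", H2_TYPES_BY_LENGTH))   # six residues
```

Because types 2 and 3 of H2 share the same length, length alone leaves both as candidates; separating them requires the residue patterns, such as the residue at position 71 discussed above.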
[Fig. 2: residue-pattern diagram not recoverable from this extraction.]
Fig. 2. Amino acid pattern for the canonical structure classes, defined as the simultaneous combination of canonical structures in a given sequence (Chothia et al., 1992). The amino acid residues are shown in one-letter code. X means any residue. Underlined residues are those differing with respect to the original pattern (see Material and Methods for details).
RESULTS
The known VH germ-line gene segments of mice
Although the exact number of gene segments in the entire mice VH germ-line gene repertoire is currently unknown, the complexity of most individual VH families has been established within a narrow range for several strains of the mouse (Kofler et al., 1992). Only the size of the largest family (VH1) is controversial, varying from 60 (Brodeur and Riblet, 1984) to ~1000 members (Livant et al., 1986). Several lines of evidence suggest, however, that the size of the VH1 family is closer to 60 than to 1000 (Kofler et al., 1992).

Based on the estimated complexity of the individual VH families of mice, we first established how representative our compilation of mice VH gene segments really was (Table 2). In most VH families the estimated number of genes and the number we found are in good agreement. We compiled 120 VH gene segments in the VH1 family (Fig. 1), supporting the proposition that the size of this family is indeed closer to 60 members than it is to 1000 (Kofler et al., 1992). In 9 other VH families (VH2, VH3, VH4, VH5, VH7, VH8, VH9, VH10 and VH12) the established quantities of VH gene segments are also similar to those we found (see Table 2), suggesting that these 9 VH families are well represented in our compilation.

Four VH families (VH6, VH11, VH13 and VH15) showed discrepancies when the estimated and found complexity were compared (see Table 2). In the VH6 family, fewer segments than expected were assigned. For the VH11, VH13 and VH15 families, no VH gene segments were found. Nonetheless, these families have one or only a few members (Table 2) and therefore their contribution to the whole mice VH germ-line gene repertoire should be minimal.
The functional VH germ-line gene segments of mice

Analysis of the expression in vivo of the 138 VH gene segments reported as germ-line genes and potentially functional suggests that only 72 of them are functional (Fig. 3). Of the 66 VH gene segments not expressed in vivo, and therefore defined as non-functional, 13 present structural defects when those residues responsible for the structurally conserved features of VH domains are analyzed (see Table 1 and the status column of Fig. 1). For example, 3 sequences within the VH1 family (VH145/Clegc, C19c and C15c; see Fig. 1) possess Serine (S) instead of Cysteine (C) at position 22. These sequences are unable to establish the disulfide bridge that stabilizes the standard fold of VH domains (Amzel and Poljak, 1979; Chothia and Lesk, 1987).
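The status assignment described above (F, NF, or NF with structural defects) can be sketched in Python. The sketch assumes residues are supplied as a mapping from position label to one-letter code. `ILLUSTRATIVE_PATTERN` is a hypothetical, deliberately tiny stand-in for the full pattern of conserved residues: position 22 (Cys) comes from the example in the text, while including position 92 as its disulfide partner is an assumption of this sketch.

```python
from typing import Dict, List, Set

# Illustrative subset only; the paper's pattern covers many more positions.
# Keys are position labels as strings so that labels such as "82c" also fit.
ILLUSTRATIVE_PATTERN: Dict[str, Set[str]] = {"22": {"C"}, "92": {"C"}}


def structural_defects(
    residues: Dict[str, str],
    pattern: Dict[str, Set[str]] = ILLUSTRATIVE_PATTERN,
) -> List[str]:
    """Return the pattern positions whose residue is missing or not allowed."""
    return sorted(pos for pos, allowed in pattern.items()
                  if residues.get(pos) not in allowed)


def status(has_rearranged_counterpart: bool, residues: Dict[str, str]) -> str:
    """F: rearranged counterpart found; NF-SD: none, with defects; NF: none."""
    if has_rearranged_counterpart:
        return "F"
    return "NF-SD" if structural_defects(residues) else "NF"


# A segment with Ser instead of Cys at position 22 and no rearranged counterpart.
print(status(False, {"22": "S", "92": "C"}))
```

Under this scheme a segment is only inspected for structural defects once it has failed to find a rearranged counterpart, which mirrors the order of the analysis in the text.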
In the remaining 53 sequences not showing structural defects, it was difficult to establish why they had no counterpart among the VH rearranged sequences. Hence, we can only infer that they have minor genetic defects outside the coding region. This hypothesis, however, could not be properly scrutinized because information outside the coding region is not reported for many sequences. Thus, we cannot discard the possibility that some of these VH genes are actually functional even though no rearranged counterpart was found. This is so because the database of rearranged sequences was built by choosing those sequences having a reported specificity, to avoid non-productive rearrangements, in spite of the fact that this would introduce some bias due to researchers' interests. However, the sample of rearranged sequences would be sufficiently heterogeneous (see Material and Methods section) to lead to the conclusion that, if some of the VH genes defined as non-functional are indeed functional, they should be exceptional.

Table 2. Pattern of residues determining the structural features of the V-Ig-fold(a)

Intra-domain positions
Position; residues buried between the β-sheets:
4 6 12 18 20 22 24 34 36 38 48 49 69 78 80 82 88 90 92
G A L V L I C G A V L W R K L V G A G A A L L M L I G A Y F C
VPFSH NPR VMTL IEKC A IMRKQ M V
STVPFD I IM A FW
V IMNQST IW RM F S SDTV VIMFLS FYVTIGS FISTV
MVFS STV
HN
Residues in turns:
15 G S; 42 G D; 66 R K; 67 V F; 82c V L; 86 D E
KEN EHAKVQRSTW AEHQT TSL IAG
MAPIT S
Inter-domain positions
Between variable domains:
37 V I; 39 Q H; 45 L F; 47 W Y; 91 F Y; 93 A L
FM A L EKLPR RPQ FIHCLS
HIS VTDKSGHMN
Between VH and CH domains:
11 L V ISFPT

(a) Residues differing with respect to the original pattern described by Chothia et al. (1988) are underlined. In italic, those residues identified in the 627 rearranged sequences.

The structural repertoire of functional VH germ-line gene segments
In Fig. 4 the canonical structure classes implicit in the 72 VH germ-line genes of mice defined as functional are shown. Seventy-one of them present patterns compatible with some canonical structure in H1. In H2, three sequences do not have a proper pattern to fit any of the canonical structures known to exist.

Analysis of the structural repertoire indicates that mice encode 6 canonical structure classes. Class 1-2 is the most frequent (64%), followed by class 1-3 (17%) and class 1-1 (7%). Classes 1-4, 3-1 and 2-1 are very poorly represented in the sequences (3%, 3% and 1%, respectively).
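The percentages quoted above follow from a simple tally over the 72 functional segments, which can be sketched in Python. `ASSUMED_COUNTS` is an assumption of this sketch: the counts were chosen to be consistent with the 72 segments, the one H1 mismatch, the three H2 mismatches and the reported percentages, and should be checked against Fig. 4 before reuse.

```python
from collections import Counter
from typing import Dict, Iterable


def class_percentages(labels: Iterable[str]) -> Dict[str, int]:
    """Percentage of sequences per canonical structure class, rounded to integers."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: round(100 * n / total) for cls, n in counts.items()}


# Assumed per-class counts (see lead-in); "unassigned" groups the sequences
# whose H1 or H2 loop does not fit any canonical structure pattern.
ASSUMED_COUNTS = {"1-2": 46, "1-3": 12, "1-1": 5, "1-4": 2, "3-1": 2,
                  "2-1": 1, "unassigned": 4}

labels = [cls for cls, n in ASSUMED_COUNTS.items() for _ in range(n)]
percentages = class_percentages(labels)
print(percentages)
```

The percentages are computed over all 72 functional segments, including those that fit no class, which is why the six named classes do not add up to 100%.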
Interestingly, the structural repertoire of mice is not randomly distributed among the VH families. Almost all sequences within a family encode the same canonical structure class (Fig. 4). Therefore, their structural rep- ertoire is family-specific, suggesting it to be preserved despite actual diversification of the VH gene segments.
Comparison between the structural repertoire of mice and humans

To compare the structural repertoire of mice and humans, those canonical structure classes implicit in the 51 functional VH germ-line genes of humans are depicted in Fig. 5. Unlike mice, humans encode 8 canonical structure classes (Fig. 6). Canonical structure classes 3-5 and 1-6, implicit in human sequences, were not found in the functional VH mice germ-line genes.

Canonical structure class 3-5 is encoded by germ-line 6-01/DP74, the only gene segment defining the human VH6 family (see Fig. 5). In mice, neither the sequences nor the pseudogenes compiled in Fig. 1 possess the proper size to fit canonical structure 5 in H2. Inspection of the 627 functional rearranged VH mice sequences indicates that this size is not present either. Therefore, it is unlikely that mice germ-line genes possess this class.

In the case of canonical structure class 1-6, one doubly sequenced pseudogene of mice (V31/VMU-3.1; Fig. 1) has 4 residues at the H2 loop, which is the size corresponding to canonical structure type 6. Because this size is found in 5 functional rearranged VH sequences [PY54, PY2 (Ruff-Jamison et al., 1991); 8H3 (Mukherjee et al., 1993); 246B.4g, 245F.6g (Limpanasithikul et al., 1995)], it would be reasonable to expect that this pseudogene has its functional counterpart in some mouse or in certain strains of mouse. Alternatively, the pseudogene encoding this loop size might have donated the segment comprising H2 to some functional gene segment by somatic gene conversion (Weill and Reynaud, 1996), so generating the rearranged sequences presenting this canonical structure class.

Differences between the structural repertoire of humans and mice are also found in the proportion by which these species encode classes 2-1 and 3-1 (Fig. 6). Class 2-1 is encoded only by one gene segment belonging to the mice VH3 family whilst class 3-1 is encoded by two sequences: those belonging to the VH8 family (see Fig. 4).
[Fig. 3: bar chart; x-axis: gene segment code (see status column of Fig. 1). Plot not recoverable from this extraction.]
Fig. 3. Usage of VH germ-line gene segments of mice.

[Fig. 4: per-sequence listing with columns Position, Family, Name, H1 (positions 24-34), H2 (positions 52-71) and VH CSC. Listing not recoverable from this extraction.]
Fig. 4. Structural repertoire of the functional VH germ-line gene segments of mice. VH CSC: canonical structure classes of VH. ?: means that the loop does not fit the canonical structure pattern. Those residues responsible for the mismatch are underlined.
[Fig. 5: per-sequence listing with columns Position, Family, Name, H1 (positions 24-34), H2 (positions 52-71) and VH CSC. Listing not recoverable from this extraction.]
Fig. 5. Structural repertoire of the functional VH germ-line gene segments of humans. VH CSC: canonical structure classes of VH. ?: means that the loop does not fit the canonical structure pattern. Those residues responsible for the mismatch are underlined.

[Fig. 6: bar chart comparing mice and humans; x-axis: canonical structure classes (1-1, 1-2, 1-3, 1-4, 1-6, 2-1, 3-1, 3-5, ?-1, 1-?). Plot not recoverable from this extraction.]
Fig. 6. Comparison of the VH structural repertoire in mice and humans.
In humans, canonical structure class 2-1 is encoded by three gene segments, while class 3-1 is implicit in 9 sequences: 6 from the VH4 family (half the sequences belonging to this VH family) and 3 from the VH2 family (see Fig. 5).

Besides the differences described above, classes 1-2 and 1-3 have inverted proportions in humans and mice (Fig. 6). The most common class in mice (1-2; ~64%) has a lower frequency in humans (~14%). Conversely, in humans the most frequent one is 1-3 (~39%), which has a relatively low frequency in mice (~17%). Among those found, this contrast is the most noticeable because it involves roughly half the structural repertoire of mice and humans. Since the human VH locus has been completely determined (Cook and Tomlinson, 1995), the scope of this striking difference depends on how complete and precise our compilation of the functional VH gene segments of mice turns out to be. Nonetheless, several observations support the validity of the difference found.
First, as previously stated, the structural repertoire of mice is family-specific. Because the largest family (VH1) encodes canonical structure class 1-2 (see Fig. 4), the structural repertoire of mice should remain dominated by this class even if we have overestimated the number of functional gene segments in this family. Second, the numbers of expected and found sequences in those VH families encoding class 1-3 are similar (see Table 2). Therefore, the estimate of the contribution of class 1-3 to the structural repertoire of mice should be correct. Third, among the families where no gene segments were found (VH11, VH13 and VH15), only the representative sequence of the VH11 family encodes class 1-3 (see Fig. 1), and this family has from one to six members (see Table 2). Thus, the contribution of this family to the proportion of class 1-3 in the structural repertoire of mice should be marginal. Finally, the representative members of the other two families in which no gene segments were found encode classes 1-4 and 2-2 (VH13 and VH15 families, respectively); therefore, they do not contribute to the total amount of classes 1-2 and 1-3. Altogether, these observations indicate that, when knowledge of the mice VH repertoire is completed, the difference between humans and mice regarding classes 1-2 and 1-3 might change quantitatively but not qualitatively.
DISCUSSION

In the preceding sections we have shown that humans and mice encode inverted proportions of canonical structure classes 1-2 and 1-3 in their VH germ-line genes. From a structural point of view, canonical structure classes 1-2 and 1-3 differ in the canonical structure of H2. The canonical structures 2 and 3 are the only two hypervariable loops that, having the same size (Fig. 2), display different conformations (Chothia and Lesk, 1987; Chothia et al., 1989). However, this change does not contribute much to the variations of the antigen-binding site shape (Vargas-Madrazo et al., 1995a). Thus, the difference found may be fortuitous, i.e., irrelevant for the mechanism of the immune response or, alternatively, such structural divergence may have a functional meaning.
From an evolutionary perspective, VH gene segments of mice and humans have been classified in three main groups or clans (Schroeder et al., 1990; Tutter et al., 1991; Kirkham et al., 1992). These clans represent three progenitor elements whose descendants have coexisted in the vertebrate genome for 200 million years (Anderson and Matsunaga, 1995) or more (Ota and Nei, 1994), before the divergence of humans and mice took place ~70 million years ago. Expansion and divergence from those three clans have generated the currently known 15 VH mice families and the 7 VH human families (Schroeder et al., 1990; Kirkham et al., 1992). Clans and families have preserved distinctive structural features, such as the framework 1 (FR1) and framework 3 (FR3) structures, throughout evolution. Structural preservation of these portions has been explained in terms of the essential roles they play in antibody function (Schroeder et al., 1990; Kirkham et al., 1992).
In contrast with the structurally conserved FR1 and FR3, it has been proposed that the hypervariable loops, being directly involved in the specific recognition of a wide variety of antigens, have been the target of strong environmental diversifying pressures in the course of evolution (Perlmutter et al., 1985; Schroeder et al., 1990; Kirkham et al., 1992; Sims et al., 1992; Litman et al., 1993). However, as already mentioned, the structural repertoire of mice is family-specific (Fig. 4), which implies restrictions on the random diversification of the hypervariable loop conformations (canonical structures) and of their combinations within the same VH segment (canonical structure classes). Although less prominently, the human repertoire shows the same family-specific feature (Fig. 5). Moreover, inspection of the structural repertoires of humans and mice, as classified by clans, shows that canonical structures are also clan-specific (Table 3). Therefore, preservation of the structural repertoire, even across species, strongly suggests restrictions operating to counteract the random diversification of the hypervariable loop structure.
A more detailed analysis of the evolutionary relationships of the VH repertoires of mice and humans reveals that the largest family in mice (VH1) belongs to clan I, while the largest one in humans (VH3) belongs to clan III (Schroeder et al., 1990; Kirkham et al., 1992). The fact that the largest families in their respective species have developed from different ancestral elements suggests that the VH gene segments of humans and mice have followed different evolutionary pathways. Interestingly, this divergence correlates well with the difference found in the structural repertoire. That is, the VH1 family of mice encodes canonical structure class 1-2 (see Fig. 4), while the VH3 family of humans mainly encodes class 1-3 (see Fig. 5). Therefore, this correlation, together with the suggestion that some mechanism preserves the structural repertoire, supports the proposition that the differences found have a functional meaning.
One possible explanation for the different development of the structural repertoires of mice and humans relies on the indirect or direct interaction of classes 1-2 and 1-3 with bacterial antigens or self-antigens termed superantigens (Zouali, 1995). In humans, for example, protein A of Staphylococcus aureus is highly specific for the VH3 family (Sasso et al., 1989; Sasso et al., 1991). This specificity is probably due to direct contact between the superantigen molecule and the VH3 family-conserved FR3 region of antibodies (Sasso et al., 1989; Sasso et al., 1991). In structural terms, residue 71 within the FR3 segment is the major determinant of the conformation of canonical structures 2 and 3 in H2 (Tramontano et al., 1990). That implies a close relationship between the H2 conformation and the FR3 region, which may indirectly account for the differences in classes 1-2 and 1-3. A more direct interaction of classes 1-2 and 1-3 with superantigens might also be conceived. Since H2 is adjacent to FR3 in the three-dimensional structure, these regions jointly form a continuous solvent-exposed area. Therefore, the shape of this area would change depending on the conformation of H2, which is in turn determined by position 71 in FR3. In that way, canonical structures 2 and 3 in H2, together with the FR3 structure, might be recognized directly by different superantigens. Since superantigens are family-specific and might be important within the immune response (Zouali, 1995), they would account for the different conservation and development of the specific structural repertoires of mice and humans.

Table 3. Comparison of the VH structural repertoire between humans and mice as classified by clans.

Clan(a)   VH CSC    Mice frequency (%)   Humans frequency (%)
I         1-2(b)    94.1                 50.0
          1-?        3.9                  7.1
          1-1        2.0                  -
          1-3        -                   42.9
II        1-1       60.0                 13.3
          3-1       20.0                 60.0
          2-1       10.0                 20.0
          ?-1       10.0                  -
          3-5        -                    6.7
III       1-3       75.0                 64.0
          1-4       12.5                  9.0
          1-1        6.3                 14.0
          1-?        6.3                  9.0
          1-6        -                    4.0

(a) Clan I includes the human VH1 and VH5 families, and the mice VH1, VH9 and VH14 families. Clan II is defined by the human VH2, VH4 and VH6 families, and the mice VH2, VH3, VH8 and VH12 families. Clan III consists of the human VH3 family and the mice VH4, VH5, VH7 and VH10 families (Schroeder et al., 1990). It should be noted that the VH14 family of mice was described after the classification of Schroeder et al. (1990). However, its members are very similar to the VH1 family (> 80%) and thus it is easy to assign them to clan I.
(b) The specific canonical structure classes for each clan are shown in bold.
A second explanation for the origin of the differences between the structural repertoires of mice and humans, and for their development and preservation once established, is that genes having canonical structure classes 1-2 and 1-3 play different regulatory roles in their respective species. In support of this, it is worth noting that the most frequently expressed sequence in the human repertoire is germ-line 3-23 (also called VH26, DP47, VH30p1 and VH182) (Stewart et al., 1992; Schwartz and Stollar, 1994). The 3-23 VH gene segment belongs to the VH3 family and possesses canonical structure class 1-3 (Fig. 4). Several lines of evidence suggest that over-expression of this gene segment and its idiotype (Id 16/6) is associated with important physiological roles (Stewart et al., 1992). In mice, frequent usage of the VH gene segment H10 (VH10 in Fig. 1) has been reported (Schiff et al., 1988), making it an equivalent of the human germ-line 3-23. The H10 gene has canonical structure class 1-2 (Fig. 4) and, although assigned to the VH14 family, it shares more than 80% nucleotide identity with sequences belonging to the VH1 family. This gene is used in response to different antigens (Schiff et al., 1988) and, in its germ-line configuration, it is used in anti-GAT antibodies as well as in the GAT idiotypic cascade (Schiff et al., 1988). That suggests a regulatory function for this gene segment within the immune response of mice, e.g., a role in the idiotypic network (Schiff et al., 1986). Thus, the development of the VH3 family in humans, particularly of those members having canonical structure class 1-3, and the development of the VH1 family as well as the closely related VH14 family in mice (which encodes class 1-2) would be associated with the regulatory roles these VH families (and classes) have had in the immune response of their respective species.
Finally, a third argument relates to the structural divergence of human and murine Vκ and Vλ germ-line genes on the one hand (Williams et al., 1996; Almagro et al., 1997) and to the differences between the human and murine repertoires of D gene segments on the other (Wu et al., 1993). It has been suggested that different VL impose restrictions on the use of some VH gene segments or VH families (Yurovky and Kelsoe, 1993). That indicates additional pressures acting on the divergence of the VH repertoire in humans and mice. Furthermore, it has been shown that H3 is significantly longer in human than in murine antibodies (Wu et al., 1993), which has been related to the different lengths present in the repertoire of D gene segments (Wu et al., 1993). Since a long H3 interacts directly with H1 and H2 (Chothia et al., 1987), this difference may also have shaped the currently known repertoires of human and murine VH genes. Of course, these restrictions do not exclude either of the other two explanations, i.e., regulatory pressures and/or specific interaction with other molecules (such as superantigens), which may well be complementary.
In summary, we have shown that the difference between the structural repertoires of VH germ-line genes of mice and humans may have a functional meaning. Although this difference does not strongly influence the antigen-binding site shape and, thus, cannot be directly related to the initial structural restrictions operating to recognize different types of antigens, it may indeed reflect species-specific regulatory and/or structural restrictions at work to balance the random diversification of the structural repertoire of VH gene segments. Therefore, the difference described here could be very useful as a guide to choose the most human-compatible murine antibodies for human therapy.
Acknowledgements: We thank I. Tomlinson for kindly providing the sequences of the 51 functional VH gene segments of humans, and H. Ceceiia and B. Levin for revision of the submitted manuscript. E. V. was supported by S-CONACyT grant no. 1843. This work was partially supported by grant IN213796 from DGAPA-UNAM.
REFERENCES
Almagro, J. C., Vargas-Madrazo, E., Zenteno-Cuevas, R., Hernandez-Mendiola, V. and Lara-Ochoa, F. (1995) VIR: A computational tool for analysis of immunoglobulin sequences. BioSystems 35, 25-32.
Almagro, J. C., Dominguez-Martinez, V., Lara-Ochoa, F. and Vargas-Madrazo, E. (1996) Structural repertoire in human VL pseudogenes of immunoglobulins: comparison with functional germline genes and amino acid sequences. Immunogenetics 43, 92-96.
Almagro, J. C., Hernandez, I., Ramirez, M. C. and Vargas-Madrazo, E. (1998) Characterization of the differences between the structural repertoire of Vκ germ-line gene segments of mice and humans. Immunogenetics (in press).
Amzel, L. M. and Poljak, R. J. (1979) Three dimensional structure of immunoglobulins. Annu. Rev. Biochem. 48, 961-997.
Anderson, A. and Matsunaga, T. (1995) Evolution of immunoglobulin heavy chain variable region genes: a VH family can last for 150-200 million years or longer. Immunogenetics 41, 18-28.
Barré, S., Greenberg, A. S., Flajnik, M. and Chothia, C. (1994) Structural conservation of hypervariable regions in immunoglobulins evolution. Nature Structural Biology 1, 915-920.
Brodeur, P. H. and Riblet, R. (1984) The immunoglobulin heavy chain variable region (Igh-V) locus in the mouse. I. One hundred Igh-V genes comprise seven families of homologous genes. Eur. J. Immunol. 14, 922-930.
Chothia, C. and Lesk, A. M. (1987) Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol. 196, 901-917.
Chothia, C., Lesk, A. M., Tramontano, A., Levitt, M., Smith-Gill, S. J., Air, G., Sheriff, S., Padlan, E. A., Davies, D., Tulip, W. R., Colman, P. M., Spinelli, S., Alzari, P. M. and Poljak, R. J. (1989) Conformations of immunoglobulin hypervariable regions. Nature 342, 877-883.
Chothia, C., Lesk, A. M., Gherardi, E., Tomlinson, I. M., Walter, G., Marks, J. D., Llewelyn, M. B. and Winter, G. (1992) Structural repertoire of the human VH segments. J. Mol. Biol. 227, 799-817.
Chothia, C., Boswell, D. R. and Lesk, A. (1988) The outline structure of the T-cell αβ receptor. EMBO J. 7, 3745-3755.
Cook, G. P. and Tomlinson, I. M. (1995) The human immunoglobulin VH repertoire. Immunol. Today 16, 237-242.
Crews, S., Griffin, J., Huang, H., Calame, K. and Hood, L. (1981) A single VH gene segment encodes the immune response to phosphorylcholine: somatic mutation is correlated with the class of the antibody. Cell 25, 59-66.
Hood, L., Gray, W. R., Sanders, B. G. and Dreyer, W. J. (1967) Cold Spring Harbor Symp. Quant. Biol. 32, 133.
Kabat, E. A. and Wu, T. T. (1971) Attempts to locate complementarity determining residues in the variable positions of light and heavy chains. Ann. N.Y. Acad. Sci. 190, 382-393.
Kabat, E. A., Wu, T. T., Perry, H. M., Gottesmann, K. S. and Foeller, C. (1991) Sequences of proteins of immunological interest. 5th Edn., Public Health Service, N.I.H., Washington, D.C.
Kirkham, P. M., Mortari, F., Newton, J. A. and Schroeder, H. W. Jr. (1992) Immunoglobulin VH clan and family identity predicts variable domain structure and may influence antigen binding. EMBO J. 11, 603-609.
Klein, R., Jaenichen, R. and Zachau, H. G. (1993) Expressed human immunoglobulin κ genes and their hypermutation. Eur. J. Immunol. 23, 3248-3271.
Kofler, R., Geley, S., Kofler, H. and Helmberg, A. (1992) Mouse variable-region gene families: complexity, polymorphism and use in non-autoimmune responses. Immunol. Rev. 128, 5-21.
Lara-Ochoa, F., Almagro, J. C., Vargas-Madrazo, E. and Conrad, M. (1996) Antibody-antigen recognition: a canonical structure paradigm. J. Mol. Evol. 43, 678-684.
Limpanasithikul, W., Ray, S. and Diamond, B. (1995) Cross-reactive antibodies have both protective and pathogenic potential. J. Immunol. 155, 967-973.
Litman, G. W., Rast, J. P., Shamblott, M. J., Haire, R. N., Hulst, M., Roess, W., Lipman, R. T., Hinds-Frey, K. R., Zilch, A. and Amemiya, C. T. (1993) Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Mol. Biol. Evol. 10, 60-72.
Livant, D., Blatt, C. and Hood, L. (1986) One heavy chain variable region gene segment subfamily in the BALB/c mouse contains 500-1000 or more members. Cell 47, 461-470.
Mainville, C. A., Sheehan, K. M., Klaman, L. D., Giorgetti, C. A., Press, J. L. and Brodeur, P. H. (1996) Deletional mapping of fifteen mouse VH gene families reveals a common organization for three Igh haplotypes. J. Immunol. 156, 1038-1046.
Mukherjee, J., Casadevall, A. and Scharff, M. D. (1993) Molecular characterization of the humoral responses to Cryptococcus neoformans infection and glucuronoxylomannan-tetanus toxoid conjugate immunization. J. Exp. Med. 177, 1105-1116.
Ota, T. and Nei, M. (1994) Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family. Mol. Biol. Evol. 11, 469-482.
Perlmutter, R. M., Berson, B., Griffin, J. A. and Hood, L. (1985) Diversity in the germline antibody repertoire. Molecular evolution of the T15 VH gene family. J. Exp. Med. 162, 1998-2016.
Poljak, R. J., Amzel, L. M., Avey, H. P., Chen, B. L., Phizackerley, R. P. and Saul, F. (1973) Three-dimensional structure of the Fab' fragment of a human immunoglobulin at 2.8 Å resolution. Proc. Natl. Acad. Sci. U.S.A. 70, 3305-3310.
Poul, M-A. and Lefranc, M-P. (1995) Structural correspondence between mouse and human immunoglobulin VH genes. Applications to the humanization of mouse monoclonal antibodies. Ann. N.Y. Acad. Sci. 764, 359-361.
Ruff-Jamison, S., Campos-Gonzalez, R. and Glenney, J. R., Jr. (1991) Heavy and light chain variable region sequences and antibody properties of anti-phosphotyrosine antibodies reveal both common and distinct features. J. Biol. Chem. 266, 6607-6613.
Sasso, E. H., Silverman, G. J. and Mannik, M. (1989) Human IgM molecules that bind staphylococcal protein A contain VHIII H chains. J. Immunol. 142, 2778-2783.
Sasso, E. H., Silverman, G. J. and Mannik, M. (1991) Human IgA and IgG F(ab')2 that bind to staphylococcal protein A belong to the VHIII subgroup. J. Immunol. 147, 1877-1883.
Schiff, C., Milili, M., Hue, I., Rudikoff, S. and Fougereau, M. (1986) Genetic basis for expression of the idiotypic network. One unique Ig VH germline gene accounts for the major family of Ab1 and Ab3 (Ab1') antibodies of the GAT system. J. Exp. Med. 163, 573-587.
Schiff, C., Corbet, S. and Fougereau, M. (1988) The Ig germline gene repertoire: economy or wastage? Immunol. Today 9, 10-14.
Schroeder, H. W. Jr., Hillson, J. L. and Perlmutter, R. M. (1990) Structure and evolution of mammalian VH families. Int. Immunol. 2, 41-50.
Schwartz, R. S. and Stollar, B. D. (1994) Heavy-chain directed B-cell maturation: continuous clonal selection beginning at the pre-B cell stage. Immunol. Today 15, 27-32.
Sims, M. J., Krawinkel, U. and Taussig, M. (1992) Characterization of germ-line genes of the VGAM3.8 VH family from BALB/c mice. J. Immunol. 149, 1642-1648.
Stewart, A. K., Huang, C., Long, A. A., Stollar, B. D. and Schwartz, R. S. (1992) VH-gene representation in autoantibodies reflects the normal human B-cell repertoire. Immunol. Rev. 128, 101-122.
Tomlinson, I. M., Walter, G., Marks, J. D., Llewelyn, M. B. and Winter, G. (1992) The repertoire of human germline VH sequences reveals about fifty groups of VH segments with different hypervariable loops. J. Mol. Biol. 227, 776-798.
Tomlinson, I. M., Cox, J. P., Gherardi, E., Lesk, A. M. and Chothia, C. (1995) The structural repertoire of the human V kappa domain. EMBO J. 14, 4628-4638.
Tonegawa, S. (1983) Somatic generation of antibody diversity. Nature 302, 575-581.
Tormo, J., Stadler, E., Skern, T., Auer, H., Kanzler, O., Betzel, C., Blaas, D. and Fita, I. (1992) Three-dimensional structure of the Fab fragment of a neutralizing antibody to human rhinovirus serotype 2. Protein Sci. 1, 1154-1161.
Tramontano, A., Chothia, C. and Lesk, A. M. (1990) Framework residue 71 is a major determinant of the position and conformation of the second hypervariable region in the VH domains of immunoglobulins. J. Mol. Biol. 215, 175-182.
Tutter, A. and Riblet, R. (1989) Conservation of an immunoglobulin variable-region gene family indicates a specific noncoding function. Proc. Natl. Acad. Sci. U.S.A. 86, 7460-7464.
Tutter, A., Brodeur, P., Shlomchik, M. and Riblet, R. (1991) Structure, map position, and evolution of two newly diverged mouse Ig VH gene families. J. Immunol. 147, 3215-3223.
Vargas-Madrazo, E., Lara-Ochoa, F. and Almagro, J. C. (1995a) Canonical structure repertoire of the antigen-binding site of immunoglobulins suggests strong geometrical restrictions associated to the mechanism of immune recognition. J. Mol. Biol. 254, 487-504.
Vargas-Madrazo, E., Almagro, J. C. and Lara-Ochoa, F. (1995b) Structural repertoire in VH pseudogenes of immunoglobulins: comparison with human germline genes and human amino acid sequences. J. Mol. Biol. 246, 74-81.
Weill, J-C. and Reynaud, C-A. (1996) Rearrangement/hypermutation/gene conversion: when, where and why? Immunol. Today 17, 92-97.
Williams, S. C., Frippiat, J-P., Tomlinson, I. M., Ignatovich, O., Lefranc, M-P. and Winter, G. (1996) Sequence and evolution of the human germline Vλ repertoire. J. Mol. Biol. 264, 220-232.
Wu, T. T. and Kabat, E. A. (1970) An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity. J. Exp. Med. 132, 211-250.
Wu, T. T., Johnson, G. and Kabat, E. A. (1993) Length distribution of CDRH3 in antibodies. Proteins 16, 1-7.
Yurovky, V. and Kelsoe, G. (1993) Pairing of VH gene families with the λ light chain: evidence for a non-stochastic association. Eur. J. Immunol. 23, 1975-1979.
Zouali, M. (1995) B-cell superantigens: implications for selection of the human antibody repertoire. Immunol. Today 16, 399-405.