
Glycobiology vol. 6 no. 7 pp. 657-663, 1996

MINI REVIEW

Emergence of the immunoglobulin family: conservation in protein sequence and plasticity in gene organization

John J. Marchalonis¹, Ralph M. Bernstein, Shan Xiang Shen and Samuel F. Schluter

Microbiology and Immunology, University of Arizona, Tucson, AZ 85724, USA

¹To whom correspondence should be addressed

Key words: evolution/light chains/heavy chains/recombination activator genes/T-cell receptors

Introduction

The immune response of jawed vertebrates is an inducible, highly specific defense mechanism that is characterized by an enormous diversity in recognition capacity. This allows responses to potential pathogens that have never previously been encountered in the evolution of the species. The molecules that carry out the specific recognition of antigen are glycoproteins termed immunoglobulins (Igs) or T-cell receptors (TCRs) that occur as heterodimers consisting of pairs of light and heavy chains (Igs) or α/β or γ/δ TCR chains (Kabat et al., 1991). These molecules are homologous members of the same family that express variable (V) and constant (C) domains, with the V/V pairs containing the combining site for antigen and the C domains involved in dimerization and effector function. The genes specifying each chain comprise individual multigene families which, in mammals, contain large arrays of variable segments (50-300), 1-10 diversity (D) segments (heavy chains, TCR β chains), a set of joining (J) segments (a few for Igs, but as many as 100 for TCR α chains), and a few constant domains (1 for κ light chains, approximately 10 for the heavy chain translocon). Individual sequence diversity and the commitment of a B or T cell to differentiation into an antigen-specific immunocyte result from the recombination of a V, D, and J element to form a complete variable region for surface expression with an appropriate constant region. Since Igs and TCRs exist as heterodimers, further amplification of recognition capacity is contributed by the selection of the partner chain; for example, if there are 5000 possible complete VH (VDJ) segments and 1500 possible complete Vκ (VJ) segments, there are 7,500,000 possible VH/Vκ combining sites that can be formed. This calculation illustrates the magnitude of individual combining sites that can be generated, but is an underestimate because other mechanisms, including somatic mutation and junctional diversification within the D and J segments, contribute additional possibilities. We will summarize recent evidence to illustrate that the fundamental genetic mechanism allowing the combinatorial diversification of antibody occurs in all jawed vertebrates, and is clearly present in representatives of the most primitive of living gnathostomes, the sharks.
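The pairing arithmetic quoted in the Introduction can be reproduced in a few lines. This is only a sketch of the multiplication stated in the text; the function name `combining_sites` is introduced here for illustration, and the calculation assumes, as the text does, that any complete heavy chain V region may pair with any complete light chain V region.

```python
# Figures quoted in the text for the worked example.
complete_vh = 5000       # possible complete VH (VDJ) segments
complete_vkappa = 1500   # possible complete V-kappa (VJ) segments


def combining_sites(heavy: int, light: int) -> int:
    """Number of distinct heavy/light pairings under free combination."""
    return heavy * light


total = combining_sites(complete_vh, complete_vkappa)
print(f"{total:,} possible VH/V-kappa combining sites")
```

As the text notes, the product is a lower bound, because somatic mutation and junctional diversification are not counted.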

Early in the course of biochemical studies of Igs, Hill and his associates (Hill et al., 1966) made the seminal discovery that V and C domains of light and heavy chains were homologous to one another and proposed a scheme for the evolution of Igs based upon their derivation from a domain-size precursor of approximately 110 residues. Attempts to gain an understanding of the 'big picture' of the evolution of Igs required the application of recombinant DNA technology. We will review recent data showing that clearly defined homologs of light chains (Schluter et al., 1989; Shamblott and Litman, 1989a; Greenberg et al., 1993; Rast et al., 1994; Hohman et al., 1995), heavy chains (Kokubu et al., 1988a,b; Vazquez et al., 1992; Shen et al., 1996), and TCRs (Rast and Litman, 1994) are present in the most primitive of jawed vertebrates. In addition, sharks contain an additional class of Ig having many of the properties expected of the primordial Ig (Bernstein et al., 1996b) as proposed by Hill et al. (Hill et al., 1966). Overall, the data obtained on the evolution of Igs, and their close relatives the TCRs, support the concept that approximately 450 million years ago (coincident with the origins of ancestral vertebrates) a 'big bang' (Marchalonis and Schluter, 1990a) of gene duplication occurred which, coupled with the addition of DNA processing enzymes facilitating recombination (Greenhalgh et al., 1993; Bernstein et al., 1994, 1996a), generated the combinatorial immune response typical of vertebrates.

Preconditions for expression of the combinatorial immune response

Attempts to delineate relics of initial steps in the combinatorial immune response by investigating lower deuterostomes such as echinoderms and tunicates (Klein, 1989; Marchalonis and Schluter, 1990a,b; Smith and Davidson, 1992) and even the most primitive living vertebrates, the agnathan cyclostomes as represented by lampreys and hagfish, have thus far been inconclusive. Although a number of workers have described antibody-like molecules in cyclostomes (Marchalonis and Edelman, 1968; Litman et al., 1970; Raison et al., 1978), definitive structural studies or gene characterizations have not been published. Lampreys and hagfish have, however, been shown to possess molecules clearly related to complement components of higher vertebrates (Hanley et al., 1992; Nonaka and Takahashi, 1992).

The definitive characteristic of the vertebrate immune response is the combinatorial rearrangement of gene segments to provide the enormously diversified arrays of recognition molecules. RAG-1 is an essential component of the recombination process. We have isolated genes from ancient organisms that specify the recombinase activating gene 1 (RAG-1) using evolutionary PCR with primers based upon conserved peptide segments (Bernstein et al., 1994). These primers allowed the isolation of highly homologous partial gene segments from Carcharhine sharks, the paddlefish (a chondrostean), the axolotl (an amphibian), the goldfish (a teleost), and the pig (Bernstein et al., 1994). RAG-1 from the bull shark, Carcharhinus leucas (Bernstein et al., 1996a), consists of 1112 amino acid residues that can be arranged into six domains expressing degrees of identity to human varying from less than 20% to greater than 80%. Thus, there is no question that the most primitive of living jawed fishes, as represented by sharks, possess the genetic machinery essential for the expression of the combinatorial immune response. An extremely interesting homology between a segment of the shark RAG-1 and members of the bacterial integrase family of site-specific recombinases possibly provides a significant clue to the function of the molecule. Since RAG-1 works in association with RAG-2 in the recombination of immunoglobulin genes (Schatz et al., 1992), we analyzed RAG-2 for homology to integration host factors (IHF) which act in association with integrases in microbial systems. There was sufficient homology between RAG-2 and IHF (Bernstein et al., 1996a) to suggest that vertebrate recombinase activating genes and bacterial site-specific recombinases share common ancestry and have a similar mechanism of action. We and others have tried these same PCR primers, which have worked successfully in all gnathostome vertebrates, to clone homologs from agnathans and lower deuterostomes with no success. Thus, the apparent sudden appearance of the RAG gene system in the jawed vertebrates may have been due to a lateral transfer of microbial genes prior to 400 million years ago as was first suggested by Schatz et al. (Schatz et al., 1989).

© Oxford University Press

[Figure 1: radial phylogenetic tree, image not reproduced. Leaf labels include V domains of light chains, heavy chains, and TCRs from human, mouse, rabbit, sheep, trout, Xenopus, Hydrolagus, Raja, sandbar shark, nurse shark, horn shark, and bull shark, together with NAR V.]

Fig. 1. Phylogenetic radial tree analysis showing the relationships of variable region domains from immunoglobulins and T-cell receptors. The tree was constructed using the computer programs CLUSTALW (Thompson et al., 1994), which uses the neighbor-joining method (Saitou and Nei, 1987), and DRAWTREE (Felsenstein, 1996). Bootstrapping was performed 10,000 times. Amino acid sequences derived from gene sequence were used in the analysis. Sequences are available from the GenBank database, except sandbar shark VH sequences (Shen et al., 1996).

Igs of gnathostomes

Igs and TCRs comprise a family of antigen-specific recognition molecules showing greater than 30% identity to one another when compared throughout the group of jawed vertebrates (Marchalonis and Schluter, 1989). Neither they nor the recombination mechanisms necessary for their generation (Klein, 1989; Bernstein et al., 1994; Thompson, 1995) have yet been found below the phylogenetic level of Chondrichthyan fishes. However, the combinatorial immune system is essentially complete at this lowest level of extant jawed vertebrates because the genes necessary for combinatorial rearrangement and expression of bona fide Ig family members occur in broad profusion in diverse members of this vertebrate class (Marchalonis and Schluter, 1990b; Greenhalgh et al., 1993; Bernstein et al., 1994). The distinctions between variable and constant domains, light chains and heavy chains, and TCRs were firmly established by the advent of the cartilaginous fishes.

[Figure 2: rooted phylogenetic tree, image not reproduced. Labeled groups: immunoglobulin heavy chain, immunoglobulin light chain, T-cell receptor, MHC class I, MHC class II. Leaf labels include MHC class I and class II domains (human, carp, chick, nurse shark), human β2 microglobulin, moth hemolin, bullfrog Cκ, chick and human Cλ, and human, mouse, and chick TCR β.]

Fig. 2. Rooted phylogenetic tree depicting the relationships among immunoglobulin constant domains of light chains, heavy chains, T-cell receptors, and MHC products. The ultimate carboxyl-terminal domain was analyzed for heavy chains (e.g., Cμ4 for IgM, Cγ3 for IgG). Only the constant domain-like segments of the MHC products (Marchalonis et al., 1984, 1994) and hemolin (Sun et al., 1990) were included in the analysis. The tree was constructed using the progressive alignment procedure of Feng and Doolittle (Feng and Doolittle, 1990).

Figure 1 is a phylogenetic tree that illustrates the clustering of variable domain sequences of light chains, heavy chains, and TCRs of elasmobranchs and various higher vertebrates. In addition to VH domain sequences, data are included for the V domain of the sandbar shark ω heavy chain, a recently discovered molecule (Bernstein et al., 1996b; Greenberg et al., 1996) which has six CH domains and a new set of variable domains related to, but distinct from, those of traditional μ chains. This molecule has properties expected of the primordial Ig (Hill et al., 1966) or heavy chain in that it shows closer homology in comparison of its V and C domains to one another than do the μ heavy chains or λ-like light chains in the same species. In this diagram, the heavy chains clearly form a cluster distinct from that of the light chains and TCRs, and the Vω domains can be used to root the VH groupings. It is interesting that a horned shark VH domain belonging to the VH II set as defined by Litman and his colleagues (Hinds-Frey et al., 1993) clusters with the Vω domains. We have defined this grouping as the primordial VH set. The other shark VH domains form a distinct group with human VHIII and amphibian, mouse, and rabbit VH structures.

[Figure 3: sequence alignment, residues not recoverable from the extracted text. Rows: Hu Vβ8.1 (I), HnS Vβ, Hu VλV, Mu Vκ, SbS Vλ. Regions marked: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.]

Fig. 3. Comparative alignment of complete V domains of immunoglobulin light chains and T-cell receptor β chains. Human (Hu), horned shark (HnS), mouse (Mu), and sandbar shark (SbS) sequences are referenced in the text. Identities between Vβ and VL sequences are shaded and alternate identities are boxed. The framework (FR) and complementarity determining regions (CDR) are positioned according to the human Vλ sequence (Kabat et al., 1991).


FR1 and CDR1
  Hu VHIII   EVQLLESGGGLVQPGGSLRLSCAASGFTFSSYAMS
  SbS VH     DVVLTQPDEETGHPGDSLKLTCSTSGFKLANYWMG
  SbS Vω     DIVLTQPEGVVKKPGETVRLSCAVSGFDIARVYIS

FR2 and CDR2
  Hu VHIII   WVRQAPGKGLEWV-SAISGSGGSTYYADSVKG
  SbS VH     WIRQVPGQGLEWLVSYYSTESN--YYEPSIQG
  SbS Vω     WVKQGPGKGLEWLLYHDSRPQG FAPGIEG

FR3
  Hu VHIII   RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK
  SbS VH     RFTTSKDNN--MFSLHMTTLKTEDTAIYYCAR
  SbS Vω     RFSPSAVSN--TAYIEITSLRADDTAIYYCAR

CDR3 and FR4
  Hu VHIII   GQVLYYGSGSYHWFDPWGQGTLVTVSS
  SbS VH     VRSSSVAKGTGDWGQGTMVTVTT
  SbS Vω     GEYEYSTPFDYWGSGTFVEVTS

Fig. 4. Comparative alignment of heavy chain V regions of humans (Hu) (Kabat et al., 1991) and the sandbar shark (SbS) (Bernstein et al., 1996b; Shen et al., 1996). Identities are shaded. The framework (FR) and complementarity determining regions (CDR) are positioned according to the human sequence.
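As a rough illustration of how the conservation visible in an alignment such as Fig. 4 can be quantified, the sketch below computes pairwise percent identity over the FR3 rows as printed in the figure. The function name `percent_identity` and the decision to skip columns containing a gap are assumptions made for this example; this is not the procedure used to build the trees in Figures 1 and 2.

```python
# FR3 rows transcribed from Fig. 4; '-' marks an alignment gap.
FR3 = {
    "Hu VHIII":    "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK",
    "SbS VH":      "RFTTSKDNN--MFSLHMTTLKTEDTAIYYCAR",
    "SbS V-omega": "RFSPSAVSN--TAYIEITSLRADDTAIYYCAR",
}


def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


names = list(FR3)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        value = percent_identity(FR3[first], FR3[second])
        print(f"{first} vs {second}: {value:.1f}% identity")
```

Because only one framework region of three sequences is compared, the values say little about whole-domain identity across a multigene family.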

We consider these shark domains to belong to an 'archaic' VH set that also includes some teleost fish VH domains as defined by Andersson and Matsunaga (Andersson and Matsunaga, 1995). The TCR domains can be considered either to form a cluster distinct from that of true light chains, but closely related to it, or to provide the base of the light chain grouping. The Vλ-like domains of ratfish (Hydrolagus), skate (Raja), and sandbar shark (SbS) form a cluster of their own that branches off from human Vλ.

We would emphasize that the phylogenetic relationships generated here must be considered approximate because each of these V domain sets represents a large diverse multigene family. The fact that such schemes can be generated results from the conservation of residues essential for proper folding to provide a basis for domain/domain interaction and for the stabilization of combining sites for antigen.

Figure 2 is a dendrogram depicting the relationships among Ig constant domains of light chains and TCRs compared with the most V-region-distal domains of heavy chains (e.g., the third constant domain of the γ chain and the fourth constant domain of the μ chain) and the IgC-like domains of MHC products and the inducible Ig-superfamily molecule of the cecropia moth (hemolin). The actual identity between hemolin domains and those of traditional Ig C regions is less than 20%, which is indicated by the distance in this diagram. The MHC domains and human β2 microglobulin form a separate cluster that shows the expected separations into Class I and Class II, distinctions between α and β chains, and phylogenetic discrimination within the clusters. Heavy chains and light chains form separate groupings within the Ig constant domains, and the TCR chains map as an early offshoot of the light chains. The sandbar shark Cλ clusters with the Cλ group, and κ light chains of mammals and amphibians form their own grouping. In addition to the λ-like light chains of sharks (Schluter et al., 1989; Hohman et al., 1995), the horned shark has a separate type apparently restricted to Chondrichthyans (Shamblott and Litman, 1989a), and the nurse shark has a κ-like light chain (Greenberg et al., 1993). All of these individual light chain types are found in sharks, but the relative proportions vary in different species. For example, λ-like light chains comprise the major group of families in Carcharhine sharks, while the major light chain type in the nurse shark is κ-like, and that of the horned shark is 'horned shark like' (termed VL I by Rast et al., 1994).

As was first predicted by Singer and Doolittle (Singer and Doolittle, 1966) and shown by Hill et al. (Hill et al., 1966), VH and VL are as homologous to one another as are the V domains to the constant domains. Figure 3 shows a comparative sequence alignment of the complete V domains (V/(D)/J) of light chain-like Igs as represented by the type I TCR Vβ chains of humans (Kabat et al., 1991), a homologous Vβ of the horned shark (Rast and Litman, 1994), a human Vλ5 domain that has been completely characterized in terms of sequence and 3-dimensional structure (Edmundson et al., 1976), a murine Vκ domain (Kabat et al., 1991), and a sandbar shark Vλ structure (Hohman et al., 1992). This diagram illustrates common features of the Ig variable domain. All of the sequences can be readily mapped into framework (Fr) and complementarity determining regions (CDR) by homology with the human Vλ5 molecule. In 3-dimensional folding, the frameworks form the scaffolding allowing the molecule to have a conformation necessary for VL/VH interaction and provide stabilization for the CDRs that contain the contact residues in antigen recognition. In general, the Frs contain β band structure and CDRs consist of regions of various structures forming loops between β bands. Bona fide Ig family domains contain an internal disulfide bond formed between the cysteine at the end of Fr1 and the cysteine two residues N-terminal to the end of Fr3. The regions N-terminal to these cysteines are highly conserved in both positions and lie within β bands. By contrast, the residues C-terminal to these cysteines are involved in CDRs and show great variation. In both cases, the CDRs are terminated by highly conserved regions in the second and fourth Frs. These definitive characteristics are found in comparison of TCR β chains with light chains and also within these groups in a phylogenetic comparison between species comprising approximately 450 million years of divergence time.

[Figure 5: sequence alignment, residues not recoverable from the extracted text. Rows: Hu Cλ, SbS Cλ, Hu Cβ, HnS Cβ. β bands marked above the alignment: 4-1 (A), 4-2 (B), 3-1 (C), 4-4 (D), 4-3 (E), 3-3 (G).]

Fig. 5. Comparative alignment of light chain constant region type domains: human (Hu) Cλ (Kabat et al., 1991), sandbar shark (SbS) Cλ (Schluter et al., 1989), human Cβ (Kabat et al., 1991), and horned shark (HnS) Cβ (Rast and Litman, 1994) constant region sequences are shown. The positions of β bands identified by x-ray crystallography of the human molecule (Edmundson et al., 1976) are indicated. Sequence identities are shaded, and alternate identities are boxed.

[Figure 6: gene organization diagram, image not reproduced. Shark panel: fused Vλ (VL) cluster and fused and unfused Chondrichthyan VH clusters. Human panel: κ translocon and heavy chain translocon. Segment counts and distances in kb are not recoverable from the extracted text.]

Fig. 6. Diagram comparing the immunoglobulin genomic gene organization patterns of sharks and humans. The cluster-type arrangement of sharks consists of cassettes each composed of V, (D), J, and C gene segments. These cassettes are duplicated many times. The particular patterns shown here are for the sandbar shark (Hohman et al., 1993, 1995; S. Shen, unpublished observations). In contrast, the translocon-type pattern consists of large arrays of V region segments arranged in tandem with a limited number of (D), J, and C segments (Lai et al., 1989; Cook and Tomlinson, 1995). For each immunoglobulin polypeptide (e.g., heavy chain, λ light chain, κ light chain) there is only one functional locus.

Figure 4 shows a comparable alignment of VH-type domains of humans and sharks containing a VH3 structure, a sandbar shark VH derived from a classical IgM molecule, and a representative of the primitive Vω family. The degree of conservation in the heavy chains is striking and results from selective pressures to preserve residues necessary for proper folding and domain interaction. The Fr2 is highly conserved, containing the Gly-Leu-Glu-Trp (GLEW) motif characteristic of heavy chains (see Figure 3 above for contrast with light chains). The third frameworks are initiated by the dipeptide Arg-Phe (RF) and contain the universally conserved aspartic acid (D) six residues N-terminal to the cysteine (C) and the Tyr-Tyr-Cys-Ala (YYCA) motif. The fourth framework contains the motif WGXGT with the next residue being uncharged or hydrophobic and shows subsequent conservation of hydrophobic residues. By contrast, the Fr4 in the light chain/TCR-type molecules has the motif Phe-Gly-X-Gly-Thr (FGXGT) with the next residue characteristically positively charged (either R or K). These motifs are preserved throughout the evolution of jawed vertebrates and are present in definitive form in living representatives of the most anciently emerged jawed vertebrates.
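The heavy chain motifs just described can be checked directly against the rows of Fig. 4. The sketch below is a minimal illustration, assuming the sequences as printed in that figure; the dictionary layout, the function name `heavy_chain_motifs`, and the regular expressions are choices made for this example and are not part of the original analysis.

```python
import re

# Rows transcribed from Fig. 4 ('-' marks an alignment gap).
# Each entry: (FR2 + CDR2 row, FR3 row, CDR3 + FR4 row).
# The blank printed inside the SbS V-omega FR2/CDR2 row is dropped here.
ROWS = {
    "Hu VHIII": ("WVRQAPGKGLEWV-SAISGSGGSTYYADSVKG",
                 "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK",
                 "GQVLYYGSGSYHWFDPWGQGTLVTVSS"),
    "SbS VH": ("WIRQVPGQGLEWLVSYYSTESN--YYEPSIQG",
               "RFTTSKDNN--MFSLHMTTLKTEDTAIYYCAR",
               "VRSSSVAKGTGDWGQGTMVTVTT"),
    "SbS V-omega": ("WVKQGPGKGLEWLLYHDSRPQGFAPGIEG",
                    "RFSPSAVSN--TAYIEITSLRADDTAIYYCAR",
                    "GEYEYSTPFDYWGSGTFVEVTS"),
}


def heavy_chain_motifs(fr2_row: str, fr3_row: str, fr4_row: str) -> dict:
    """Report which of the motifs named in the text occur in each row."""
    fr2, fr3, fr4 = (r.replace("-", "") for r in (fr2_row, fr3_row, fr4_row))
    return {
        "GLEW in Fr2": "GLEW" in fr2,
        "Fr3 starts with RF": fr3.startswith("RF"),
        # D six residues N-terminal to the Cys of YYCA:
        # D, three residues, then Y-Y-C-A.
        "D six residues before C": re.search(r"D.{3}YYCA", fr3) is not None,
        "WGXGT in Fr4": re.search(r"WG.GT", fr4) is not None,
        # Light chain/TCR-type Fr4 motif, expected to be absent here.
        "FGXGT[R/K] in Fr4": re.search(r"FG.GT[RK]", fr4) is not None,
    }


for name, rows in ROWS.items():
    print(name, heavy_chain_motifs(*rows))
```

For these three rows the four heavy chain checks succeed and the light chain Fr4 pattern is not found, which matches the contrast drawn in the text.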

Figure 5 gives a comparative alignment of CL-type domains including Cλ and Cβ of humans and sharks. This figure includes the β bands defined by x-ray crystallography of the human molecule. Ig and TCR domains consist of two β-pleated sheets, one made up of four β bands and the other of three. Both variable and constant domains have this basic structure, but variable domains often have two extra β bands defining the second complementarity determining region. There is substantial homology within the β bands of these four domains, particularly in the 4-2 (B) band that is involved in forming the contact surface with either CH1 or Cα, respectively. In addition, the sentinel tryptophan (W) that is 14 residues distal to the cysteine is uniformly conserved. The constant domains of human and shark λ chains form extremely similar predicted 3-dimensional structures as illustrated by the similarity in β band size and sequence and length of intervening loops (Schluter et al., 1989).

Genomic organization

The genomic organization of Ig genes that incorporates V, D (in some cases), J, and C individual recombining gene segments is essentially conserved throughout phylogeny. Despite this conservation, however, there are a number of fundamental differences (Hinds and Litman, 1986; Kokubu et al., 1988b; Shamblott and Litman, 1989b; Hohman et al., 1993; Marchalonis et al., 1993) in the 'macro' (clustering or translocon) structure of Ig genes. These are illustrated in Figure 6. The basic mammalian translocon system is exemplified by the κ light chain gene. A large array of discrete V segments is separated from a small number of J segments, which are followed in turn by a single C segment. The V region segments are flanked on their 3' end, and the J segments on their 5' end, by recombination signal sequences (RSS). The RSSs are the genomic signature elements that allow the lymphocyte's recombination machinery to specifically recognize and recombine the coding sequences. During the creation of an antibody gene, the V region is rearranged and fused to a small joining (J) segment. After rearrangement, the fused V(D)J segment is transcribed along with a downstream constant (C) region gene segment which determines the isotype and thereby the effector function of the antibody. The intervening transcript is removed by mRNA splicing. This procedure is nearly identical for both antibodies and TCRs.

In the case of a shark heavy chain, the 'micro'-organization remains the same as stated for the mammalian system, that is, V segments rearrange to D and J segments. The macro-organization, however, is radically different. A single V region is grouped into a cluster model of organization, wherein one may find a V, followed by one or two D segments, followed by a single J segment which is nearly immediately followed by a C region (Kokubu et al., 1988b; Shen et al., 1996). This system of organization seems to select for rearrangement within clusters, as no evidence for rearrangement between any of the estimated 100-200 V(D)JC VH clusters has yet been identified (Hinds-Frey et al., 1993). Interestingly, heavy chain clusters have also been found in fused and partially fused arrangements, that is, a V region is fused to a D segment in the genome (Shen et al., 1996). In the case of the sandbar shark light chain (Hohman et al., 1993), all genomic clusters are VJ fused in the germline, followed by a C region. This has led to speculation that the shark immune repertoire is limited, as the junctional and combinatorial diversity seen in higher vertebrates seems to be only partially utilized in the shark (Hinds and Litman, 1986). However, in cDNA studies of shark light and heavy chain V regions, high levels of diversity are always present, with no two identical sequences ever being recovered (Hohman et al., 1995; Bernstein et al., 1996b; Shen et al., 1996).
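To make the contrast between the two macro-organizations concrete, the sketch below counts germline segment combinations under each arrangement. It is a simplified illustration only. The segment counts are hypothetical round numbers chosen for the example (the 150 clusters fall inside the 100-200 range estimated above), the function names are introduced here, and treating the D segments of a cluster as alternatives is an assumption. Junctional diversity and somatic mutation are deliberately left out, so the counts say nothing about the expressed repertoire, which the cDNA studies show to be highly diverse.

```python
def translocon_combinations(n_v: int, n_d: int, n_j: int) -> int:
    """Any V may join any D and any J within a single locus."""
    return n_v * n_d * n_j


def cluster_combinations(clusters: list[tuple[int, int, int]]) -> int:
    """Rearrangement is confined to each cluster, so per-cluster products are summed."""
    return sum(v * d * j for v, d, j in clusters)


# Hypothetical translocon: 50 V, 10 D, and 5 J segments (illustrative values).
translocon_total = translocon_combinations(n_v=50, n_d=10, n_j=5)

# Hypothetical cluster locus: 150 clusters, each with one V, two D, and one J.
shark_clusters = [(1, 2, 1)] * 150
cluster_total = cluster_combinations(shark_clusters)

print("translocon combinations:", translocon_total)
print("cluster combinations:   ", cluster_total)
```

Under these assumed numbers the translocon yields more germline combinations from far fewer V segments, which is the basis of the speculation about a limited shark repertoire discussed above.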

Conclusions

The genetic elements necessary for production and diversification of Ig chains were established by the time of appearance of true jawed vertebrates, although they appear to be lacking in extant agnathans. Within the gnathostomes there is striking conservation of key structural residues in light chains, heavy chains, and TCRs. Thus, the functional structures of antibody and TCR proteins were stabilized by selection early in the evolution of vertebrates. The most profound distinctions between the Igs of sharks and mammals relate to the macro-organization of VHDHJHCH or VLJLCL genetic segments into individual clusters as opposed to extensive translocons. We tentatively consider the cluster organization to be the primordial one, but emphasize that cluster and translocon arrangements can occur within the same species (Daggfeldt et al., 1993; Ghaffari and Lobb, 1993), and that other mechanisms such as gene conversion acting upon modified translocons play major roles in chickens (Reynaud et al., 1987, 1989) and rabbits (Becker and Knight, 1990). Although the existence of Ig gene segments is ancient, their configuration continues to evolve, as shown by comparison between distinct vertebrate classes (e.g., sharks, birds, mammals) and even within classes (e.g., rabbits versus mice).

Acknowledgements

This research was supported in part by grants to J.J.M. from the National Science Foundation (MCB9631846) and the National Institutes of Health (GM42437). We thank Ms. Diana Humphreys for valuable assistance in preparation of the manuscript.

References

Andersson,E. and Matsunaga,T. (1995) Evolution of immunoglobulin heavy chain variable region genes: a VH family can last for 150-200 million years or longer. Immunogenetics, 41, 18-28.

Becker,R.S. and Knight,K.L. (1990) Somatic diversification of immunoglobulin heavy chain VDJ genes: evidence for somatic gene conversion in rabbits. Cell, 63, 987-997.

Bernstein,R.M., Schluter,S.F., Lake,D.F. and Marchalonis,J.J. (1994) Evolutionary conservation and molecular cloning of the recombinase activating gene 1. Biochem. Biophys. Res. Commun., 205, 687-692.

Bernstein,R.M., Schluter,S.F., Bernstein,H. and Marchalonis,J.J. (1996a) Primordial emergence of the recombination activating gene 1 (RAG1): sequence of the complete shark gene indicates homology to microbial integrases. Proc. Natl. Acad. Sci. USA, 93, 9454-9459.

Bernstein,R.M., Schluter,S.F., Shen,S. and Marchalonis,J.J. (1996b) A new high molecular weight immunoglobulin class from the carcharhine shark: implications for the properties of the primordial immunoglobulin. Proc. Natl. Acad. Sci. USA, 93, 3289-3293.

Cook,G.P. and Tomlinson,I.M. (1995) The human immunoglobulin VH repertoire. Immunol. Today, 16, 237-242.

Daggfeldt,A., Bengten,E. and Pilstrom,L. (1993) A cluster type organization of the loci of the immunoglobulin light chain in Atlantic cod (Gadus morhua L.) and rainbow trout (Oncorhynchus mykiss Walbaum) indicated by nucleotide sequences of cDNAs and hybridization analysis. Immunogenetics, 38, 199-209.

Edmundson,A.B., Ely,K.R., Abola,E.E., Schiffer,M., Panagiotopoulos,N. and Deutsch,H.F. (1976) Conformational isomerism, rotational allomerism, and divergent evolution in immunoglobulin light chains. Fed. Proc., 35, 2119-2123.

Felsenstein,J. (1996) PHYLIP (Phylogeny Inference Package) version 3.572c. Distributed by the author. Department of Genetics, University of Washington, Seattle.

Feng,D.F. and Doolittle,R.F. (1990) Progressive alignment and phylogenetic tree construction of protein sequences. Methods Enzymol., 183, 375-387.

Ghaffari,S.H. and Lobb,C.J. (1993) Structure and genomic organization of immunoglobulin light chain in the channel catfish. An unusual genomic organizational pattern of segmental genes. J. Immunol., 151, 6900-6912.

Greenberg,A.S., Steiner,L., Kasahara,M. and Flajnik,M.F. (1993) Isolation of a shark immunoglobulin light chain cDNA clone encoding a protein resembling mammalian kappa light chains: implications for the evolution of light chains. Proc. Natl. Acad. Sci. USA, 90, 10603-10607.

Greenberg,A.S., Hughes,A.L., Guo,J., Avila,D., McKinney,M.F. and Flajnik,M.F. (1996) A novel chimeric antibody class in cartilaginous fish: IgM may not be the primordial immunoglobulin. Eur. J. Immunol., 26, 1123-1129.

Greenhalgh,P., Olesen,C.E. and Steiner,L.A. (1993) Characterization and expression of recombination activating genes (RAG-1 and RAG-2) in Xenopus laevis. J. Immunol., 151, 3100-3110.

Hanley,P.J., Hook,J.W., Raftos,D.A., Gooley,A.A., Trent,R. and Raison,R.L. (1992) Hagfish humoral defense protein exhibits structural and functional homology with mammalian complement components. Proc. Natl. Acad. Sci. USA, 89, 7910-7914.

Hill,R.L., Delaney,R., Fellows,R.E. and Lebovitz,H.E. (1966) The evolutionary origins of the immunoglobulins. Proc. Natl. Acad. Sci. USA, 56, 1762-1769.

Hinds,K.R. and Litman,G.W. (1986) Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution. Nature, 320, 546-549.

Hinds-Frey,K.R., Nishikata,H., Litman,R.T. and Litman,G.W. (1993) Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. J. Exp. Med., 178, 815-824.

Hohman,V.S., Schluter,S.F. and Marchalonis,J.J. (1992) Complete sequence of a cDNA clone specifying sandbar shark immunoglobulin light chain: gene organization and implications for the evolution of light chains. Proc. Natl. Acad. Sci. USA, 89, 276-280.

Hohman,V.S., Schuchman,D.B., Schluter,S.F. and Marchalonis,J.J. (1993) Genomic clone for sandbar shark lambda light chain: generation of diversity in the absence of gene rearrangement. Proc. Natl. Acad. Sci. USA, 90, 9882-9886.

Hohman,V.S., Schluter,S.F. and Marchalonis,J.J. (1995) Diversity of Ig light chain clusters in the sandbar shark (Carcharhinus plumbeus). J. Immunol., 155, 3922-3928.

Kabat,E.A., Wu,T.T., Perry,H.M., Gottesman,K.S. and Foeller,C. (1991) Sequences of proteins of immunological interest. Washington, DC, USDHHS, Public Health Service, National Institutes of Health.

Klein,J. (1989) Are invertebrates capable of anticipatory immune responses? Scand. J. Immunol., 29, 499-505.

Kokubu,F., Hinds,K., Litman,R., Shamblott,M.J. and Litman,G.W. (1988a) Complete structure and organization of immunoglobulin heavy chain constant region genes in a phylogenetically primitive vertebrate. EMBO J., 7, 1979-1988.

Kokubu,F., Litman,R., Shamblott,M.J., Hinds,K. and Litman,G.W. (1988b) Diverse organization of immunoglobulin VH gene loci in a primitive vertebrate. EMBO J., 7, 3413-3422.

Lai,E., Wilson,R.K. and Hood,L.E. (1989) Physical maps of the mouse and human immunoglobulin-like loci. Adv. Immunol., 46, 1-59.

Litman,G.W., Finstad,F.J., Howell,J., Pollara,B.W. and Good,R.A. (1970) The evolution of the immune response. 3. Structural studies of the lamprey immunoglobulin. J. Immunol., 105, 1278-1285.

Marchalonis,J.J. and Edelman,G.M. (1968) Phylogenetic origins of antibody structure. 3. Antibodies in the primary immune response of the sea lamprey, Petromyzon marinus. J. Exp. Med., 127, 891-914.

Marchalonis,J.J. and Schluter,S.F. (1989) Evolution of variable and constant domains and joining segments of rearranging immunoglobulins. FASEB J., 3, 2469-2479.

Marchalonis,J.J. and Schluter,S.F. (1990a) On the relevance of invertebrate recognition and defense mechanisms to the emergence of the immune response of vertebrates. Scand. J. Immunol., 32, 13-20.

Marchalonis,J.J. and Schluter,S.F. (1990b) Origins of immunoglobulins and immune recognition molecules. BioScience, 40, 758-768.

Marchalonis,J.J., Vasta,G.R., Warr,G.W. and Barker,W.C. (1984) Probing the boundaries of the extended immunoglobulin family of recognition molecules: jumping domains, convergence and minigenes. Immunol. Today, 5, 133-142.

Marchalonis,J.J., Hohman,V.S. and Schluter,S.F. (1993) Antibodies of sharks; novel methods of generation of diversity. Immunologist, 1, 115-120.

Marchalonis,J.J., Hohman,V.S., Kaymaz,H., Schluter,S.F. and Edmundson,A.B. (1994) Cell surface recognition and the immunoglobulin superfamily. Ann. N.Y. Acad. Sci., 712, 20-33.

Nonaka,M. and Takahashi,M. (1992) Complete complementary DNA sequence of the third component of complement of lamprey. Implication for the evolution of thioester containing proteins. J. Immunol., 148, 3290-3295.

Raison,R.L., Hull,C.J. and Hildemann,W.H. (1978) Characterization of immunoglobulin from the Pacific hagfish, a primitive vertebrate. Proc. Natl. Acad. Sci. USA, 75, 5679-5682.

Rast,J.P. and Litman,G.W. (1994) T-cell receptor gene homologs are present in the most primitive jawed vertebrates. Proc. Natl. Acad. Sci. USA, 91, 9248-9252.

Rast,J.P., Anderson,M.K., Ota,T., Litman,R.T., Margittai,M., Shamblott,M.J. and Litman,G.W. (1994) Immunoglobulin light chain class multiplicity and alternative organizational forms in early vertebrate phylogeny. Immunogenetics, 40, 83-99.

Reynaud,C.A., Anquez,V., Grimal,H. and Weill,J.C. (1987) A hyperconversion mechanism generates the chicken light chain preimmune repertoire. Cell, 48, 379-388.

Reynaud,C.A., Dahan,A., Anquez,V. and Weill,J.C. (1989) Somatic hyperconversion diversifies the single Vh gene of the chicken with a high incidence in the D region. Cell, 59, 171-183.

Saitou,N. and Nei,M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol., 4, 406-425.

Schatz,D.G., Oettinger,M.A. and Baltimore,D. (1989) The V(D)J recombination activating gene, RAG-1. Cell, 59, 1035-1048.

Schatz,D.G., Oettinger,M.A. and Schlissel,M.S. (1992) V(D)J recombination: molecular biology and regulation. Annu. Rev. Immunol., 10, 359-383.

Schluter,S.F., Hohman,V.S., Edmundson,A.B. and Marchalonis,J.J. (1989) Evolution of immunoglobulin light chains: cDNA clones specifying sandbar shark constant regions. Proc. Natl. Acad. Sci. USA, 86, 9961-9965.

Shamblott,M.J. and Litman,G.W. (1989a) Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes. Proc. Natl. Acad. Sci. USA, 86, 4684-4688.

Shamblott,M.J. and Litman,G.W. (1989b) Genomic organization and sequences of immunoglobulin light chain genes in a primitive vertebrate suggest coevolution of immunoglobulin gene organization. EMBO J., 8, 3733-3739.

Shen,S.Y., Bernstein,R.M., Schluter,S.F. and Marchalonis,J.J. (1996) Heavy chain variable regions in carcharhine sharks: development of a comprehensive model for the evolution of VH domains among the gnathostomes. Immunol. Cell Biol. Aust., 74, 357-364.

Singer,S.J. and Doolittle,R.F. (1966) Antibody active sites and immunoglobulin molecules. Science, 153, 13-25.

Smith,L.C. and Davidson,E.H. (1992) The echinoid immune system and the phylogenetic occurrence of immune mechanisms in deuterostomes. Immunol. Today, 13, 356-362.

Sun,S.C., Lindstrom,I., Boman,H.G., Faye,I. and Schmidt,O. (1990) Hemolin: an insect-immune protein belonging to the immunoglobulin superfamily. Science, 250, 1729-1732.

Thompson,C.B. (1995) New insights into V(D)J recombination and its role in the evolution of the immune system. Immunity, 3, 531-539.

Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673-4680.

Vazquez,M., Mizuki,N., Flajnik,M.F., McKinney,E.C. and Kasahara,M. (1992) Nucleotide sequence of a nurse shark immunoglobulin heavy chain cDNA clone. Mol. Immunol., 29, 1157-1158.

Received on June 25, 1996; accepted on July 28, 1996
