Review

Identification and characterization of large galactosyltransferase gene families: galactosyltransferases for all functions 1

Margarida Amado a,b,*, Raquel Almeida a,b, Tilo Schwientek a, Henrik Clausen a

a Faculty of Health Sciences, School of Dentistry, Copenhagen, Denmark
b Institute of Molecular Pathology and Immunology of University of Porto, IPATIMUP, Rua Dr. R. Frias s/n, 4200 Porto, Portugal

Received 15 March 1999; received in revised form 14 September 1999; accepted 14 September 1999

Abstract

Enzymatic glycosylation of proteins and lipids is an abundant and important biological process. A great diversity of oligosaccharide structures and types of glycoconjugates is found in nature, and these are synthesized by a large number of glycosyltransferases. Glycosyltransferases have high donor and acceptor substrate specificities and are in general limited to catalysis of one unique glycosidic linkage. Emerging evidence indicates that formation of many glycosidic linkages is covered by large homologous glycosyltransferase gene families, and that the existence of multiple enzyme isoforms provides a degree of redundancy as well as a higher level of regulation of the glycoforms synthesized. Here, we discuss recent cloning strategies enabling the identification of these large glycosyltransferase gene families and exemplify the implication this has for our understanding of regulation of glycosylation by discussing two galactosyltransferase gene families. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Galactosyltransferase; Glycosyltransferase; Chromosome; Gene; Glycosylation; Enzyme

Contents

1. Introduction
2. Cloning strategies for glycosyltransferases
3. Identification of a large family of human β4-galactosyltransferases
   3.1. Evolution of the β4Gal-T gene family involves gene duplication with subsequent divergence
   3.2. β4Gal-Ts have different functions
   3.3. β4Gal-Ts are differentially expressed
   3.4. β4-galactosyltransferase genes in C. elegans and bacteria
   3.5. Are there additional β4Gal-Ts in man?
4. Identification of a large family of human β3-galactosyltransferases
   4.1. Evidence for an evolutionary relationship involving gene duplication and subsequent divergence
   4.2. β3Gal-Ts have different functions
   4.3. β3Gal-Ts are differentially expressed
   4.4. β3-galactosyltransferase genes in lower organisms
5. Concluding remarks and future perspectives
Acknowledgements
References

0304-4165/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved.
PII: S0304-4165(99)00168-3

* Corresponding author. Institute of Molecular Pathology and Immunology of University of Porto, IPATIMUP, Rua Dr. R. Frias s/n, 4200 Porto, Portugal. Fax: +351-2-557-0700; E-mail: [email protected]

1 This paper is dedicated to Drs. Harry Schachter and Akira Kobata on the occasion of their 65th birthdays. This paper constitutes part of the requirement for a Ph.D. thesis for Margarida Amado.

Biochimica et Biophysica Acta 1473 (1999) 35–53
www.elsevier.com/locate/bba

1. Introduction

The glycobiology field has evolved dramatically in the last decade with the isolation, cloning, and expression of recombinant forms of an increasing number of glycosyltransferases, the enzymes that catalyze the synthesis of complex oligosaccharides and glycoconjugates. Since 1986, when the first mammalian glycosyltransferase, the UDP-Gal: βGlcNAc β1,4-galactosyltransferase (β4Gal-T) or lactose/lactosamine synthase, was cloned [1–4], our knowledge of glycosyltransferases and their genes has grown exponentially. It is apparent that past predictions underestimated the number of glycosyltransferases in higher animals [5]. A major factor contributing to this advancement has been the introduction of new strategies for identifying genes without the traditional isolation of the proteins [6]. In this minireview, we will discuss the most recent strategy utilized in the glycosyltransferase field and address some of the new questions raised by this strategy. A major issue for future study is the demonstration of large homologous glycosyltransferase gene families that contain more genes than previously imagined. The discussion will exemplify the application of this strategy for identification and characterization of two non-homologous galactosyltransferase gene families, the β4- and β3-galactosyltransferases.

2. Cloning strategies for glycosyltransferases

The traditional approach for identification of a glycosyltransferase gene was cumbersome and time consuming, involving isolation of the active protein followed by either partial sequencing or the production of antibodies. Relevant cDNA libraries were screened by hybridization with DNA probes, PCR, or immunoblotting with antibodies [1–3,7–10]. This strategy has had its limitations in that purification of many glycosyltransferase activities poses severe problems with regard to solubilization and stability. One key to success appears to be related to the quantity and activity of the proteolytically cleaved soluble forms of these type II transmembrane proteins. In most cases where purification of a glycosyltransferase to homogeneity was achieved, truncated forms of the enzymes were isolated [3,7,11]. Furthermore, it is apparent that many enzyme activities are a result of multiple gene products. Thus, co-purification of several enzymes may interfere with the cloning strategy. Nevertheless, this initial strategy was, and in many cases still is, crucial for identifying the first member of glycosyltransferase gene families.

A major breakthrough for access to glycosyltransferase genes was the introduction of the transfection cloning strategy. This strategy is based on availability of a cell line deficient in a glycosyltransferase activity, expression of the relevant acceptor substrate (precursor structure) on the cell surface, and a reagent or functional assay for identification of successful complementation of the enzyme deficiency. The strategy was introduced to the glycobiology field by Lowe and coworkers in 1990 for the cloning of the histo-blood group H gene-defined α2-fucosyltransferase [12,13]. It was subsequently used for the isolation of many other glycosyltransferase genes. Stanley and colleagues [14] prepared a lectin-resistant CHO cell line deficient in GnTI activity to clone the GnTI gene. Fukuda and colleagues used co-transfection with leukosialin in CHO cells to identify the mucin-type core 2 synthase, C2GnT, and the selection strategy was based on a combined carbohydrate/peptide epitope defined by antibody T305 [15]. Recently, Sasaki et al. [16] expanded the use of the transfection cloning strategy by demonstrating that it is possible to use recipient cells expressing low levels of the gene in question. The major advantages of the transfection strategy are that the enzyme protein need not be isolated and that, in principle, any novel glycosyltransferase gene can be isolated, provided suitable recipient cells and screening strategies are available. However, these latter provisions may present serious limitations.

With the cloning of the β4Gal-T [1–4], the α2,6-sialyl-T [7], the α1,2Fuc-T [13], the bovine α3Gal-T [10], and the histo-blood group A α1,3GalNAc-T [8], it became clear that the primary sequences of glycosyltransferases were entirely different. Paulson and Colley [5] pointed out that these first cloned enzymes all had an N-terminal hydrophobic sequence, and predicted that all Golgi glycosyltransferases would be type II transmembrane proteins with a common domain structure. The first attempts to identify genes homologous to already identified glycosyltransferases involved low stringency hybridization of cDNA probes to genomic DNA on Southern blots. Although these studies were not successful in identifying homologous β4Gal-T genes, they have been used with success in the identification of novel fucosyltransferases [17–20] and pseudogenes [21,22]. With the isolation of three sialyltransferases it became clear that some sequence similarity was found among these enzymes, and two 'sialyl-motifs' were identified [23–26]. This finding led to another advancement in the cloning of glycosyltransferases as Paulson and colleagues [24] introduced a reverse transcriptase polymerase chain reaction (RT-PCR) strategy in which the conserved regions were used to PCR amplify additional members of homologous gene families. This strategy relies on the design of primers, often degenerate, for amplification of similar genes, and an appropriate mRNA tissue source where the novel gene is expressed. This approach has been extremely successful in identifying novel members of the large sialyltransferase gene family [25]. Several members of the polypeptide GalNAc-transferase family have also been identified using this strategy [27–29], and a modification with a pre-selection step including restriction enzyme digestion of known gene products to enrich for novel gene sequences was developed [27]. The major advantage of the RT-PCR strategy is the ease and speed with which novel gene sequences can be identified, and the major limitation involves access to appropriate mRNA sources. All strategies based on sequence similarity require identification of conserved regions, which is usually based on information from multiple gene sequences or evolutionary conservation.
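The role of degenerate primers in the RT-PCR strategy can be illustrated with a minimal Python sketch that back-translates a short conserved peptide into a degenerate oligonucleotide using IUPAC ambiguity codes. This is not the primer design procedure of the cited studies; the function name `degenerate_primer` and the example peptide are assumptions made for illustration, and the per-position pooling of codons is a simplification.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_primer(peptide):
    """Back-translate a peptide into a degenerate primer (hypothetical helper).

    Returns the IUPAC-coded sequence and its degeneracy, i.e. the number of
    distinct oligonucleotides the ambiguous sequence represents.
    Simplification: bases are pooled per codon position, so for amino acids
    with six codons (L, R, S) the primer also covers some non-target codons.
    """
    primer = []
    degeneracy = 1
    for aa in peptide:
        codons = [codon for codon, residue in CODON_TABLE.items() if residue == aa]
        if not codons:
            raise ValueError(f"unknown amino acid: {aa!r}")
        for pos in range(3):
            bases = frozenset(codon[pos] for codon in codons)
            primer.append(IUPAC[bases])
            degeneracy *= len(bases)
    return "".join(primer), degeneracy


if __name__ == "__main__":
    # Illustrative peptide only; real primers would be chosen from aligned family members.
    sequence, n_variants = degenerate_primer("WGGEN")
    print(sequence, n_variants)
```

Peptides rich in tryptophan or methionine (one codon each) give low degeneracy, which is one reason short, highly conserved motifs containing such residues are attractive primer sites.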

The latest advance in cloning strategies of glycosyltransferases comes from the initial phases of the genome projects. The first phase of the genome project involved establishment of a physical map, which includes identification of transcribed regions and assembly of their relative positions on chromosomes [30,31]. Transcribed regions have been identified by ESTs (expressed sequence tags) derived from mRNA of many different adult and fetal organs, cell lines and cancer tissues. ESTs are 5′- or 3′-sequences derived from oligo-dT primed cDNA clones (often size selected for 1–2 kbp inserts). Typically, single reads of 5′- and/or 3′-sequences are obtained with appropriate vector primers from several thousand cDNA clones from a given library and deposited in databases. Each gene may be represented in multiple cDNA clones of different sizes, thereby generating 5′-ESTs covering different regions of transcripts. Since cDNA clones derived from the same gene in principle should contain the same 3′-sequence (unless multiple polyadenylation signals exist), 3′-ESTs can be used to identify those cDNA clones. EST contigs representing the same gene are clustered in the UniGene database (http://www.ncbi.nlm.nih.gov/UniGene/index.html). Sequence-tagged sites (STSs) are unique sequences derived from 3′-ESTs, and these are used to map the positions of genes. Map positions are determined by PCR using radiation hybrid panels or YAC libraries, with primers designed based on the 3′-ESTs. The UniGene database includes useful information on insert sizes of clones, tissue sources, accession numbers of 5′-EST sequences corresponding to assembled 3′-ESTs, analysis of sequences for polyadenylation signals, potential protein similarities, and chromosomal mapping data. The human EST database now contains more than 2 million sequences and approximately 60 000 contigs have been included in UniGene. It is estimated that the EST database contains sequence information from approximately 50% of human genes. The sequence information accumulated for a given transcript may contain only part of the coding region or none of it, and since it is based on single reads it may contain minor errors, although it is the experience of the authors that the sequence information in general is highly reliable. Nevertheless, searches for sequences with similarity to known glycosyltransferase genes using the BLAST algorithm [32–34] (http://www.ncbi.nlm.nih.gov/BLAST/) have been very fruitful for several different gene families. The strategy, coined 'computer cloning', was used by Tabak and colleagues [35] for identification of novel partial sequences homologous to polypeptide GalNAc-transferases, and has within the last 2 years been applied to the identification and cloning of many β4- and β3Gal-Ts [36–39], β6GlcNAc-Ts [40,41], polypeptide GalNAc-transferases [42], and GlcNAc sulfotransferases [43]. An example of the application of an EST cloning strategy for β4Gal-T3 is discussed in detail in Fig. 1. The major advantages of the strategy are the large amount of cDNA sequence information from many different cell and tissue sources, and the fact that much sequence can be assembled without leaving the office. However, the strategy suffers from an under-representation or lack of representation of sequence information from coding regions of genes with long 3′-UTRs or 3′-UTRs with complex structure inhibiting the RT reaction, and this has been an obstacle for the cloning of several glycosyltransferases in the past. An example is the histo-blood group A/B α3Gal(NAc)-transferase, where cloning was unsuccessful with oligo-dT primed cDNA libraries, but clones were isolated from a random primed library [8]; however, the 3′-UTR sequence has not been characterized. Analysis of the EST databases now shows that the coding region of the α3Gal(NAc)-transferase is not represented.

Fig. 1. The 'computer-cloning' strategy. The strategy utilized for identification and 'computer-cloning' of β4Gal-T3 is illustrated [36]. The EST database was searched by the tBLASTn algorithm using the coding sequence of β4Gal-T1. The 5′-ESTs, D61638 and H26858, were identified by the three sequence motifs later found to be highly conserved among all β4Gal-Ts (motifs and their position relative to the coding region of β4Gal-T1 are indicated by solid lines). The sequences identified from these ESTs were used for a second search using the BLASTn algorithm against the EST database to identify ESTs extending 5′ and/or 3′ of the conserved region. No ESTs overlapping with D61638 in the 5′-end were identified. In contrast, the entire 3′-coding region was mapped by overlapping ESTs with the stop signal placed in R33249. The 5′-region was identified by using the information supplied in the UniGene database. The 5′-EST, R79676, was derived from the cDNA clone 146449, and the 3′-end of this clone was deposited as R79865 in the UniGene cluster, Hs.13476. The insert size of clone 146449 was reported as 0.5 kb, and several other clones, including clone 184081 as illustrated, were listed with insert sizes of 2 kb. The 5′-sequence of clone 184081 (5′-EST H30715) contained nearly the entire 5′-sequence from start to within 20 bases from D61638. The identified EST clone 184081 contained the entire coding region of the gene. EST clones are available through various distributors, and, in the present case, clone 184081 was received within 4 days from Genome Systems [36]. Using cDNA from this clone, expression constructs were prepared and the sequence of the entire coding region confirmed.
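The step in the computer-cloning strategy where overlapping ESTs are used to extend a conserved region toward the 3′-end can be sketched as a simple greedy suffix/prefix merge. The sketch below is a toy illustration only: it assumes error-free reads on the same strand and exact overlaps, whereas the actual searches described above relied on BLASTn against the EST database. The function names and the example sequences are hypothetical.

```python
def suffix_prefix_overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    """
    max_len = min(len(left), len(right))
    for n in range(max_len, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def extend_3prime(seed, reads, min_len=3):
    """Greedily extend `seed` in the 3' direction with overlapping reads.

    Each read is used at most once; extension stops when no remaining read
    overlaps the current 3' end and adds new sequence.
    """
    contig = seed
    remaining = list(reads)
    extended = True
    while extended:
        extended = False
        for read in remaining:
            n = suffix_prefix_overlap(contig, read, min_len)
            if 0 < n < len(read):
                contig += read[n:]        # append only the non-overlapping tail
                remaining.remove(read)
                extended = True
                break                     # restart the scan from the new 3' end
    return contig


if __name__ == "__main__":
    # Made-up sequences standing in for a seed EST and two overlapping ESTs.
    seed_est = "ATGGCCTTA"
    other_ests = ["CTTAGGCAT", "GCATTGA"]
    print(extend_3prime(seed_est, other_ests, min_len=4))
```

Real EST data contain sequencing errors and reverse-complement reads, so similarity searches with tolerant scoring, rather than exact string matching, are needed in practice.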

All the above-discussed cloning strategies have shortcomings. One shortcoming is the requirement for sequence information of homologous genes, since general sequence motifs for all glycosyltransferases or transferases using the same sugar nucleotide (with the exception of the sialyltransferases [23,25]) have not been identified. Searches to date have been based on higher animal sequences, but we may also take advantage of information from lower eukaryotic organisms or bacteria. Hagen and Nehrke [44] have recently identified and characterized a family of polypeptide GalNAc-transferases in Caenorhabditis elegans, based on sequence similarity to rodent and human genes. Chen et al. [45] found three homologous genes of the mannose β1,2GlcNAc-I transferase in the same organism, although only one human homolog has been characterized so far. The O-GlcNAc transferase is highly conserved in all eukaryotic cells [46], and it has been suggested that a mammalian gene homologous to the yeast O-mannosyltransferases exists as well [47,48]. These data suggest that at least for some of the more conserved glycosyltransferase genes one can identify homologs across distant eukaryotic species. This is important as the C. elegans genome has now been fully sequenced. Another exciting possibility is that bacterial glycosyltransferases can be identified by similarity with animal glycosyltransferases. Martin et al. [49] identified and cloned an α3-fucosyltransferase from the genome database of Helicobacter pylori, based on a short sequence motif found in animal α3-fucosyltransferases. Imberty and colleagues [50] made extensive analysis of sequences for several glycosyltransferase gene families and identified common motifs in related bacterial and animal glycosyltransferases. It is therefore possible that the completed bacterial genomes and information on bacterial glycosyltransferases may provide access to animal glycosyltransferase genes for which no relevant sequence information from higher animals is available for similarity searches.

What do primary sequence similarity and sharing of motifs tell us regarding putative enzyme functions? Can we define features common to larger classes of enzymes, and can we predict functions/specificities of novel glycosyltransferase gene products with this information? What is the level of genetic and functional redundancy in glycosyltransferases? These are important questions as we enter the next phase of the human genome project, where one of the challenges will be to identify glycosyltransferase genes among 10^5 genes in man. Analysis of homologous animal glycosyltransferase families has revealed the following: (1) homologous glycosyltransferases share specificity for donor nucleotide, but not necessarily donor sugar moieties; (2) homologous glycosyltransferases mostly share reaction mechanism and either conserve or invert the donor sugar linkage to the acceptor (α/β anomeric configuration); and (3) homologous glycosyltransferases often transfer to the same position of acceptor sugars, but exceptions to this rule include the large sialyltransferase family. When larger gene families are analyzed, it becomes clear that shared sequence identity involves only a few short sequence motifs spread over the putative catalytic domain (central and C-terminal regions), and these short motifs often align without the need to introduce large gaps in the sequences. Furthermore, multiple sequence alignment analyses often reveal conservation of isolated hydrophobic and charged residues between these conserved motifs, and a characteristic feature is conservation of cysteine residues [51,52]. It appears that spacing of common motifs and cysteines may be used to predict the catalytic functions of putative homologous glycosyltransferase genes. Thus, if a homologous member of a gene family encodes a sequence that differs from others in positions of motifs (gaps need to be introduced in multiple alignment analysis), and where one or more conserved cysteine residues are missing, it is likely that this gene encodes a glycosyltransferase with another donor and/or acceptor specificity, but with a related enzyme activity [37]. The functional significance of the conserved sequence motifs has recently been established by the first X-ray crystallography analysis of an animal glycosyltransferase, the β4Gal-T1 [53]. All the conserved sequence motifs among the seven β4Gal-T genes, illustrated in Fig. 2, are found in association with the putative catalytic region. More universal features of glycosyltransferases include the predicted type II transmembrane topology with an N-terminal hydrophobic signal anchor sequence. Furthermore, many glycosyltransferases contain a DxD motif as identified by Wiggins and Munro [54] and Breton et al. [50], and this motif appears to be involved in binding to the metal-ion cofactor and donor substrate [53].
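The idea of comparing the spacing of conserved motifs and cysteine residues between family members can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the helper names are hypothetical, the sequence is invented rather than taken from any glycosyltransferase, and a DxD motif is expressed as the regular expression `D.D`. A real analysis would work from a multiple sequence alignment.

```python
import re


def feature_positions(protein, motif_pattern):
    """Return 1-based start positions of motif matches and of cysteine residues.

    A lookahead is used so that overlapping motif occurrences are all reported.
    """
    motif_starts = [m.start() + 1 for m in re.finditer(f"(?=({motif_pattern}))", protein)]
    cysteines = [i + 1 for i, residue in enumerate(protein) if residue == "C"]
    return motif_starts, cysteines


def spacings(positions):
    """Distances between consecutive positions, e.g. between neighbouring cysteines."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    toy_sequence = "MKCAADVDLLCGGWC"  # invented sequence, for illustration only
    motifs, cysteines = feature_positions(toy_sequence, "D.D")
    print("DxD motif starts:", motifs)
    print("cysteine positions:", cysteines)
    print("cysteine spacings:", spacings(cysteines))
```

Comparing such spacing lists between two putative family members gives a quick, rough indication of whether motifs and cysteines are similarly arranged, which is the kind of evidence the paragraph above suggests may hint at related catalytic function.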

In the following sections, we will discuss the recent application of the computer-cloning strategy for the identification, cloning, and characterization of large β4- and β3-galactosyltransferase gene families.

3. Identification of a large family of human β4-galactosyltransferases

Several groups independently used the emerging EST database information in 1997 to identify a group of human cDNA sequences with similarities to the classical β4Gal-T (designated β4Gal-T1) [36,55–57]. Within 1 year, five novel human β4Gal-T genes designated β4Gal-T2 to -T6 were identified and cloned, and the enzymic functions of their recombinant proteins demonstrated [36,39,58,59]. The two genes, β4Gal-T5 and -T6, were identified by traditional cloning strategies as well as computer cloning [58,59]. Recently, a seventh homologous gene designated β4Gal-T7 was identified by the computer cloning strategy [60,61].

The existence of a family of β4Gal-Ts with similar enzymic functions was surprising, as early low stringency Southern hybridization studies with probes for β4Gal-T1 had not revealed additional genes. However, several lines of evidence had suggested that more than one β4Gal-T activity existed. Furukawa et al. [62] demonstrated tissue-specific differences in the kinetic properties of β4Gal-T activity for donor and agalacto-glycoprotein substrates. Furthermore, analysis of recombinant β4Gal-T1 revealed that this enzyme was distinct from the β4Gal-T activity involved in lactosylceramide synthesis [63]. Conclusive evidence came from gene ablation studies of β4Gal-T1, where residual transferase activity and products were found [64,65]. Finally, Shaper et al. [55,66] identified and characterized the activity of two homologous β4Gal-T genes in chicken.

Fig. 2. Protein domain structure and function of the β4Gal-T gene family: schematic depiction of human β4Gal-Ts. The numbering used for the β4Gal-T genes is according to [39,57] (in [58] β4Gal-T5 was numbered β4Gal-TIV). Cysteine residues are indicated by the letter C, and aligned cysteine residues are connected by stippled lines. Potential N-glycosylation sites are indicated by trees. The putative transmembrane signal anchors are shown by solid boxing with numbers indicating amino acid residues. Glycolipids are designated as follows: Lc3Cer, GlcNAcβ1–3Galβ1–4Glcβ1-ceramide; nLc5Cer, GlcNAcβ1–3Galβ1–4GlcNAcβ1–3Galβ1–4Glcβ1-ceramide. The core 2 substrate is Galβ1–3[GlcNAcβ1–6]GalNAcα1-R. Information from [36,39,57,58,60,61,111].

A summary of the structural features of the seven β4Gal-Ts and their genes is presented in Fig. 2. Multiple sequence alignment (ClustalW) and phylogenetic analysis of the predicted amino acid sequences indicate that the genes cluster into subgroups as follows: β4Gal-T1 and -T2, β4Gal-T3 and -T4, β4Gal-T5 and -T6, and β4Gal-T7 [57]. A total of four cysteine residues are conserved in the putative catalytic domains in the first six β4Gal-Ts, whereas β4Gal-T7 apparently shares none of these. β4Gal-T5 and -T6 exhibit a high number of N-linked consensus glycosylation sites around the stem region, and studies of purified β4Gal-T6 suggest that these are utilized to a large extent [59]. One potential N-glycosylation site is conserved in the C-terminal regions of β4Gal-T2 to -T6. Each subgroup appears to have different lengths of stem regions, with β4Gal-T3 and -T4 having the shortest. β4Gal-T3 is an exception in that it contains an extended C-terminal region. β4Gal-T7 has one non-conservative substitution (G/R) and one conservative (D/E) within the major conserved domain (WGWGREDDE).

3.1. Evolution of the β4Gal-T gene family involves gene duplication with subsequent divergence

Preliminary characterization of the human β4-galactosyltransferase genes clearly confirms the evolutionary relationship indicated by the extensive sequence similarities. The genomic organizations of the first four β4Gal-T genes are nearly identical, with conservation of the position of five introns placed in the coding regions [36,39,67,68]. In addition, β4Gal-T2 and -T3 have an intron in the 5′-UTR, which is not found in β4Gal-T1. Conservation of intron positions together with the sequence similarity strongly supports an evolutionary relationship based on gene duplication. Interestingly, the homologous β4GlcNAc-T of the snail Lymnaea stagnalis also shares five conserved intron/exon boundaries, although two additional exons exist [69]. Characterization of the genomic organizations of β4Gal-T5 and -T6 is ongoing. β4Gal-T5 and -T6 share especially high sequence similarity (70% amino acid sequence identity) and the similarity extends throughout the coding region. β4Gal-T7 contains the sequence motifs conserved within other β4Gal-Ts, but its genomic organization is entirely different from β4Gal-T1 to -T4 [61]. All the β4Gal-T genes are located at different chromosomal loci (Fig. 2).
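A figure such as "70% amino acid sequence identity" is derived from a pairwise alignment. The minimal sketch below shows one common way to compute such a value from two already aligned sequences. The convention used here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption, since the source does not state which definition was applied, and the sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are ignored; this is one of
    several conventions in use and is chosen here only for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented aligned fragments, not real β4Gal-T sequences.
    print(percent_identity("ACD-EF", "ACDKEY"))
```

Because the denominator can also be defined as the alignment length or the length of the shorter sequence, identity values from different studies are only comparable when the convention is known.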

The β4Gal-T gene family shows sequence similarity with UDP-GalNAc: polypeptide GalNAc-transferases in the putative catalytic sites. The major conserved motif (WG(G/R)EDD(D/E)) and surrounding sequence in β4Gal-Ts is similar to the major conserved motif (WGGENxE) of 10 members of the GalNAc-transferase gene family [52]. Hagen et al. [70] pointed this out and showed that mutations in the WGGENxE motif essentially inactivated GalNAc-transferase activity of GalNAc-T1.

3.2. β4Gal-Ts have different functions

A summary of the present knowledge of the enzymic properties of β4-galactosyltransferases is presented in Fig. 2. The conserved motifs found in the β4Gal-T homologous genes suggested that they encoded galactosyltransferases with similar properties [36,55–59]. Several residues within the major motifs were identified as being important for the catalytic activity of β4Gal-T1 [71,72], and a recent X-ray crystallography study of β4Gal-T1 demonstrated that the conserved motifs are all found in a groove believed to constitute the catalytic site containing donor and acceptor binding sites [53]. Importantly, Bakker et al. [73] identified a β4GlcNAc-transferase in L. stagnalis by cross hybridization with bovine β4Gal-T1, and this transferase shared most of the conserved motifs found in β4Gal-Ts. Thus, genes displaying homology to the β4Gal-Ts were expected to include those encoding transferases with different donor substrates and possibly different acceptor substrates as well. As shown in Fig. 2, all seven β4Gal-Ts appeared to have exclusive specificity for the donor substrate UDP-Gal, and all transferred Gal in a β1–4 linkage to similar acceptor sugars: GlcNAc, Glc, and Xyl. It has been proposed that this gene family would include a β4-N-acetylgalactosaminyltransferase forming the GalNAcβ1–4GlcNAc linkage found in glycoproteins and hormones [74], but this gene has not yet been identified.

The reader is referred to the paper by Furukawa etal. [112] for a detailed discussion of the kinetic pa-rameters of the L4Gal-Ts. For the purpose of thisdiscussion, it su¤ces to say that the subgroup genepairs, L4Gal-T1 and -T2, L4Gal-T3 and -T4, andL4Gal-T5 and -T6, appear to exhibit similar kineticproperties and functions as evaluated by in vitro as-says. L4Gal-T1 and -T2 synthesize N-acetyllactos-amine in glycolipids and glycoproteins and show sim-ilar K-lactalbumin sensitive lactose synthase activity[36,55]; however, as discussed below, only L4Gal-T1is expressed in murine lactating mammary glands[57]. L4Gal-T3 and -T4 synthesize N-acetyllactos-amine and are not modulated by K-lactalbumin tosynthesize lactose [36,39]. L4Gal-T4 did not use sev-eral glycoprotein substrates and may function in gly-colipid biosynthesis; however, detailed comparativestudies of the relative activities of all enzymes withglycolipid and glycoprotein substrates have not beenpublished. Fukuda and colleagues [75] have providedevidence for a di¡erential function of these fourL4Gal-Ts in the synthesis of polylactosamine struc-tures on di¡erent glycoconjugates. L4Gal-T4 wasshown to be most e¤cient in galactosylating mucin-type core 2 branching [75], while L4Gal-T1 wasshown to be most e¤cient in galactosylating i/I struc-tures of glycoproteins [76]. The function of L4Gal-T5is not clear [58,77,78]. L4Gal-T6 is a lactosylceramidesynthase important for glycolipid biosynthesis, andexhibit only low activity with ovalbumin [59]. Thelevel and pattern of sequence similarity amongL4Gal-T5 and -T6 suggests that these may serve sim-ilar functions (Fig. 2). L4Gal-T5 show low activity(approximately 15 times lower than L4Gal-T1 [77])with some LGlcNAc saccharides, but no signi¢cantactivity with natural glycoconjugates have been re-ported yet [58,77]. 
One study suggested that β4Gal-T5 has functions in O-glycosylation (cores 2 and 6) [77]; however, the relative activities found with di- and trisaccharide acceptors representing the mucin-type substrates were only slightly increased compared to simple βGlcNAc monosaccharide acceptors at the same acceptor concentration. In this study, the authors also tested Glcβ1-Cer and GlcNAcβ1-3Galβ1-4Glcβ1-Cer (Lc3Cer) glycolipid acceptors using 0.5% Triton X-100 and found no significant activity. However, the assay was done with a secreted and tagged form of the enzyme. Nomura et al. [59] expressed the full coding sequence of β4Gal-T6 in insect cells and demonstrated lactosylceramide synthase activity.

β4Gal-T7 transfers Gal to Xylβ1-MU and shows only very little activity with βGlcNAc-terminating substrates, and is therefore likely to represent the Gal-I transferase activity involved in proteoglycan core biosynthesis [60,61]. Sequence analysis of β4Gal-T7 from a fibroblast cell line of a patient with a progeroid syndrome and signs of the Ehlers-Danlos syndrome, previously shown to exhibit reduced galactosyltransferase-I activity [79], revealed two inherited allelic variants, β4Gal-T7¹⁸⁶ᴰ and β4Gal-T7²⁰⁶ᴾ, each with a single missense substitution in the putative catalytic domain of the enzyme [61]. β4Gal-T7¹⁸⁶ᴰ exhibited a 4-fold elevated Km for the donor substrate, whereas essentially no activity was demonstrated with β4Gal-T7²⁰⁶ᴾ. This confirms that β4Gal-T7 represents at least one proteoglycan Gal-I transferase. Cell fractionation data suggest that this enzyme is located in the cis-Golgi, in contrast to Galβ1-4GlcNAc synthase activities [80].
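The consequence of a 4-fold elevated Km for the donor substrate depends on the donor concentration at which the enzyme operates. The following minimal sketch illustrates this under simple Michaelis-Menten kinetics. It assumes, for illustration only, that Vmax is unchanged in the variant (the text above reports only the Km shift); the function names and all numerical values are hypothetical and are not taken from the cited studies.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    if s < 0 or vmax < 0 or km <= 0:
        raise ValueError("require s >= 0, vmax >= 0 and km > 0")
    return vmax * s / (km + s)


def relative_rate_after_km_shift(s, km, fold):
    """Rate of a variant whose Km is `fold` times higher, relative to the
    reference enzyme at the same substrate concentration.

    Assumption (not from the source): Vmax is identical for both enzymes,
    so the ratio reduces to (Km + [S]) / (fold * Km + [S]).
    """
    if s <= 0:
        raise ValueError("require s > 0 to form a ratio of rates")
    return mm_rate(s, 1.0, km * fold) / mm_rate(s, 1.0, km)


if __name__ == "__main__":
    km = 1.0  # arbitrary concentration units, hypothetical value
    for s in (0.01, 1.0, 100.0):
        ratio = relative_rate_after_km_shift(s, km, fold=4.0)
        print(f"[S] = {s:g} x Km: variant rate / reference rate = {ratio:.3f}")
```

Under these assumptions the variant approaches one quarter of the reference rate when the donor is scarce, reaches 40% of the reference rate when the donor concentration equals the reference Km, and is nearly indistinguishable from the reference enzyme at saturating donor concentrations.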

3.3. β4Gal-Ts are differentially expressed

The regulatory mechanisms of β4Gal-T1 expression have been extensively characterized by Shaper and colleagues [81,82]. A complex model with differential expression of two transcripts with different translational efficiencies ensures that β4Gal-T1 functions both as a ubiquitously expressed enzyme at lower levels and as an enzyme specifically expressed at high levels in lactating mammary glands. This level of information is not available at present for any other member of the β4Gal-T gene family, but Northern analyses indicate that the β4Gal-T genes are differentially expressed. The most extensive analysis was performed by Lo et al. [57] in a comparative analysis of six genes in adult and fetal human organs. β4Gal-T1 is the most widely expressed. Interestingly, adult and fetal brain express very low levels of β4Gal-T1 [57,58], and in one study, it was found that the mRNA levels of β4Gal-T1 did not correlate with the β4Gal-T activity levels measured in fetal and adult mouse brain [83]. β4Gal-T2 exhibits a more restricted expression pattern with strong expression in fetal brain [36,57]. β4Gal-T3 is more widely expressed, similar to β4Gal-T1, but expression levels are more variable [36,57]. Notably, β4Gal-T3 is strongly expressed in fetal brain, but weakly expressed in adult brain. Furthermore, β4Gal-T3 shows high levels in reproductive organs, including testis, ovary, and placenta, as well as in pancreas [36]. β4Gal-T4 exhibits an expression pattern similar to that of β4Gal-T3 at lower levels, but particularly strong expression is found in placenta [39,57]. It is presently not clear how expression patterns of the four β4Gal-Ts involved in N-acetyllactosamine synthesis relate to the control of glycoconjugate synthesis in cells and tissues. More detailed studies of the expression patterns at the cell level by in situ hybridization and/or immunohistological methods are required to address this issue. The importance of the expression pattern is evident from the studies of the β4Gal-T1 knock-outs [64,65], where glycoproteins from serum and some tissue sources were highly under-galactosylated, while glycoproteins from other sources, e.g. salivary glands, were fully galactosylated. Furthermore, knock-out mice produced no lactose in milk, in agreement with the finding that only β4Gal-T1, and not -T2, which also can function as a lactose synthase, is expressed in lactating murine mammary glands [57].

β4Gal-T5 is widely expressed, while β4Gal-T6 exhibits a more restricted expression pattern [57,58]. β4Gal-T6 was purified and cloned from rat brain, and shows the highest expression levels in brain [59]. β4Gal-T6 is apparently not expressed in several organs, including lung and liver, suggesting that another lactosylceramide synthase must be active in these organs, as all cells produce glycolipids. As discussed above, β4Gal-T5 is a candidate gene for such a second lactosylceramide synthase, and its wide expression pattern complements that of β4Gal-T6.

3.4. β4-galactosyltransferase genes in C. elegans and bacteria

Two C. elegans homologs of the β4Gal-T gene family have been identified [84]. In a phylogenetic analysis presented by Lo et al. [57], the gene W02B12.11 (designated C. elegans-2) (GenBank accession number Z66521) clustered with the β4Gal-T5 and -T6 subgroup, while the gene R10E11.4 (designated C. elegans-1) (GenBank accession number Z29095) was entirely independent. If β4Gal-T7 is included in this analysis, the R10E11.4 gene and β4Gal-T7 form a separate cluster [61]. This conclusion is further supported by the finding that two of the four intron positions in W02B12.11 align with the conserved intron/exon boundaries in β4Gal-T1 through -T4. The predicted coding region of W02B12.11 includes the four conserved cysteine residues in β4Gal-T1 to -T6, whereas the coding region of R10E11.4 has only one of the cysteine residues. The coding region of R10E11.4 is organized in six exons, and none of the intron/exon boundaries aligns with those of β4Gal-T1 to -T4 or β4Gal-T7.

The R10E11.4 gene corresponds to the sqv-3 gene found to play a role in vulval invagination in C. elegans [85]. Expression of a secreted construct of R10E11.4/sqv-3 showed that this gene encodes a galactosyltransferase activity with the same substrate specificity as β4Gal-T7, although with a lower temperature optimum (R. Almeida and H. Clausen, unpublished observation). Another gene found to play a role in vulval invagination in C. elegans is sqv-8 [85], which showed the highest sequence similarity to the recently cloned β1,3-glucuronosyltransferase, which adds the fourth residue to the proteoglycan core tetrasaccharide (GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser) [86].

Two β4Gal-T genes have been identified and functionally characterized from Neisseria gonorrhoeae (lgtB and lgtE, GenBank accession number U14554). These function as N-acetyllactosamine and lactose synthases, respectively, in the synthesis of neolactotetraose structures in lipopolysaccharides [87,88]. There are no apparent sequence similarities between these prokaryotic enzymes and the eukaryotic β4Gal-Ts. Breton et al. [50] did, however, find features in common with β3Gal-Ts and the Drosophila gene brainiac (see Section 4) using hydrophobic cluster analysis. Such approaches may in the future aid in the identification of novel glycosyltransferase gene families in animals.

3.5. Are there additional β4Gal-Ts in man?

Database searches performed as of June 1999 do not suggest the existence of additional genes with significant sequence similarities to the β4Gal-T family. However, it is estimated that sequence information is only available from approximately 50% of the human genes. Furthermore, it is important to recognize that only sequence information containing the conserved regions of homologous genes can be identified by computer-cloning. Seko et al. [89] recently identified and partially purified a β4Gal-T activity with restricted specificity for GlcNAc-6-O-sulfate. The activity was clearly separable from βGlcNAc β4Gal-T activities, which indicates that none of β4Gal-T1, -T2, -T3, or -T4 encodes this enzyme activity. It is unlikely that the β4Gal-T6 gene encodes the activity, since it transfers efficiently to Glcβ1-Cer. β4Gal-T5 and -T6 both exhibit low activity with βGlcNAc-terminating saccharides and hence could represent candidate genes [58,59]. The synthesis of poly-N-acetyllactosamine structures with C6 GlcNAc sulfation appears to involve sulfation prior to galactosylation [89]. Identification of the gene encoding this β4Gal-T activity is important for understanding the biosynthesis and regulation of 6-sulfo-sialyl-Leˣ/ᵃ structures, and it is possible that this gene may be another member of the β4Gal-T gene family.

Another novel candidate member of the β4Gal-T family may be a βGlcNAc β4GalNAc-transferase activity found to show α-lactalbumin sensitivity similar to β4Gal-T4 [90]. The β4GalNAc-transferase found in bovine mammary glands is distinct from the hormone-related β4GalNAc-transferase, which shows peptide specificity [91].

It is important to note that a non-homologous mouse gene located at chromosome 5 (GenBank accession number D37791) has been reported to have βGlcNAc β4Gal-T activity [92,93]. The orthologous human gene appears to be located at 4q21 between D4S400 and D4S1534, as mapped with contig stSG31846 (NCBI, UniGene), which is the corresponding homologous region of mouse chromosome 5. Only four human ESTs have been deposited in UniGene for this gene, and these are derived from testis and lung libraries, which is in agreement with the finding that the murine gene is predominantly expressed in testis [92]. The enzymatic properties of this gene have not been fully resolved, but expression in Escherichia coli was reported to yield a protein with low β4Gal-T activity [92]. It is still unclear if this gene represents a true β4Gal-T that is evolutionarily unrelated to the β4Gal-T gene family. As will be discussed later, two non-homologous βGal β3-N-acetylglucosaminyltransferases with apparently similar functions have also been identified [16,94].

Fig. 3. Protein domain structure and function of the β3Gal-T gene family: schematic depiction of β3Gal-Ts. Designations as in Fig. 2. ¹Motifs shared by all genes apart from β3Gal-T9. The chromosomal localizations were defined in UniGene by the following STS clusters: β3Gal-T2, G23904; β3Gal-T3, G07375; β3Gal-T4, G4027; β3Gx-T6, G29729; β3Gx-T7, G15673; β3Gx-T8, G15412. Glycolipids are designated as follows: Gb4 (globoside), GalNAcβ1-3Galα1-4Galβ1-4Glcβ1-ceramide; GM2, GalNAcβ1-4[NeuAcα2-3]Galβ1-4Glcβ1-ceramide; Gg3 (asialo-GM2), GalNAcβ1-4Galβ1-4Glcβ1-ceramide. The core 3 substrate is GlcNAcβ1-3GalNAcα1-R.

4. Identification of a large family of human β3-galactosyltransferases

Several β3Gal-T activities that form Galβ1-3Hex(NAc)α/β linkages exist in animals. These include type 1 chain synthase activity (Galβ1-3GlcNAcβ1-R), mucin-type core 1 synthase activity (Galβ1-3GalNAcα1-R), several glycosphingolipid synthase activities that form GM1, Gal-Gb4, and the histo-blood group A associated Gal-A glycolipids (Galβ1-3GalNAcα/β1-R), and the Gal-II activity involved in forming the core tetrasaccharide structure of proteoglycans (Galβ1-3Galβ1-4Xyl) [95]. Thus, more Galβ1-3 linkages are found than Galβ1-4 structures, and it was therefore expected that a large homologous β3Gal-T gene family would exist as well.

The first β3Gal-T gene cloned, β3Gal-T1, was identified by expression cloning using mRNA from a melanoma cell line, WM266-4, to direct Leᵃ and sialyl-Leᵃ expression in Burkitt lymphoma Namalwa KJM-1 cells (Sasaki et al., Japanese patent JP1994181759-A/1, GenBank accession number E07739). Expression constructs of the full coding region and secreted forms of β3Gal-T1 were shown to yield β3Gal-T activity with simple saccharides [37], and activity with βGlcNAc-terminating lactoseries glycosphingolipids [37]. However, β3Gal-T1 showed no activity with ovalbumin [37] and only poor activity with asialo-agalacto-fetuin [38]. Furthermore, the kinetic properties of β3Gal-T1 were much poorer than those reported for β3Gal-T activity found in epithelial tissues and cell lines [96-98]. These findings further indicated that additional β3Gal-Ts exist.

Application of the EST cloning strategy to search for members of this gene family has now led to the identification and cloning of over 10 homologous genes encoding putative type II transmembrane proteins containing a conserved DxD motif. The first report of an additional homologous β3Gal-T gene was presented by Furukawa and colleagues [99] with the transfection cloning of a rat GM1 synthase. The orthologous human GM1 synthase, designated β3Gal-T4, was also identified and cloned by the EST strategy [37]. Five distinct β3Gal-Ts have been reported [37,38,100-102], and recently one homologous member of the family was shown to represent a β3-N-acetylglucosaminyltransferase with specificity for Galβ1-4GlcNAc [94].

A summary of the structural features of ten homologous genes of the β3Gal-T family is presented in Fig. 3. Based on multiple sequence alignment (ClustalW) and phylogenetic analysis of the amino acid sequences, the genes may be clustered into subgroups as follows: I, β3Gal-T1, -T2, -T3 and -T5; II, β3Gx-T6, -T7, and -T8; III, β3Gal-T4; IV, β3GnT; and V, β3Gx-T9 (the assignment of β3Gx-T6 through -T9 is tentative, as putative enzymic functions of these have not been reported). Conservation and alignment of cysteine residues are in agreement with this subgrouping (Fig. 3). The sequence of β3Gal-T4 includes all conserved motifs, but an extended sequence segment (approximately 25 residues) is introduced between the DxDxF/Y and GxGY/FI/VxS motifs (Fig. 3). One potential N-glycosylation site is conserved in nine proteins in the centrally located sequence motif (NLS/TLK). Only β3Gx-T9 lacks this site and in fact encodes a protein with no potential N-glycosylation sites. The N-glycosylation sites are generally located in the N-terminal region. β3Gal-T2 and β3Gx-T7, with the longest putative stem regions, have the most N-linked glycosylation sites.
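Potential N-glycosylation sites and DxD motifs of the kind summarized above can be located by a simple pattern scan of the amino acid sequence. The following is a minimal sketch, assuming the standard N-X-S/T sequon (X being any residue except proline) and treating x in DxD as any residue. The function name and the short peptide in the usage example are hypothetical and are not taken from any β3Gal-T sequence.

```python
import re

# Lookahead patterns so that overlapping matches are also reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")  # N-X-S/T, X is not proline
DXD = re.compile(r"(?=(D[A-Z]D))")       # DxD, x is any residue


def find_motifs(sequence, pattern):
    """Return (1-based position, matched residues) for every motif hit."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the patterns.
    peptide = "MKNLSLKAADVDGFNPTQ"
    print("Potential N-glycosylation sites:", find_motifs(peptide, SEQUON))
    print("DxD motifs:", find_motifs(peptide, DXD))
```

In this hypothetical peptide the scan reports the sequon NLS at position 3 and the DxD motif DVD at position 10, while NPT at position 15 is skipped because proline follows the asparagine. A scan of this kind identifies only potential sites; whether a site is used in vivo cannot be inferred from sequence alone.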

4.1. Evidence for an evolutionary relationship involving gene duplication and subsequent divergence

The β3Gal-T gene family appears to have evolved, similar to the β4Gal-T gene family, by gene duplication. The first five human β3Gal-T genes characterized have a simple genomic organization with the entire coding region placed in a single exon [37,38]. The genomic organization of three additional genes is identical, but several genes with one or two intronic sequences have also been identified (M.A. Jensen, M. Amado, H. Clausen, unpublished). All genes with available chromosomal mapping data are located at different loci (Fig. 3) [37,38].

4.2. β3Gal-Ts have different functions

A summary of the present knowledge of the enzymatic properties of six members of the β3Gal-T gene family is presented in Fig. 3. Members of the β3Gal-T gene family show diverse enzymatic functions. Enzyme activities using different donor substrates (UDP-Gal and UDP-GlcNAc) and different acceptor sugars (GlcNAc, Gal, and GalNAc) have been characterized, but all form the β1-3 linkage. Furthermore, acceptor substrate specificity studies with natural glycoconjugates (glycoproteins and glycolipids) reveal marked differences in biological functions.

Differences in biological functions are particularly evident from studies of the first subgroup of genes (β3Gal-T1, -T2, -T3, and -T5). This subgroup is unique in that more than one member has been characterized, and all members appear to have a function in the synthesis of type 1 chain structures (Galβ1-3GlcNAcβ1-R). The conserved features of this subgroup suggested that they were enzymes with similar functions. In fact, all four enzymes transfer Gal to the C3 position of βGlcNAc-terminating structures [37,38,100-103]. The kinetic constants for UDP-Gal and simple βGlcNAc saccharide acceptors are different, but qualitative differences were found with glycoprotein acceptors (hen egg albumin, asialo-agalacto-fetuin, and bovine submaxillary mucin (BSM)). β3Gal-T1 is nearly inactive with agalacto-glycoprotein acceptors, and β3Gal-T2 is active with N-linked glycoproteins but inactive with BSM [38]. Expression of the human β3Gal-T3 gene has not resulted in demonstrable enzyme activity [37], but the mouse ortholog has been shown to have very poor activity with βGlcNAc-p-Nph [101]. β3Gal-T5, in contrast, was active with BSM, but inactive with N-linked glycoproteins [38]. β3Gal-T5 is highly active with the mucin-type core 3 disaccharide (GlcNAcβ1-3GalNAcα1-p-Nph) [38,103], whereas both β3Gal-T1 and -T2 are inactive with this disaccharide [38]. BSM contains approximately 10% unsubstituted core 3 structures [104]. These data suggest that β3Gal-T2 functions in glycosylation of N-linked glycoproteins, while β3Gal-T5 functions in mucin glycosylation. Interestingly, none of these β3Gal-Ts is active with the mucin-type core 2 trisaccharide (GlcNAcβ1-6(Galβ1-3)GalNAcα1-p-Nph) [38], and this is in agreement with structural studies of mucin oligosaccharides, where core 2 is always elongated by type 2 chain N-acetyllactosamine.

Additional diversity in functions of members of the first subgroup of β3Gal-Ts is found with glycolipid substrates. In the lactoseries pathway (Lc3Cer and nLc5Cer substrates), β3Gal-T1 and -T5 have a two- to three-fold preference for the shorter glycolipid substrate, while β3Gal-T2 shows no preference. Similar differences in relative activities are found among the β4Gal-Ts (see above). β3Gal-T1 and -T2 show approximately 50% relative activity with glucosylceramide compared to Lc3Cer, while β3Gal-T5 shows less than 1% relative activity with this substrate [37,38]. In man, glycosphingolipid biosynthesis is based on Galβ1-4Glcβ1-Cer. Although a direct comparison of purified and quantified enzyme proteins has not been performed, β3Gal-T5 appears to have superior catalytic efficiency with glycolipids, and substrates are easily and quantitatively converted in preparative reactions, which was not the case with β3Gal-T1 and -T2 [37]. Further differences in the properties of these enzymes are found with other glycolipid substrates, although these reactions may represent in vitro artifacts. β3Gal-T1 and -T2 show some activity with the nLc4Cer substrate and appear to transfer Gal to Gal; however, the structure of the products has not been determined. In contrast, β3Gal-T5 has significant activity with globoside (approximately 15% of that with Lc3Cer), forming Galβ1-3Gb4, and also low activity with Gg3 [38]. In summary, the data indicate that formation of Galβ1-3GlcNAcβ1-R linkages is a differentially regulated process, where the control point is the kinetic properties and acceptor substrate specificities of the β3Gal-T isoforms expressed. Thus, the β3Gal-T gene family may provide independent regulation of type 1 chain synthesis on N-linked glycoproteins, O-linked (at least core 3) glycoproteins, and different glycolipid species.

No functions have been reported yet for members of the second and fifth subgroups of genes (β3Gx-T6, -T7, and -T8; β3Gx-T9). The function of β3Gal-T4 (subgroup III) is limited to ganglioseries glycolipid biosynthesis, and the linkage formed is Galβ1-3GalNAcβ1-4Galβ1-R found in GA1, GM1, and GD1 [37,99]. Although the linkage in Galβ1-3Gb4 is very similar, no activity with Gb4 or other non-ganglioseries glycolipids was detected.

Recently, it was shown that the homologous gene designated β3GnT (Fig. 3, subgroup IV) functions as an i-β3GlcNAc-transferase [94]. The finding that one member of this family transfers GlcNAc to Gal in a β1-3 linkage, representing an iGlcNAc-transferase, is puzzling. Fukuda and colleagues [16] originally cloned an iGlcNAc-transferase designated iGnT by a transfection cloning strategy. The two genes are non-homologous and show no apparent sequence similarity, but have very similar functions. They appear to have different kinetic properties [16,94]. The iGnT enzyme has been shown to function in the formation of N-acetyl-polylactosamine structures on mucin-type core 2 in combination with β4Gal-T4 [75], and evidence for its in vivo function as an i synthase is provided by the cloning strategy. The function of the β3GnT gene homologous with β3Gal-Ts has not been extensively studied. It may have a specific function in the formation of a particular glycoconjugate or oligosaccharide structure.

4.3. β3Gal-Ts are differentially expressed

Northern analyses of members of subgroups I, III, and IV of the β3Gal-T gene family have been reported. The type 1 chain synthases of subgroup I are differentially expressed, but surprisingly several of the genes are mainly or exclusively expressed in brain. β3Gal-T1 was originally cloned using cDNA from a melanoma cell expressing type 1 chain structures. Northern analysis of human organs indicates exclusive expression in brain [37,100]; however, Northern analysis of mouse organs indicated ubiquitous low levels in all organs in addition to the high levels in brain [101]. Northern analysis with β3Gal-T2 showed the same expression pattern (restricted to brain), although heart also expressed high levels [37,100,101]. β3Gal-T3 is more widely expressed, and is found in brain, pancreas, kidney, and reproductive organs, but no expression was detected in lung, colon, liver, or stomach, which all contain type 1 chain glycoconjugates [37,101]. β3Gal-T5, with kinetic properties similar to those found for trachea epithelium β3Gal-T activity [105], is expressed in epithelial tissues including pancreas and intestine [102,103]. Isshiki et al. [102] elegantly correlated the expression of β3Gal-T5 with expression of sialyl-Leᵃ antigen in gastrointestinal and pancreatic cancer cell lines. The data suggest that it is not a single β3Gal-T enzyme that controls type 1 chain synthesis in epithelial tissues, and it is likely that additional genes encoding type 1 chain synthases exist.

The GM1 synthase, β3Gal-T4, appears to be ubiquitously expressed, although higher levels were found in rat thymus and spleen [37,99]. Several transcripts are found, and these are probably derived from differential usage of polyadenylation signals. The β3GnT gene is ubiquitously expressed with multiple transcripts [94]. The same pattern is found for the non-homologous iGnT [16], indicating that the two genes can be expressed simultaneously. This supports the hypothesis that these have different functions, as suggested above.

4.4. β3-galactosyltransferase genes in lower organisms

Several C. elegans genes with weak similarity to β3Gal-Ts exist, but the low degree of similarity precludes grouping according to the human β3Gal-T subgroups and functions (not shown). Yuan et al. [106] demonstrated that the Drosophila gene designated brainiac was homologous to several members of the human β3Gal-T gene family. Based on sequence similarities with bacterial glycosyltransferases, it was suggested that brainiac encoded a secreted glycosyltransferase, but so far no reports have demonstrated enzymatic activity for brainiac.

5. Concluding remarks and future perspectives

The discovery of the β4- and β3Gal-T gene families is a direct consequence of the human genome project [30]. The computer-cloning strategy has so far only been applied to a few glycosyltransferase gene families, but the results as discussed here with the Gal-T gene families clearly illustrate the power and speed with which homologous genes can be identified and cloned. This strategy will undoubtedly identify other large gene families in the near future. Deciphering the functions of such novel genes identified by primary sequence similarities represents a major task for the future. It is clear that it is possible, to some extent, to predict the enzymatic function of the products of homologous genes. However, it is also likely that genes with functions unrelated to glycosyltransferases may be included in the search results, and, as discussed, functions have not been determined for several apparently homologous members of the β3Gal-T gene family. Disregarding this problem, it is still quite complicated to develop in vitro and/or in vivo glycosyltransferase assays for different members of a homologous gene family, as the kinetic properties vary markedly.

Our understanding of synthesis and genetic regulation of some of the most common linkages in glycoconjugates has increased significantly as a result of the genome project. The most spectacular finding may be the large number of enzymes that, despite performing seemingly similar reactions with simple saccharide acceptors, appear to have different functions in the biosynthesis of different oligosaccharide structures and types of glycoconjugates. Combined with the marked differences in tissue expression patterns, it is suggested that these isoenzymes do not provide full functional redundancy. The existence of multiple enzymes covering the synthesis of Galβ1-3/4GlcNAc linkages was largely unexpected from past analysis of β3- and β4Gal-T activities in cells and tissues. The synthesis of linear and branched poly-N-acetyllactosamine structures, different mucin-type core structures, and different glycolipid and glycoprotein types is emerging as a far more complex and regulated process than previously recognized. It appears that the repertoire of galactosyltransferases in a cell can regulate the synthesis of type 1 or type 2 chain structures independently on different glycoconjugates, and for different branch-arms of a branched mucin-type structure. We envision that future in vitro and especially in vivo studies will extend these initial findings, and demonstrate the full complexity of this highly differentiated biosynthetic pathway. 'A galactosyltransferase for all functions' may not be an exaggeration.

New avenues are available for research into the regulation of important developmental and cancer-associated carbohydrate antigens. The Lewis histo-blood group associated antigens, sialosyl-Leᵃ and sialosyl-Leˣ, are carried on type 1 or type 2 chain poly-N-acetyllactosamine structures [107,108]. Given the new knowledge of the regulation of these core chains, it is conceivable that changes in the expression of β3- and β4Gal-Ts play an important role in the expression of cancer antigens. A well documented example is the oncofetal expression of type 2 chain structures in colon. Normal colon expresses type 1 chain structures, while fetal colon and colonic tumors express both type 1 and type 2 chain structures. Holmes et al. [109,110] originally demonstrated that the lack of biosynthesis of type 2 chain glycolipids in normal colonic mucosa was a result of relatively low concentrations of acceptor substrates for β3- and β4Gal-T activities. The Km of β4Gal-T activity in normal colon for the Lc3Cer glycolipid acceptor was higher than that of β3Gal-T activity. Incubation of frozen sections of normal human colon with Lc3Cer glycolipid revealed the expression of β4Gal-T activity in spite of the fact that colonic epithelium normally synthesizes only type 1 chain glycolipids. These results suggested that the basis for oncofetal expression of type 2 chain structures was increased β3GlcNAc-transferase activity, and, in fact, such increased activity was demonstrated [110]. However, the present information suggests that it is possible that specific changes in expression of isoforms of β3Gal-Ts, β4Gal-Ts, and/or β3GlcNAc-transferases contribute to this effect.
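The kinetic argument above, that an enzyme with a higher Km for a shared acceptor contributes little product when the acceptor is scarce but gains ground as the acceptor concentration rises, can be sketched numerically. The example below is only an illustration under stated assumptions: both activities are treated as independent Michaelis-Menten enzymes acting on the same acceptor pool, depletion of the acceptor is ignored, and all Vmax, Km, and concentration values are hypothetical rather than measured values for the colonic β3Gal-T and β4Gal-T activities.

```python
def initial_rate(s, vmax, km):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    if s < 0 or vmax < 0 or km <= 0:
        raise ValueError("require s >= 0, vmax >= 0 and km > 0")
    return vmax * s / (km + s)


def high_km_share(s, vmax_low, km_low, vmax_high, km_high):
    """Fraction of total initial product formed by the high-Km enzyme.

    Assumption (illustrative only): the two enzymes act independently on
    one acceptor pool and neither depletes it appreciably.
    """
    if s <= 0:
        raise ValueError("require s > 0")
    v_low = initial_rate(s, vmax_low, km_low)
    v_high = initial_rate(s, vmax_high, km_high)
    return v_high / (v_low + v_high)


if __name__ == "__main__":
    # Hypothetical parameters: equal Vmax, 10-fold difference in Km.
    params = dict(vmax_low=1.0, km_low=1.0, vmax_high=1.0, km_high=10.0)
    for s in (0.1, 1.0, 10.0, 100.0):
        share = high_km_share(s, **params)
        print(f"[acceptor] = {s:g}: high-Km enzyme forms {share:.1%} of product")
```

With these hypothetical values the high-Km enzyme forms roughly a tenth of the product at the lowest acceptor concentration and approaches half at the highest, which is the qualitative behavior expected if increased β3GlcNAc-transferase activity raises the supply of Lc3Cer.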

As the human genome project turns to the final stages (the entire genomic sequence is expected to be available in 2004), it is important to develop better strategies to identify and predict glycosyltransferase genes and their functions. The strategy discussed here considers primary sequence similarity, but other features common to glycosyltransferases may potentially be used to broaden the search in the future.

Acknowledgements

The authors are grateful to Drs. Steven B. Levery, E.H. Holmes, and M.A. Hollingsworth for important contributions to the work described and their critical review of the manuscript. We are indebted to Drs. Louis Gastinel and Yves Bourne for their sharing of data on the structure of β4Gal-T1. Work performed in the authors' laboratory was supported by the Danish Cancer Society, the Mizutani Foundation for Glycoscience, the Ingeborg Roikjer Foundation, the Velux Foundation, the Danish Medical Research Council, the Danish Natural Science Research Council, the Novo Nordisk Foundation, NIH 1 RO1 CA66234, and funds from the EU Biotech 4th Framework.

References

[1] N.L. Shaper, J.H. Shaper, J.L. Meuth, J.L. Fox, H. Chang,I.R. Kirsch, G.F. Hollis, Bovine galactosyltransferase: iden-ti¢cation of a clone by direct immunological screening of acDNA expression library, Proc. Natl. Acad. Sci. USA 83(1986) 1573^1577.

[2] H. Narimatsu, S. Sinha, K. Brew, H. Okayama, P.K. Qasba,Cloning and sequencing of cDNA of bovine N-acetylglucos-amine (L1^4)galactosyltransferase, Proc. Natl. Acad. Sci.USA 83 (1986) 4720^4724.

[3] G. D'Agostaro, B. Bendiak, M. Tropak, Cloning of cDNAencoding the membrane-bound form of bovine L1,4-galacto-syltransferase, Eur. J. Biochem. 183 (1989) 211^217.

[4] N.L. Shaper, G.F. Hollis, J.G. Douglas, I.R. Kirsch, J.H.Shaper, Characterization of the full length cDNA for murineL-1,4- galactosyltransferase. Novel features at the 5P-end pre-dict two translational start sites at two in-frame AUGs,J. Biol. Chem. 263 (1988) 10420^10428.

[5] J.C. Paulson, K.J. Colley, Glycosyltransferases. Structure,localization, and control of cell type-speci¢c glycosylation,J. Biol. Chem. 264 (1989) 17615^17618.

[6] M. Fukuda, M.F. Bierhuizen, J. Nakayama, Expressioncloning of glycosyltransferases, Glycobiology 6 (1996) 683^689.

[7] J. Weinstein, E.U. Lee, K. McEntee, P.H. Lai, J.C. Paulson,Primary structure of L-galactoside K2,6- sialyltransferase.Conversion of membrane-bound enzyme to soluble formsby cleavage of the NH2-terminal signal anchor, J. Biol.Chem. 262 (1987) 17735^17743.

[8] F. Yamamoto, J. Marken, T. Tsuji, T. White, H. Clausen, S.Hakomori, Cloning and characterization of DNA comple-mentary to human UDP-GalNAc:Fuc K1-2Gal K1-3Gal-NAc transferase (histo-blood group A transferase) mRNA,J. Biol. Chem. 265 (1990) 1146^1151.

[9] M. Sarkar, E. Hull, Y. Nishikawa, R.J. Simpson, R.L. Mo-ritz, R. Dunn, H. Schachter, Molecular cloning and expres-sion of cDNA encoding the enzyme that controls conversionof high-mannose to hybrid and complex N-glycans:UDP-N-acetylglucosamine:K-3-D-mannoside L-1,2-N-acetylglucos-aminyltransferase I, Proc. Natl. Acad. Sci. USA 88 (1991)234^238.

[10] D.H. Joziasse, J.H. Shaper, D.H. Van den Eijnden, A.J. VanTunen, N.L. Shaper, Bovine K1-3-galactosyltransferase: iso-lation and characterization of a cDNA clone. Identi¢cationof homologous sequences in human genomic DNA, J. Biol.Chem. 264 (1989) 14290^14297.

[11] H. Clausen, T. White, K. Takio, K. Titani, M. Stroud, E.Holmes, J. Karkov, L. Thim, S. Hakomori, Isolation tohomogeneity and partial characterization of a histo-bloodgroup A de¢ned FucK1-2Gal K1-3-N-acetylgalactosaminyl-transferase from human lung tissue, J. Biol. Chem. 265(1990) 1139^1145.

[12] L.K. Ernst, V.P. Rajan, R.D. Larsen, M.M. Ru¡, J.B.Lowe, Stable expression of blood group H determinantsand GDP-L-fucose:L-D-galactoside 2-K-L-fucosyltransferasein mouse cells after transfection with human DNA, J. Biol.Chem. 264 (1989) 3436^3447.

[13] V.P. Rajan, R.D. Larsen, S. Ajmera, L.K. Ernst, J.B. Lowe,A cloned human DNA restriction fragment determines ex-pression of a GDP-L-fucose:L-D-galactoside 2-K-L-fucosyl-transferase in transfected cells. Evidence for isolation andtransfer of the human H blood group locus, J. Biol. Chem.264 (1989) 11158^11167.

[14] R. Kumar, J. Yang, R.D. Larsen, P. Stanley, Cloning andexpression of N-acetylglucosaminyltransferase I, the medialGolgi transferase that initiates complex N-linked carbohy-drate formation, Proc. Natl. Acad. Sci. USA 87 (1990)9948^9952.

[15] M.F. Bierhuizen, K. Maemura, M. Fukuda, Expression of a differentiation antigen and poly-N-acetyllactosaminyl O-glycans directed by a cloned core 2 β-1,6-N-acetylglucosaminyltransferase, J. Biol. Chem. 269 (1994) 4473–4479.

[16] K. Sasaki, K. Kurata-Miura, M. Ujita, K. Angata, S. Nakagawa, S. Sekine, T. Nishi, M. Fukuda, Expression cloning of cDNA encoding a human β-1,3-N-acetylglucosaminyltransferase that is essential for poly-N-acetyllactosamine synthesis, Proc. Natl. Acad. Sci. USA 94 (1997) 14294–14299.

[17] B.W. Weston, R.P. Nair, R.D. Larsen, J.B. Lowe, Isolation of a novel human α(1,3)fucosyltransferase gene and molecular comparison to the human Lewis blood group α(1,3/1,4)fucosyltransferase gene. Syntenic, homologous, nonallelic genes encoding enzymes with distinct acceptor substrate specificities, J. Biol. Chem. 267 (1992) 4152–4160.

[18] J.B. Lowe, J.F. Kukowska-Latallo, R.P. Nair, R.D. Larsen, R.M. Marks, B.A. Macher, R.J. Kelly, L.K. Ernst, Molecular cloning of a human fucosyltransferase gene that determines expression of the Lewis x and VIM-2 epitopes but not ELAM-1-dependent cell adhesion, J. Biol. Chem. 266 (1991) 17467–17477.

[19] R.J. Kelly, L.K. Ernst, R.D. Larsen, J.G. Bryant, J.S. Robinson, J.B. Lowe, Molecular basis for H blood group deficiency in Bombay (Oh) and para-Bombay individuals, Proc. Natl. Acad. Sci. USA 91 (1994) 5843–5847.

[20] S. Rouquier, J.B. Lowe, R.J. Kelly, A.L. Fertitta, G.G. Lennon, D. Giorgi, Molecular cloning of a human genomic region containing the H blood group α(1,2)fucosyltransferase gene and two H locus-related DNA restriction fragments. Isolation of a candidate for the human Secretor blood group locus, J. Biol. Chem. 270 (1995) 4632–4639.

[21] R.J. Kelly, S. Rouquier, D. Giorgi, G.G. Lennon, J.B. Lowe, Sequence and expression of a candidate for the human Secretor blood group α(1,2)fucosyltransferase gene (FUT2). Homozygosity for an enzyme-inactivating nonsense mutation commonly correlates with the non-secretor phenotype, J. Biol. Chem. 270 (1995) 4640–4649.

[22] J.A. Meurer, R.F. Drong, F.L. Homa, J.L. Slightom, A.P. Elhammer, Organization of a human UDP-GalNAc:polypeptide, N-acetylgalactosaminyltransferase gene and a related processed pseudogene, Glycobiology 6 (1996) 231–241.

[23] A.K. Datta, J.C. Paulson, The sialyltransferase 'sialylmotif' participates in binding the donor substrate CMP-NeuAc, J. Biol. Chem. 270 (1995) 1497–1500.

[24] B.D. Livingston, J.C. Paulson, Polymerase chain reaction cloning of a developmentally regulated member of the sialyltransferase gene family, J. Biol. Chem. 268 (1993) 11504–11507.

[25] S. Tsuji, A.K. Datta, J.C. Paulson, Systematic nomenclature for sialyltransferases, Glycobiology 6 (1996) v–vii.

[26] D.X. Wen, B.D. Livingston, K.F. Medzihradszky, S. Kelm, A.L. Burlingame, J.C. Paulson, Primary structure of Galβ1,3(4)GlcNAc α2,3-sialyltransferase determined by mass spectrometry sequence analysis and molecular cloning. Evidence for a protein motif in the sialyltransferase gene family, J. Biol. Chem. 267 (1992) 21011–21019.

[27] E.P. Bennett, H. Hassan, H. Clausen, cDNA cloning and expression of a novel human UDP-N-acetyl-α-D-galactosamine. Polypeptide N-acetylgalactosaminyltransferase, GalNAc-T3, J. Biol. Chem. 271 (1996) 17006–17012.

[28] E.P. Bennett, H. Hassan, U. Mandel, E. Mirgorodskaya, P. Roepstorff, J. Burchell, J. Taylor-Papadimitriou, M.A. Hollingsworth, G. Merkx, A. Geurts van Kessel, H. Eiberg, R. Steffensen, H. Clausen, Cloning of a human UDP-N-acetyl-α-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase that complements other GalNAc-transferases in complete O-glycosylation of the MUC1 tandem repeat, J. Biol. Chem. 273 (1998) 30472–30481.

[29] T.K.G. Hagen, F.K. Hagen, M.M. Balys, T.M. Beres, B. Van Wuyckhuyse, L.A. Tabak, Cloning and expression of a novel, tissue specifically expressed member of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase family, J. Biol. Chem. 273 (1998) 27749–27754.

[30] M.D. Adams, A.R. Kerlavage, R.D. Fleischmann, R.A. Fuldner, C.J. Bult, N.H. Lee, E.F. Kirkness, K.G. Weinstock, J.D. Gocayne, O. White, Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence, Nature 377 (1995) 3–174.

[31] J.R. Korenberg, X.N. Chen, M.D. Adams, J.C. Venter, S. Karlin, S.F. Altschul, Toward a cDNA map of the human genome: applications and statistics for multiple high-scoring segments in molecular sequences, Genomics 29 (1995) 364–370.

[32] S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D.J. Lipman, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25 (1997) 3389–3402.

[33] S. Karlin, P. Bucher, V. Brendel, S.F. Altschul, Statistical methods and insights for protein and DNA sequences, Annu. Rev. Biophys. Biophys. Chem. 20 (1991) 175–203.

[34] Z. Zhang, A.A. Schaffer, W. Miller, T.L. Madden, D.J. Lipman, E.V. Koonin, S.F. Altschul, Protein sequence similarity searches using patterns as seeds, Nucleic Acids Res. 26 (1998) 3986–3990.

[35] F.K. Hagen, C.A. Gregoire, L.A. Tabak, Cloning and sequence homology of a rat UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase, Glycoconjugate J. 12 (1995) 901–909.

[36] R. Almeida, M. Amado, L. David, S.B. Levery, E.H. Holmes, G. Merkx, A.G. van Kessel, H. Hassan, E.P. Bennett, H. Clausen, A family of human β4-galactosyltransferases: cloning and expression of two novel UDP-galactose:β-N-acetylglucosamine β1,4-galactosyltransferases, β4Gal-T2 and β4Gal-T3, J. Biol. Chem. 272 (1997) 31979–31992.

[37] M. Amado, R. Almeida, F. Carneiro, S.B. Levery, E.H. Holmes, M. Nomoto, M.A. Hollingsworth, H. Hassan, T. Schwientek, P.A. Nielsen, E.P. Bennett, H. Clausen, A family of human β3-galactosyltransferases: characterisation of four members of a UDP-galactose:β-N-acetylglucosamine/β-N-acetylgalactosamine β1,3-galactosyltransferase family, J. Biol. Chem. 273 (1998) 12770–12778.

[38] M. Amado, E.H. Holmes, S.B. Levery, M. Nomoto, T. Schwientek, M.A. Jensen, M.A. Hollingsworth, F. Carneiro, H. Clausen, Identification of a novel member of the β3Galactosyltransferase gene family that represent an epithelial type of β3Galactosyltransferase, manuscript submitted.

[39] T. Schwientek, R. Almeida, S.B. Levery, E. Holmes, E.P. Bennett, H. Clausen, Cloning of a novel member of the UDP-galactose:β-N-acetylglucosamine β1,4-galactosyltransferase family, β4Gal-T4, involved in glycosphingolipid biosynthesis, J. Biol. Chem. 273 (1998) 29295–29305.

[40] T. Schwientek, M. Nomoto, S.B. Levery, G. Merkx, A.G. van Kessel, E.P. Bennett, M.A. Hollingsworth, H. Clausen, Control of O-glycan branch formation. Molecular cloning of human cDNA encoding a novel β1,6-N-acetylglucosaminyltransferase forming core 2 and core 4, J. Biol. Chem. 274 (1999) 4504–4512.

[41] J.C. Yeh, E. Ong, M. Fukuda, Molecular cloning and expression of a novel β-1,6-N-acetylglucosaminyltransferase that forms core 2, core 4, and I branches, J. Biol. Chem. 274 (1999) 3215–3221.

[42] E.P. Bennett, H. Hassan, U. Mandel, M.A. Hollingsworth, N. Akisawa, Y. Ikematsu, G. Merkx, A.G. van Kessel, S. Olofsson, H. Clausen, Cloning and characterization of a close homologue of human UDP-N-acetyl-α-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase-T3, designated GalNAc-T6. Evidence for genetic but not functional redundancy, J. Biol. Chem. 274 (1999) 25362–25370.

[43] N.W. Schworak, J. Liu, L.M. Petros, L. Zhang, M. Kobayashi, N.G. Copeland, N.A. Jenkins, R.D. Rosenberg, Multiple isoforms of heparan sulfate D-glucosaminyl 3-O-sulfotransferase. Isolation, characterization, and expression of human cDNAs and identification of distinct genomic loci, J. Biol. Chem. 274 (1999) 5170–5184.

[44] F.K. Hagen, K. Nehrke, cDNA cloning and expression of a family of UDP-N-acetyl-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase sequence homologs from Caenorhabditis elegans, J. Biol. Chem. 273 (1998) 8268–8277.

[45] S. Chen, S. Zhou, M. Sarkar, A.M. Spence, H. Schachter, Expression of three Caenorhabditis elegans N-acetylglucosaminyltransferase I genes during development, J. Biol. Chem. 274 (1999) 288–297.

[46] L.K. Kreppel, M.A. Blomberg, G.W. Hart, Dynamic glycosylation of nuclear and cytosolic proteins. Cloning and characterization of a unique O-GlcNAc transferase with multiple tetratricopeptide repeats, J. Biol. Chem. 272 (1997) 9308–9315.

[47] S. Strahl-Bolsinger, M. Gentzsch, W. Tanner, Protein O-mannosylation, Biochim. Biophys. Acta 1426 (1999) 297–307.

[48] A. Chiba, K. Matsumura, H. Yamada, T. Inazu, T. Shimizu, S. Kusunoki, I. Kanazawa, A. Kobata, T. Endo, Structures of sialylated O-linked oligosaccharides of bovine peripheral nerve α-dystroglycan. The role of a novel O-mannosyl-type oligosaccharide in the binding of α-dystroglycan with laminin, J. Biol. Chem. 272 (1997) 2156–2162.

[49] S.L. Martin, M.R. Edbrooke, T.C. Hodgman, D.H. Van den Eijnden, M.I. Bird, Lewis X biosynthesis in Helicobacter pylori. Molecular cloning of an α(1,3)-fucosyltransferase gene, J. Biol. Chem. 272 (1997) 21349–21356.

[50] C. Breton, E. Bettler, D.H. Joziasse, R.A. Geremia, A. Imberty, Sequence–function relationships of prokaryotic and eukaryotic galactosyltransferases, J. Biochem. 123 (1998) 1000–1009.

[51] K. Drickamer, A conserved disulphide bond in sialyltransferases, Glycobiology 3 (1993) 2–3.

[52] H. Clausen, E.P. Bennett, A family of UDP-GalNAc:polypeptide N-acetylgalactosaminyl-transferases control the initiation of mucin-type O-linked glycosylation, Glycobiology 6 (1996) 635–646.

[53] L.N. Gastinel, C. Cambillau, Y. Bourne, Crystal structures of the bovine β4galactosyltransferase catalytic domain and its complex with uridine diphosphogalactose, EMBO J. 18 (1999) 3546–3557.

[54] C.A.R. Wiggins, S. Munro, Activity of the yeast MNN1 α-1,3-mannosyltransferase requires a motif conserved in many other families of glycosyltransferases, Proc. Natl. Acad. Sci. USA 95 (1998) 7945–7950.

[55] N.L. Shaper, J.A. Meurer, D.H. Joziasse, T.-D.D. Chou, E.J. Smith, R.A. Schnaar, J.H. Shaper, The chicken genome contains two functional nonallelic β1,4-galactosyltransferase genes: chromosomal assignment to syntenic regions tracks fate of the two gene lineages in the human genome, J. Biol. Chem. 272 (1997) 31389–31399.

[56] I. Van Die, H. Bakker, D.H. Van den Eijnden, Identification of conserved amino acid motifs in members of the β1-4-galactosyltransferase gene family, Glycobiology 7 (1997) v–ix.

[57] N.-W. Lo, J.H. Shaper, J. Pevsner, N.L. Shaper, The expanding β4-galactosyltransferase gene family: messages from the databanks, Glycobiology 8 (1998) 517–526.

[58] T. Sato, K. Furukawa, H. Bakker, D.H. Van den Eijnden, I. Van Die, Molecular cloning of a human cDNA encoding a novel β-1,4-galactosyltransferase with 37% identity to the mammalian UDP-Gal:GlcNAc β-1,4-galactosyltransferase, Proc. Natl. Acad. Sci. USA 95 (1998) 472–477.

[59] T. Nomura, M. Takizawa, J. Aoki, H. Arai, K. Inoue, E. Wakisaka, N. Yoshizuka, G. Imokawa, N. Dohmae, K. Takio, M. Hattori, N. Matsuo, Purification, cDNA cloning, and expression of the UDP-Gal:glucosylceramide β-1,4-galactosyltransferase from rat brain, J. Biol. Chem. 273 (1998) 13570–13577.

[60] T. Okajima, K. Yoshida, T. Kondo, K. Furukawa, Human homolog of Caenorhabditis elegans sqv-3 gene is galactosyltransferase I involved in the biosynthesis of the glycosaminoglycan-protein linkage region of proteoglycans, J. Biol. Chem. 274 (1999) 22915–22918.

[61] R. Almeida, S.B. Levery, U. Mandel, H. Kresse, T. Schwientek, E.P. Bennett, H. Clausen, Cloning and expression of a proteoglycan UDP-galactose:β-xylose β1,4-galactosyltransferase I. A seventh member of the human β4-galactosyltransferase gene family, J. Biol. Chem. 274 (1999) 26165–26171.

[62] K. Furukawa, K. Matsuta, F. Takeuchi, E. Kosuge, T. Miyamoto, A. Kobata, Kinetic study of a galactosyltransferase in the B cells of patients with rheumatoid arthritis, Int. Immunol. 2 (1990) 105–112.

[63] K. Nakazawa, K. Furukawa, H. Narimatsu, A. Kobata, Kinetic study of human β-1,4-galactosyltransferase expressed in E. coli, J. Biochem. 113 (1993) 747–753.

[64] Q. Lu, P. Hasty, B.D. Shur, Targeted mutation in β1,4-galactosyltransferase leads to pituitary insufficiency and neonatal lethality, Dev. Biol. 181 (1997) 257–267.

[65] M. Asano, K. Furukawa, M. Kido, S. Matsumoto, Y. Umesaki, N. Kochibe, Y. Iwakura, Growth retardation and early death of β-1,4-galactosyltransferase knockout mice with augmented proliferation and abnormal differentiation of epithelial cells, EMBO J. 16 (1997) 1850–1857.

[66] J.H. Shaper, D.H. Joziasse, J.A. Meurer, T.-D.D. Chou, R.A. Schnaar, N.L. Shaper, The chicken genome contains two functional non-allelic β1,4-galactosyltransferase genes, Glycoconjugate J. 12 (1995) 477–477.

[67] G.F. Hollis, J.G. Douglas, N.L. Shaper, J.H. Shaper, J.M. Stafford-Hollis, R.J. Evans, I.R. Kirsch, Genomic structure of murine β1,4-galactosyltransferase, Biochem. Biophys. Res. Commun. 162 (1989) 1069–1075.

[68] L. Mengle-Gaw, M.F. McCoy-Haman, D.C. Tiemeier, Genomic structure and expression of human β-1,4-galactosyltransferase, Biochem. Biophys. Res. Commun. 176 (1991) 1269–1276.

[69] H. Bakker, A. Van Tetering, M. Agterberg, A.B. Smit, D.H. Van den Eijnden, I.R. Van Die, Deletion of two exons from the Lymnaea stagnalis β1,4-N-acetylglucosaminyltransferase gene elevates the kinetic efficiency of the encoded enzyme for both UDP sugar donor and acceptor substrates, J. Biol. Chem. 272 (1997) 18580–18585.

[70] F.K. Hagen, B. Hazes, R. Raffo, D. deSa, L.A. Tabak, Structure–function analysis of the UDP-N-acetyl-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase. Essential residues lie in a predicted active site cleft resembling a lactose repressor fold, J. Biol. Chem. 274 (1999) 6797–6803.

[71] Y. Wang, S.S. Wong, M.N. Fukuda, H. Zu, Z. Liu, Q. Tang, H.E. Appert, Identification of functional cysteine residues in human galactosyltransferase, Biochem. Biophys. Res. Commun. 204 (1994) 701–709.

[72] H. Zu, M.N. Fukuda, S.S. Wong, Y. Wang, Z. Liu, Q. Tang, H.E. Appert, Use of site-directed mutagenesis to identify the galactosyltransferase binding sites for UDP-galactose, Biochem. Biophys. Res. Commun. 206 (1995) 362–369.

[73] H. Bakker, M. Agterberg, A. Van Tetering, C.A. Koeleman, D.H. Van den Eijnden, I. Van Die, A Lymnaea stagnalis gene, with sequence similarity to that of mammalian β1→4-galactosyltransferases, encodes a novel UDP-GlcNAc:GlcNAc β-R β1→4-N-acetylglucosaminyltransferase, J. Biol. Chem. 269 (1994) 30326–30333.

[74] D.H. Van den Eijnden, Novel pathways in complex-type oligosaccharide synthesis: new vistas opened by studies on invertebrates, Biochem. Soc. Trans. 25 (1997) 887–893.

[75] M. Ujita, J. McAuliffe, T. Schwientek, R. Almeida, O. Hindsgaul, H. Clausen, M. Fukuda, Synthesis of poly-N-acetyllactosamine in core 2 branched O-glycans: the requirement of novel β-1,4-galactosyltransferase IV and β-1,3-N-acetylglucosaminyltransferase, J. Biol. Chem. 273 (1998) 34843–34849.

[76] M. Ujita, J. McAuliffe, M. Suzuki, O. Hindsgaul, H. Clausen, M.N. Fukuda, M. Fukuda, Regulation of I-branched poly-N-acetyllactosamine synthesis. Concerted actions by I-extension enzyme, I-branching enzyme, and β1,4-galactosyltransferase I, J. Biol. Chem. 274 (1999) 9296–9304.

[77] I. Van Die, A. Van Tetering, W.E. Schiphorst, T. Sato, K. Furukawa, D.H. Van den Eijnden, The acceptor substrate specificity of human β4-galactosyltransferase V indicates its potential function in O-glycosylation, FEBS Lett. 450 (1999) 52–56.

[78] T. Sato, N. Aoki, T. Matsuda, K. Furukawa, Differential effect of α-lactalbumin on β-1,4-galactosyltransferase IV activities, Biochem. Biophys. Res. Commun. 244 (1998) 637–641.

[79] E. Quentin, A. Gladen, L. Roden, H. Kresse, A genetic defect in the biosynthesis of dermatan sulfate proteoglycan: galactosyltransferase I deficiency in fibroblasts from a patient with a progeroid syndrome, Proc. Natl. Acad. Sci. USA 87 (1990) 1342–1346.

[80] G. Sugumaran, M. Katsman, J.E. Silbert, Effects of brefeldin A on the localization of chondroitin sulfate-synthesizing enzymes. Activities in subfractions of the Golgi from chick embryo epiphyseal cartilage, J. Biol. Chem. 267 (1992) 8802–8806.

[81] B. Rajput, N.L. Shaper, J.H. Shaper, Transcriptional regulation of murine β1,4-galactosyltransferase in somatic cells. Analysis of a gene that serves both a housekeeping and a mammary gland-specific function, J. Biol. Chem. 271 (1996) 5131–5142.

[82] M. Charron, J.H. Shaper, N.L. Shaper, The increased level of β1,4-galactosyltransferase required for lactose biosynthesis is achieved in part by translational control, Proc. Natl. Acad. Sci. USA 95 (1998) 14805–14810.

[83] D. Zhou, C. Chen, S. Jiang, Z. Shen, Z. Chi, J. Gu, Expression of β1,4-galactosyltransferase in the development of mouse brain, Biochim. Biophys. Acta 1425 (1998) 204–208.

[84] R. Wilson, R. Ainscough, K. Anderson et al., 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans, Nature 368 (1994) 32–38.

[85] T. Herman, R.H. Horvitz, Three proteins involved in Caenorhabditis elegans vulval invagination are similar to components of a glycosylation pathway, Proc. Natl. Acad. Sci. USA 96 (1999) 974–979.

[86] H. Kitagawa, Y. Tone, J. Tamura, K.W. Neumann, T. Ogawa, S. Oka, T. Kawasaki, K. Sugahara, Molecular cloning and expression of glucuronyltransferase I involved in the biosynthesis of the glycosaminoglycan-protein linkage region of proteoglycans, J. Biol. Chem. 273 (1998) 6615–6618.

[87] E.C. Gotschlich, Genetic locus for the biosynthesis of the variable portion of Neisseria gonorrhoeae lipooligosaccharide, J. Exp. Med. 180 (1994) 2181–2190.

[88] W. Wakarchuk, A. Martin, M.P. Jennings, E.R. Moxon, J.C. Richards, Functional relationships of the genetic locus encoding the glycosyltransferase enzymes involved in expression of the lacto-N-neotetraose terminal lipopolysaccharide structure in Neisseria meningitidis, J. Biol. Chem. 271 (1996) 19166–19173.

[89] A. Seko, S. Hara-Kuge, S. Yonezawa, K. Nagata, K. Yamashita, Identification and characterization of N-acetylglucosamine-6-O-sulfate-specific β1,4-galactosyltransferase in human colorectal mucosa, FEBS Lett. 440 (1998) 307–310.

[90] I.M. Van den Nieuwenhof, W.M. Schiphorst, I. Van Die, D.H. Van den Eijnden, Bovine mammary gland UDP-GalNAc:GlcNAcβ-R β1→4-N-acetylgalactosaminyltransferase is glycoprotein hormone nonspecific and shows interaction with α-lactalbumin, Glycobiology 9 (1999) 115–123.

[91] B.J. Mengeling, S.M. Manzella, J.U. Baenziger, A cluster of basic amino acids within an α-helix is essential for α-subunit recognition by the glycoprotein hormone N-acetylgalactosamine, Proc. Natl. Acad. Sci. USA 92 (1995) 502–506.

[92] K. Uehara, T. Muramatsu, Molecular cloning and characterization of β-1,4-galactosyltransferase expressed in mouse testis, Eur. J. Biochem. 244 (1997) 706–712.

[93] T. Kaname, K. Uehara, K. Abe, T. Muramatsu, K. Yamamura, Testis β-1,4-galactosyltransferase gene maps to mouse chromosome 5, Genomics 53 (1998) 117–118.

[94] D. Zhou, A. Dinter, R.G. Gallego, J.P. Kamerling, J.F.G. Vliegenthart, E.G. Berger, T. Hennett, A β-1,3-N-acetylglucosaminyltransferase with poly-N-acetyllactosamine synthase activity is structurally related to β-1,3-galactosyltransferases, Proc. Natl. Acad. Sci. USA 96 (1999) 406–411.

[95] L. Kjellen, U. Lindahl, Proteoglycans: structures and interactions, Annu. Rev. Biochem. 60 (1991) 443–475.

[96] B.T. Sheares, J.T. Lau, D.M. Carlson, Biosynthesis of galactosyl-β1,3-N-acetylglucosamine, J. Biol. Chem. 257 (1982) 599–602.

[97] E.H. Holmes, Characterization and membrane organization of β1-3- and β1-4-galactosyltransferases from human colonic adenocarcinoma cell lines Colo 205 and SW403: basis for preferential synthesis of type 1 chain lacto-series carbohydrate structures, Arch. Biochem. Biophys. 270 (1989) 630–646.

[98] A. Valli, S. Gallanti, M. Bozzaro, Trinchera, β-1,3-galactosyltransferase and α-1,2-fucosyltransferase involved in the biosynthesis of type-1-chain carbohydrate antigens in human colon adenocarcinoma cell lines, Eur. J. Biochem. 256 (1998) 494–501.

[99] H. Miyaki, S. Fukumoto, M. Okada, T. Hasegawa, K. Furukawa, Expression cloning of rat cDNA encoding UDP-galactose G(D2) β1,3 galactosyltransferase that determines the expression of G(D1b)/G(M1)/G(A1), J. Biol. Chem. 272 (1997) 24794–24799.

[100] F. Kolbinger, M.B. Streiff, A.G. Katopodis, Cloning of a human UDP-galactose:2-acetamido-2-deoxy-D-glucose 3β-galactosyltransferase catalysing the formation of type 1 chains, J. Biol. Chem. 273 (1998) 433–440.

[101] T. Hennett, A. Dinter, P. Kuhnert, T.S. Mattu, P.M. Rudd, E.G. Berger, Genomic cloning and expression of three murine UDP-galactose:β-N-acetylglucosamine β1,3-galactosyltransferase genes, J. Biol. Chem. 273 (1998) 58–65.

[102] S. Isshiki, A. Togayachi, T. Kudo, S. Nishihara, M. Watanabe, T. Kubota, M. Kitajima, N. Shiraishi, K. Sasaki, T. Andoh, H. Narimatsu, Cloning, expression, and characterization of a novel UDP-galactose:β-N-acetylglucosamine β1,3-galactosyltransferase (β3Gal-T5) responsible for synthesis of type 1 chain in colorectal and pancreatic epithelia and tumor cells derived therefrom, J. Biol. Chem. 274 (1999) 12499–12507.

[103] D. Zhou, E.G. Berger, T. Hennett, Molecular cloning of a human UDP-galactose:GlcNAcβ1,3GalNAc β1,3 galactosyltransferase gene encoding an O-linked core3-elongation enzyme, Eur. J. Biochem. 263 (1999) 571–576.

[104] S. Mårtensson, S.B. Levery, T. Fang, B. Bendiak, Neutral core oligosaccharides of bovine submaxillary mucin. Use of lead tetraacetate in the cold for establishing branch positions, Eur. J. Biochem. 258 (1998) 603–622.

[105] B.T. Sheares, D.M. Carlson, Characterization of UDP-galactose:2-acetamido-2-deoxy-D-glucose 3β-galactosyltransferase from pig trachea, J. Biol. Chem. 258 (1983) 9893–9898.

[106] Y.P. Yuan, J. Schultz, M. Mlodzik, P. Bork, Secreted fringe-like signaling molecules may be glycosyltransferases, Cell 88 (1997) 9–11.

[107] S. Hakomori, Aberrant glycosylation in tumors and tumor-associated carbohydrate antigens, Adv. Cancer Res. 52 (1989) 257–331.

[108] S. Hakomori, Tumor malignancy defined by aberrant glycosylation and sphingo(glyco)lipid metabolism, Cancer Res. 56 (1996) 5309–5318.

[109] E.H. Holmes, G.K. Ostrander, H. Clausen, N. Graem, Oncofetal expression of Lex carbohydrate antigens in human colonic adenocarcinomas. Regulation through type 2 core chain synthesis rather than fucosylation, J. Biol. Chem. 262 (1987) 11331–11338.

[110] E.H. Holmes, S. Hakomori, G.K. Ostrander, A. Singhal, Synthesis of type 1 and 2 lacto series glycolipid antigens in human colonic adenocarcinoma and derived cell lines is due to activation of a normally unexpressed β1-3N-acetylglucosaminyltransferase. Molecular changes in carbohydrate antigens associated with cancer, J. Biol. Chem. 262 (1987) 15649–15658.

[111] K.A. Masri, H.E. Appert, M.N. Fukuda, Identification of the full-length coding sequence for human galactosyltransferase (β-N-acetylglucosaminide:β1,4-galactosyltransferase), Biochem. Biophys. Res. Commun. 157 (1988) 657–663.

[112] K. Furukawa, T. Sato, β-1,4-Galactosylation of N-glycans is a complex process, Biochim. Biophys. Acta 1473 (1999) 54–66.
