Murine serine proteases MASP-1 and MASP-3, components of the lectin pathway activation complex of...

11
Murine serine proteases MASP-1 and MASP-3, components of the lectin pathway activation complex of complement, are encoded by a single structural gene CM Stover 1 , NJ Lynch 1 , MR Dahl 1 , S Hanson 1 , M Takahashi 2 , M Frankenberger 3 , L Ziegler-Heitbrock 1,3 , I Eperon 4 , S Thiel 5 and WJ Schwaeble 1 1 Department of Microbiology and Immunology, University of Leicester, UK; 2 Department of Biochemistry, Fukushima Medical University, Japan; 3 Clinical Cooperation Group on Aerosols in Medicine, GSF-Institute for Inhalationbiology and Asklepios Fachkliniken Gauting, Germany; 4 Department of Biochemistry, University of Leicester, UK; 5 Department of Medical Microbiology and Immunology, University of Aarhus, Denmark Activation of the lectin pathway of complement is initiated by the binding to microbial carbohydrate structures of a multimolecular fluid-phase complex composed of a carbohydrate recognition subcomponent that associates with three specific serine proteases and an enzymatically inert protein of 19 kDa. The first carbohydrate recognition subcomponent of the lectin pathway identified was mannan-binding lectin (MBL), hence the serine proteases were named MBL-associated serine proteases (MASPs) and numbered according to the sequence of their discovery. Here we describe the primary structures of the two distinct serine proteases MASP-1 and MASP-3 in the rat (and of MASP-3 in the mouse), show their association with plasma MBL complexes, and demonstrate that in rat and mouse, as in man, MASP-1 and MASP-3 are encoded by a single structural gene. For both species, we present the genomic region and regulatory elements responsible for the processing of either MASP-1 or MASP-3 mRNA by alternative splicing/alternative polyadenylation. Furthermore, we demonstrate the evolutionary conservation of MASP-3 mRNA in cDNA transcripts from guinea pig, rabbit, pufferfish, and cow. Genes and Immunity (2003) 4, 374–384. doi:10.1038/sj.gene.6363970 Keywords: innate immunity; complement serine proteases; lectin pathway; alternative splicing/polyadenylation Introduction Complement is an essential component of the innate humoral immune defence system. Activation of comple- ment (via the classical, lectin, and alternative pathways) leads to proinflammatory and antimicrobial activities including recruitment of immune cells, facilitation of phagocytosis and cytotoxicity towards microbial organ- isms and targeted cells. C1q, the recognition subunit of the classical activation pathway, is composed of six heterotrimeric subunits. Binding to immune complexes via the globular heads of C1q effects a conformational change of its collagenous stalks which in turn leads to activation of the associated serine protease tetramer, C1r 2 –C1s 2 . The fourth compo- nent of complement, C 4 , is the substrate of activated C1s. For the lectin pathway of complement activation, mannan-binding lectin (MBL), L-ficolin and H-ficolin have been described as distinct recognition molecules with different specificities for polysaccharides. 1–3 They are oligomers of homotrimeric subunits. A C-type lectin domain forms the carbohydrate recognition structure of MBL. Ficolin oligomers bind to carbohydrates via fibrinogen-like domains. As C1q, MBL and ficolins have structural similarities through the presence of a collage- nous stalk region. They are part of a family of pattern recognition molecules defined as defence collagens. MBL, L-ficolin, and H-ficolin have in common that they form complexes with the serine proteases MASP-1, MASP-2, and MASP-3. 2,4–7 Of these, MASP-2 mediates complement activation by cleaving the fourth and second complement components, C4 and C4b bound C2. 8 The involvement of MASP-1 and MASP-3 in complement activation is presently unclear. Recent reports described a direct cleavage of the third comple- ment component C3 by MASP-1 when using MASP-1 preparations from human plasma. 8,9 Such C3 cleavage, however, could not be observed when using full-length MASP-1 which was expressed in the baculovirus insect cell system. 10 Likewise, in a previous study, we observed Correspondence: Dr W Schwaeble, Department of Microbiology and Immunology, University of Leicester, University Road, Leicester LE1 9HN, UK. E-mail: [email protected] Novel sequence data presented herein are available from the EMBL/GenBank/DDBJ databanks under the accession numbers: Rat MASP-1 cDNA (clone prm1): AJ457084; partial rat MASP-3 cDNA (clones 14, 71, clone prm3 0 RACE): AJ487623, AJ487622, AJ487624; mouse MASP-3 cDNA: AB049755; partial guinea pig MASP-3 cDNA: AJ457086; partial rabbit MASP-3 cDNA: AJ457085; partial rat MASP1/3 gene: AY135528-135530; and partial mouse MASP1/3 gene: AY135525-135527. This work was presented in parts at the International Workshop on C1 and Collectins, Seeheim-Jugenheim, Germany, 26–28 October 2001. Genes and Immunity (2003) 4, 374–384 & 2003 Nature Publishing Group All rights reserved 1466-4879/03 $25.00 www.nature.com/gene

Transcript of Murine serine proteases MASP-1 and MASP-3, components of the lectin pathway activation complex of...

Murine serine proteases MASP-1 and MASP-3,components of the lectin pathway activation complex ofcomplement, are encoded by a single structural gene

CM Stover1, NJ Lynch1, MR Dahl1, S Hanson1, M Takahashi2, M Frankenberger3, L Ziegler-Heitbrock1,3,I Eperon4, S Thiel5 and WJ Schwaeble1

1Department of Microbiology and Immunology, University of Leicester, UK; 2Department of Biochemistry, Fukushima MedicalUniversity, Japan; 3Clinical Cooperation Group on Aerosols in Medicine, GSF-Institute for Inhalationbiology and AsklepiosFachkliniken Gauting, Germany; 4Department of Biochemistry, University of Leicester, UK; 5Department of Medical Microbiology andImmunology, University of Aarhus, Denmark

Activation of the lectin pathway of complement is initiated by the binding to microbial carbohydrate structures of amultimolecular fluid-phase complex composed of a carbohydrate recognition subcomponent that associates with three specificserine proteases and an enzymatically inert protein of 19 kDa. The first carbohydrate recognition subcomponent of the lectinpathway identified was mannan-binding lectin (MBL), hence the serine proteases were named MBL-associated serineproteases (MASPs) and numbered according to the sequence of their discovery. Here we describe the primary structures of thetwo distinct serine proteases MASP-1 and MASP-3 in the rat (and of MASP-3 in the mouse), show their association with plasmaMBL complexes, and demonstrate that in rat and mouse, as in man, MASP-1 and MASP-3 are encoded by a single structuralgene. For both species, we present the genomic region and regulatory elements responsible for the processing of eitherMASP-1 or MASP-3 mRNA by alternative splicing/alternative polyadenylation. Furthermore, we demonstrate the evolutionaryconservation of MASP-3 mRNA in cDNA transcripts from guinea pig, rabbit, pufferfish, and cow.Genes and Immunity (2003) 4, 374–384. doi:10.1038/sj.gene.6363970

Keywords: innate immunity; complement serine proteases; lectin pathway; alternative splicing/polyadenylation

Introduction

Complement is an essential component of the innatehumoral immune defence system. Activation of comple-ment (via the classical, lectin, and alternative pathways)leads to proinflammatory and antimicrobial activitiesincluding recruitment of immune cells, facilitation ofphagocytosis and cytotoxicity towards microbial organ-isms and targeted cells.

C1q, the recognition subunit of the classical activationpathway, is composed of six heterotrimeric subunits.Binding to immune complexes via the globular heads ofC1q effects a conformational change of its collagenous

stalks which in turn leads to activation of the associatedserine protease tetramer, C1r2–C1s2. The fourth compo-nent of complement, C4, is the substrate of activated C1s.

For the lectin pathway of complement activation,mannan-binding lectin (MBL), L-ficolin and H-ficolinhave been described as distinct recognition moleculeswith different specificities for polysaccharides.1–3 Theyare oligomers of homotrimeric subunits. A C-type lectindomain forms the carbohydrate recognition structure ofMBL. Ficolin oligomers bind to carbohydrates viafibrinogen-like domains. As C1q, MBL and ficolins havestructural similarities through the presence of a collage-nous stalk region. They are part of a family of patternrecognition molecules defined as defence collagens.

MBL, L-ficolin, and H-ficolin have in common thatthey form complexes with the serine proteases MASP-1,MASP-2, and MASP-3.2,4–7 Of these, MASP-2 mediatescomplement activation by cleaving the fourth andsecond complement components, C4 and C4b boundC2.8 The involvement of MASP-1 and MASP-3 incomplement activation is presently unclear. Recentreports described a direct cleavage of the third comple-ment component C3 by MASP-1 when using MASP-1preparations from human plasma.8,9 Such C3 cleavage,however, could not be observed when using full-lengthMASP-1 which was expressed in the baculovirus insectcell system.10 Likewise, in a previous study, we observed

Correspondence: Dr W Schwaeble, Department of Microbiology andImmunology, University of Leicester, University Road, Leicester LE19HN, UK. E-mail: [email protected] sequence data presented herein are available from theEMBL/GenBank/DDBJ databanks under the accession numbers:Rat MASP-1 cDNA (clone prm1): AJ457084; partial rat MASP-3cDNA (clones 14, 71, clone prm30RACE): AJ487623, AJ487622,AJ487624; mouse MASP-3 cDNA: AB049755; partial guinea pigMASP-3 cDNA: AJ457086; partial rabbit MASP-3 cDNA: AJ457085;partial rat MASP1/3 gene: AY135528-135530; and partial mouseMASP1/3 gene: AY135525-135527.This work was presented in parts at the International Workshop onC1 and Collectins, Seeheim-Jugenheim, Germany, 26–28 October2001.

Genes and Immunity (2003) 4, 374–384& 2003 Nature Publishing Group All rights reserved 1466-4879/03 $25.00

www.nature.com/gene

that the lectin pathway activation complex in H2-Bf/C2-deficient mice (lacking the serine proteases Factor B andC2) does not convert detectable quantities of C3 onactivating mannan-coated surfaces,11 which suggests thatdirect cleavage of C3 by MASP-1 may be insufficient inthe absence of the lectin pathway and alternativepathway C3 convertases, C4b2a and C3bBb. Apart fromthe serine proteases, an additional protein, termedMAp19 or sMAP,2,12,13 is associated with both MBL andficolin activation complexes. Its function, however,remains to be resolved. Less than 10% of total MASP-1,MASP-2, and MAp19 contained in plasma are foundcomplexed to MBL, while the majority may either becomplexed with ficolins7 or are present in the form ofcirculating MASP-2 homodimers and MASP-1/MAp19heterodimers.14 The stoichiometry and composition oflectin pathway complexes appears also to vary with thedegree of polymerisation of its recognition molecules.6

The serine proteases of the activation complexes of theclassical and lectin pathways show identical modularstructure. They are composed of an N-terminal portionconsisting of five structural domains (CUB1, EGF-like,CUB2, CCP1, and CCP2) and a C-terminal portionrepresenting a serine protease domain. Upon conversionof the zymogen to the active form, these two portions aresplit to yield a disulphide-linked A-chain and a smallerB-chain representing the serine protease domain.

The alternative pathway does not possess a recogni-tion subunit and is initiated by direct interaction of C3bwith surface structures.

We have previously shown that in rat, mouse, andhuman, one structural MASP2 gene gives rise, byalternative polyadenylation/splicing, to a 2.6 kb MASP-2 mRNA transcript, encoding the 76 kDa MASP-2 serineprotease, and a more abundant 1.0 kb MAp19 mRNAtranscript, encoding a 19 kDa MASP-2-related plasmaprotein devoid of a serine protease domain.12,13,15

We now present the primary structure of rat MASP-1and MASP-3 mRNA, and show that in rat and in mouse,as in humans, a single structural gene encodes MASP-3and MASP-1 mRNA transcripts, which are generatedfrom the primary transcript by an alternative splicing/polyadenylation process. Both mRNA species are abun-dantly expressed in liver. A comparative analysis ofMASP-3 homologues from different species shows asurprising degree of evolutionary conservation amongmembers of Murinae (mouse and rat), Caviidae (guineapig), Leporidae (rabbit), Bovinae (cow), and Tetraodontidae(pufferfish).

Results

Characterisation of rat MASP-1 and rat MASP-3 mRNA

A rat liver cDNA library was screened by conventionalphage hybridisation to isolate the rat homologue ofhuman MASP-1. As a probe, we used full-length humanMASP-1 cDNA, generated by RT-PCR from human liver,based on sequence deposited in the Genebanks(NM_001879). Four clones were isolated in a rescreeningprocedure using a cDNA subfragment specific for thesequence encoding the MASP-1 serine protease domain.Their cDNA inserts, ranging between 2.5 and 5.2 kb inlength, were characterised further. The longest cDNAtranscript, clone prm1 (5112 bp), had an open reading

frame of 2109 bp, followed by a stop codon (TGA), and a30 untranslated region of 2897 bp containing severalpolyadenylation initiation signals. These may be usedalternatively as the remaining three clones isolated fromthe same library represented differentially polyadeny-lated MASP-1 mRNA transcripts. Their overlappingsequences showed complete identity. Clone prm1 ex-tends the published, partial rat MASP-1 cDNA se-quences, AJ277423, generated by amplifyingoverlapping fragments from a rat liver cDNA libraryusing PCR16 by 107 bp and AF004661, cloned fromhepatic stellate cells using differential mRNA displaytechnology17 by 133 bp, respectively. 50 RACE performedon rat liver cDNA using MASP-1-specific primers didnot extend the 50 sequence of clone prm1. It is possiblethat this sequence contains the transcription initiationsite. In fact, sequence at position 24–30 of the 50

untranslated region in prm1, CCACACC, fits theconsensus for a transcription initiation sequence, C/TC/T A -1 N T/A C/T C/T where N is the predicted startsite.18

Partial rat homologues of human MASP-3 wereisolated by the same procedure using a human MASP-3-B-chain-specific cDNA probe.6 Four clones werecharacterised further and were found to representincompletely spliced transcripts. The retained intronsare: in clone 14 (2.7 kb), the intron preceding the exonencoding the serine protease domain (partial, 1870 bp); inclones 11 (2.7 kb), 13 (2.6 kb), and 71 (3.9 kb), the intronbetween the exons encoding the N- and C-terminal partsof CCP2 (350 bp). The primary structure of MASP-3mRNA transcripts in clones 11 and 13 is fully containedin clone 71. This clone encompasses coding sequence forthe N-terminal part of CUB1, the EGF-like domain, CUB2domain, CCP1 and CCP2 domains, and the serineprotease domain. It has a 30 untranslated region of1503 bp, including a poly(A) tail. The positions of theretained introns correspond to those of orthologousintrons determined for the human MASP1/3 gene.19

Completely processed mRNA encompassing codingsequence for the MASP-3 serine protease domain andthe entire CCP2 and CCP1 domains were obtained by 50

RACE of rat liver cDNA (clone prm30RACE, 464 bp). Thecoding and noncoding sequences in all clones wereidentical in their corresponding overlapping parts.

Sequence comparison of rat MASP-1 (full length) andMASP-3 mRNA (incomplete 50 end) revealed identical 50

sequence coding for the CUB1 domain, EGF-like domain,CUB2 domain, and the two CCP modules (CCP1 andCCP2). The 50 end of rat MASP-3 cDNA clone 71 maps tobase pair position 208 of the rat MASP-1 cDNA cloneprm1 (amino-acid position 37). As described for humanMASP-1 and MASP-3 mRNA,6 rat MASP-1 and MASP-3mRNA differ in the sequence encoding their respectiveserine protease domains including the link regionpreceding the predicted activation site (arg/ile cleavage).The serine protease domains are only 32% identical at theamino-acid level (Figure 1). Table 1 shows a separatecomparison of derived amino-acid sequences for theN-terminal (CUB1 through CCP2) and C-terminal (serineprotease domain including link region) portions ofhuman, mouse, and rat MASP-1 and MASP-3 (Accessionnumbers: human MASP-1, BAA04477; mouse MASP-1,BAA03944; rat MASP-1, AJ457084, this work. HumanMASP-3: AAK84071, mouse MASP-3: this work

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

375

Genes and Immunity

(AB049755, see below), rat MASP-3: this work). Itappears that across the species, the MASP-1-specificserine protease domains are less conserved than theCUB1 through CCP2 modules common to MASP-1- andMASP-3- and the MASP-3-specific serine proteasedomains.

MASP-1 and MASP-3 are constituents of therat MBL complex

A lectin preparation was prepared from rat plasma andanalysed under reducing conditions on Western blotsusing antiserum raised against human MASP-1- andMASP-3-specific B-chain peptides (their sequences arealigned to the rat MASP-1 and MASP-3 serine proteasedomain sequences, respectively, in Figure 1). As seenin Figure 2, MASP-1-specific bands appear at approxi-mately 90 and 25 kDa, while MASP-3-specific bands are

visible at approximately 110 and 48 kDa, each represent-ing unactivated and activated MASP-1 and MASP-3,respectively. The calculated molecular weight is 28 276Da for the MASP-1-specific B-chain and 30 517 Da forthe MASP-3-specific B-chain. There are three potentialN-glycosylation sites contained in the MASP-3 B-chainpeptide sequence that are not present in the MASP-1

Human M3 peptide IIGGRNAEPGLFPWQALIVV |||||||| |||||||||||

RatM3 ser. prot. CGQPSRALPNLVKRIIGGRNAELGLFPWQALIVVEDTSRIPNDKWF-GSGALLSESWILTA 60 RatM1 ser. prot. CGLP-KFSRKHISRIFNGRPAQKGTTPWIAML-----SQLNGQPF-CG-GSLLGSNWVLTA 53 ** * : : :.**:.** *: * ** *:: *:: .: : * *:**...*:***

RatM3 ser. prot. AHV-LRSQRR-DNTVIP----VSKDHVTVYLGLHDVR-DKSGAVNSSAARVVLHPDFNIQN 114 RatM1 ser. prot. AH-CLHHPLDPEEPILHNSHLLSPSDFKIIMGKHWRRRSDEDEQHLHVKHIMLHPLYNPST 113 ** *: ::.:: :* ....: :* * * .... : . :::*** :* ..

RatM3 ser. prot. YNHDIALVQLQEPVPLGAHVMPICLPR-PEPEGPAPHMLGLVAGWGISNPNVTVDEIIIS 173 RatM1 ser. prot. FENDLGLVELSESPRLNDFVMPVCLPEHPSTEG----TMVIVSGWG--------KQFLQR 160 :::*:.**:*.*. *. .***:***. *..** : :*:*** .:::

RatM3 ser. prot. GTRTLSDVLQYVKLPVVSHAECKASYESRSGNYSVTENMFCAGYYEGGKDTCLGDSGGAF 233 RatM1 ser. prot. ----LPENLMEIEIPIVNYHTCQEAYTPLG--KKVTQDMICAGEKEGGKDACAGDSGGPM 215

*.: * :::*:*.: *: :* . . .**::*:*** *****:* *****.:

RatM3 ser. prot. VIFDEMSQRWVAQGLVSWGGPEECGSKQVYGVYTKVSNYVDWLLEEMNSPRGVRELQVER 292 RatM1 ser. prot. VTKDAERDQWYLVGVVSWG--EDCGKKDRYGVYSYIYPNKDWIQRVT----GVRN----- 263 * * ::* *:**** *:**.*: ****: : **: . ***:

|||||||||||| |||||Human M1 peptide GKKDRYGVYSYIHHNKDWI

Figure 1 Alignment of the deduced amino acid sequence of the serine protease domains represented by rat MASP-1 and MASP–3 mRNA,respectively, using CLUSTAL W (1.8) multiple sequence alignment programme (http://www.ebi.ac.uk/clustalw/). "*" highlights identicalresidues in the aligned sequences, ":" denotes conserved substitutions, and "." indicates semi-conserved substitutions (i.e. conservation of oneof the following amino acid groups: CSA, ATV SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY). The active sitetriade is in bold (histidine, aspartic acid, serine); conserved cysteine residues are boxed. The cysteine residues important for formation of thehistidine loop20 in MASP-1 (and chymotrypsin/trypsin21) are underlined and in italic. Potential N-glycosylation sites are underlined. Gapswere introduced to maximise the alignment. The arrow indicates the site of cleavage used in the course of enzymatic activation to generatethe B-chain. The human peptides used for generation of the antisera employed for Figure 2 are aligned.

Table 1 Cross-species comparison of deduced CUB1 throughCCP2 sequences (common to both, MASP-1 and MASP-3) and ofdeduced serine protease domains (including distinct link regions),% identities

Species Rat

MASP-1/-3CUB1-CCP2

MASP-1 serine prot.domain incl. linker

MASP-3 serine prot.domain incl. linker

Mouse (%) 96 92 96Human (%) 87 80 91

Figure 2 Western blot analyses of a lectin preparation from ratplasma and of recombinant rat MASP-3 serine protease domain. Analiquot of a lectin preparation from rat plasma (lanes 1–4) andrecombinant rat MASP-3 serine protease domain (lanes 5 and 6) areassayed using anti-human MASP-1 (lane 2) and anti-humanMASP-3 (lanes 4 and 6) antipeptide antisera in comparison withpreimmune sera (lanes 1, 3 and 5).

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

376

Genes and Immunity

B-chain peptide sequence (Figure 1). This most likelyaccounts for the higher than expected molecular weightof MASP-3-specific bands seen in these analyses.Recombinant rat MASP-3, representing most of theserine protease domain expressed in a prokaryoticsystem (amino acid position 4–292, Figure 1), wasassayed in parallel as a control for specificity of theantiserum and reacts as a band of approximately 35 kDa(Figure 2).

Characterisation of the genomic region transcribed togenerate two gene products from a single structuralMASP1/3 gene in rat and mouseLong-range PCR amplification of rat genomic DNAusing oligonucleotides specific for sequence encodingthe CCP2 domain, the MASP-3-specific serine proteasedomain, and part of the MASP-1-specific serine proteasedomain generated two overlapping PCR products;primer pair SCRII_2 and BCR1 yielded a 6 kb productspanning the 30 end of the exon encoding the C-terminalportion of CCP2 (80 bp), an intron of approximately5.5 kb and 502 bp of the exon for the serine proteasedomain of MASP-3 including the link region. Intronicsequence (1870 bp) contained in rat MASP-3 cDNA clone14 (50 of the sequence coding for the MASP-3 serineprotease domain) is identical to the correspondingsequence contained in the amplification product of6 kb. Primers BCF1 and M1_SP gave a 5.5 kb productcomprising 783 bp from the exon for the serine proteasedomain of MASP-3, the 30 untranslated region of MASP-3(1496 bp), an intron of 3.1 kb and the start of the exon(47 bp) encoding the link region and the first amino acidof the MASP-1 B-chain.

Using mouse genomic DNA, primers SCRII_S1 andBCR1 gave a 6 kb product comprising the 30 end of theexon encoding the C-terminal of CCP2 (55 bp), a 1370 bpintron, the exon for the N-terminal portion of CCP2(78 bp), a 4 kb intron, and the first 500 bp of the MASP-3serine protease domain exon. Primers BCF1 and M1_SPgave a 5.5 kb product comprising 788 bp from the exonfor the serine protease domain of MASP-3, the 30

untranslated region of MASP-3 (1484 bp), an intron of3.1 kb and the start of the exon (91 bp) encoding the linkregion and the first 16 amino acids of the MASP-1B-chain.

As in humans, the rat and mouse MASP-3 serineprotease domains are encoded by a single exon. Thelength of this exon within the genomic amplificationproducts of 5.5 kb each is determined by the sequence ofthe 30 untranslated regions contained in rat MASP-3cDNA clones 11, 13, 14, and 71 and in the full-lengthmouse MASP-3 cDNA clone characterised in this work(AB049755), respectively. An overview of the genomicstructure and the exon–intron boundaries of this regionthat harbours information for two distinct serine pro-teases, MASP-1 and MASP-3, in rat and mouse is givenin Figure 3, taking into account information obtainedfrom hnRNA transcript 14 isolated in this work (seeabove). It appears that the exon/intron structure of thegenomic regions transcribed to generate these twoalternative gene products of the MASP1/3 gene in ratand in mouse are identical to that of the human MASP1/3gene.6 In addition, from sequence analysis of the isolatedcDNA clones, it appears that in rat, as in human andmouse, the sequence for the CCP2 domain is split by one

intron. The size of this intron is 350 bp in the rat gene,1923 bp in the human gene, and 1370 bp in the mousegene. It has recently been proposed that the exon for theMASP-3 serine protease domain (AGY codon for activesite serine and absence of cysteines in the B-chain to forma histidine loop) represents a processed retrotransposonoriginating from the primordial type of split exon for theMASP-1-type serine protease domain (TCN codon foractive site serine and presence of cysteines in the B-chainto form a histidine loop).22 Comparing the cDNA andgenomic sequences presented herein from rat, mouse,and human, we found neither a significant similaritybetween the 30 untranslated regions of MASP-1 andMASP-3 mRNA within one species nor a vestige of apoly(A) tail 30 of the MASP-3 serine protease exon thatwould argue in favour of a direct retrotransposition

Mouse

Splice site Donor Acceptor

A (between CCP2 N’

and CCP2 C’)

AAGATGCTTCACAATACCACA K M L H N T T

G/gtaagctctcc

ctctcccacatgttcactcag/

GTGTATATACGTGTTCTGCT G V Y T C S A

B (between CCP2 C’

and the MASP-3

serine protease domain)

AGCCTGCCCACCTGCCTTCCA S L P T C L P

G/gtacctcacccaccaa

cacttcttccttcttatag/

TATGTGGGCAGCCCTCCCGC V C G Q P S R

B’ (between CCP2 C’ &

1st exon of the MASP-1

serine protease domain)

As above cctccctcccttggctacag/

TGTGTGGTCTCCCCAAGTTC V C G V P K F

Rat

Splice site Donor Acceptor

A (between CCP2 N’

and CCP2 C’)

AAGATGCTTCACAATACCACA K M L H N T T

G/gtaagctctcc

cccttcatatgtctgag/

GTGTATATACATGTTCTGCT G V Y T C S A

B (between CCP2 C’

and the MASP-3

serine protease domain)

AGCCTGCCCACCTGCCTTCCA S L P T C L P

G/gtacctcacc

ccacttcttccttcttatag/

TATGTGGGCAGCCCTCCCGC V C G Q P S R

B’ (between CCP2 C’ &

1st exon of the MASP-1

serine protease domain)

As above cctccctcccttggctacag/

TGTGTGGTCTCCCCAAGTTC V C G L P K F

a

b

Figure 3 Structure of the alternative splicing/polyadenylation unitwithin the rat and mouse MASP-3 gene. (a) Schematic representa-tion of the splicing events which give rise to MASP-1 and MASP-3mRNA (not to scale, see text for exon and intron sizes). Splice sitesare denoted by arrows and the 30 untranslated region of the MASP-3mRNA is shaded. (b) Sequences of the intron/exon boundariesshown in (a).

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

377

Genes and Immunity

event originating from the present MASP-1 serineprotease encoding region interrupted by several introns.

A comparison of the RNA processing signals ofMASP-1 and MASP-3 in all three species studied (humanAC007920, rat AY135528-135530, mouse AY135525-135527) reveals several conserved features. The poly-adenylation site for MASP-3 is followed by a sequencerich in (G+U) stretches, suggesting that it is probably anefficient site. The 30 splice site of MASP-3 is preceded byquite extensive polypyrimidine tracts, comprising 29–32nucleotides with three interrupting purines, whereas thepolypyrimidine preceding the competing 30 splice site ofthe MASP-1 serine protease domain exon comprises only14–18 nucleotides. Similarly, the probable branch site ofMASP-3 (about 40 nucleotides upstream of the 30 splicesite) is pyrimidine–TGAC–pyrimidine, which is a rea-sonably close fit to the optimal mammalian sequence,23,24

whereas the corresponding sequence for MASP-1 serineprotease domain is purine–TGAC–pyrimidine in rat andmouse.

Characterisation of abundance of hepatic expressionof MASP-1 and MASP-3 mRNANorthern blot analysis of liver mRNA from adult ratsrevealed abundant MASP-1-specific mRNA (approxi-mately 5 kb) and MASP-3-specific mRNA (approxi-mately 3.5 kb) (Figure 4a). In order to determine therelative abundance of MASP-1 and MASP-3 mRNAtranscripts in rat liver, a quantitative RT-PCR wasperformed using a LightCycler apparatus which followsthe incorporation of SYBR Green I into PCR products inreal-time. A standard curve was prepared for each of thetwo cDNAs using serial dilutions of a plasmid contain-ing the appropriate cDNA as templates for the PCR(Figure 4b). It was found that adult rat liver contained110 000 (7 10 800) copies of MASP-1 and 75 000 (7 16600) copies of MASP-3 mRNA per microgram total RNA(means7 s.d., n¼ 4) (Figure 4c). Although the absoluteconcentrations of MASP-1 and MASP-3 mRNAs differedsomewhat from sample to sample, the ratio of theconcentrations within an RNA sample was notablyconsistent, being 1.497 0.18 MASP-1 vs MASP-3.

Characterisation of MASP-3 homologues in mouse,guinea pig, rabbit, Tetraodon nigroviridis, and cowMASP-3 homologues were cloned from mouse, guineapig, and rabbit and compared against the rat MASP-3cDNA sequence presented above and the recentlypublished human MASP-3 cDNA sequence.6 A mouse(BALB/c) MASP-3-specific phage library isolate andseparate amplification product were found to overlap,

encompassing sequence for the serine protease domain,and were identical to the corresponding part of the full-length mouse (C57Bl6) MASP-3 transcript isolated in thiswork (Figure 5). Sequence for a MASP-3 homologue in

0

20000

40000

60000

80000

100000

120000

MASP-1 MASP-3

Cop

ies/

µg to

tal R

NA

20

22

24

26

28

30

32

34

36

38

40

2 43 5 6 7Log10 conc. (Copies)

MASP-1

MASP-3

b

c

Cro

ssin

g po

int (

Cyc

le #

)

a

"

Figure 4 MASP-1 and MASP-3 mRNA expression in rat liver(a) Northern blot analysis of rat liver RNA using labeled cDNAprobes specific for mRNA encoding the respective serine proteasedomains of MASP-1 and MASP-3 shows expression of 5 kb MASP-1mRNA (lane 1) and 3.5 kb MASP-3 mRNA (lane 2). (b) Standardcurves were prepared for LightCycler Analysis of MASP-1 andMASP-3 mRNA expression using known concentrations of MASP-1and MASP-3 cDNAs cloned in pBSSK� as templates for the PCR.(c) Quantitative RT PCR analysis of MASP-1 and MASP-3 mRNAexpression in rat liver. Results are expressed as copies permg RNA, and are the means of 4 separate experiments. Error barsrepresent SD.

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

378

Genes and Immunity

the pufferfish is contained in AL187061 (supplied by theTetraodon nigroviridis genome sequencing project). PartialC-terminal coding sequence of a bovine MASP-3homologue was identified in AW658440 using blastcomparison (http://www.ncbi.nlm.nih.gov/BLAST/).Alignment of the amino-acid sequences deduced fromthe cDNAs characterised in this work against the humanMASP-3 protein sequence shows a high degree ofidentity between the species studied (Figure 5). For thearea shown, there is 96% identity between mouse andrat, 87–88% between guinea pig and rat/mouse, 84%between rabbit and rat/mouse, and 85% between guineapig and rabbit. Cysteines important for domain foldingare conserved in all species studied, including cysteineswhich give rise to a cystine bond around a methionineresidue to form a so-called methionine loop (Figure 5).The catalytic triad composed of histidine, aspartic acid,and serine (Ser 195) is conserved. In addition, a serine(Ser 214) relevant to substrate binding25 is conservedin guinea pig, mouse, rat, and rabbit, and a tyrosine(Tyr 225) typically found in Na+-activated allostericenzymes26 is found in rat, mouse, and cow C-terminalsequences. All potential N-glycosylation sites (charac-terised by the following sequence of four amino acids:Asn X(no Pro), Thr/Ser/Cys, no Pro) present in thehuman MASP-3 protein sequence are conserved inthe overlapping sequences of all species studied (seeFigure 5).

Discussion

In this work, we present the cloning of the rat homologueof MASP-1 cDNA and of the rat and mouse homologuesof MASP-3 cDNA and show that there is a high degree ofconservation of their encoded proteins between human,rat, and mouse. Human and rat MASP-1 and MASP-3 are83 and 87% identical, respectively. For comparison,human and rat MASP-2, the complement activatingcomponent of the lectin pathway complex, are 80%identical.15

MASP-1 was first described in mouse and humanplasma as a component of the so-called Ra-reactivefactor, a complex exhibiting the potential to activatecomplement after binding to RA and R2 core poly-saccharides shared by certain strains of Gram negativebacteria.27,28 Since then it has become clear that MBLpurified from human plasma forms complexes withthree different serine proteases, MASP-1, MASP-2, andMASP-3 as well as with another protein, MAp19. Wehave shown that MASP-2 and MAp1915 and MASP-1 andMASP-3 (this work) associate with MBL complexesisolated from rat plasma. In humans, MASP-1 andMASP-3 were shown to be encoded by a single structuralMASP1/3 gene.6 Likewise, we have shown that MASP-2and MAp19 are alternative splice/polyadenylation pro-ducts of a single structural MASP2 gene of approxi-mately 20 kb.12,29 In the rat, Southern blot analysisindicated that MASP-2 and MAp19 mRNA are encodedby a single gene of less than 15 kb.15 The position ofthe alternative splice/polyadenylation exons within theMASP2 and MASP1/3 genes differs; only MASP-2,MASP-1, and MASP-3 possess protease domains.MAp19 represents a truncated form of MASP-2 and iscomposed of the first two N-terminal structural domains,

that is, the CUB1 and the EGF-like domain only. Weshow here that the splicing unit that generates MASP-3and MASP-1 mRNA in the rat and mouse is containedwithin a 13 kb subfragment of the correspondingMASP1/3 genes. The overall sizes of the rat and mousegenes are at least 27 and 32 kb, respectively, as judgedfrom Southern blot analysis (data not shown). Thehuman MASP1/3 gene was shown to span more than50 kb.19 It was localised to human chromosome 3q27–q28,while the mouse gene, termed Crarf (for complement Rareactive Factor), was localised to mouse chromosome 16B2–B3 by FISH analysis using full-length MASP-1 cDNAas a probe30 (containing sequence for the N-terminalportions common to both MASP-1 and MASP-3), and byPCR-RFLP analysis using a set of MASP-1-specificprimers.31 These mapping results also point to thepresence of one single MASP1/3 gene in the mouse. Itthus appears that the genomic structure underlying thegeneration of MASP-1 and MASP-3 mRNA transcriptsis conserved between human, rat, and mouse. Thesespecies make use of an alternative splicing/polyadenyla-tion mechanism to produce two distinct proteases,MASP-3 and MASP-1, which differ in their allostericcentres, from one single structural gene. The codonphases within the partial genes characterised for rat andmouse, including the alternative splice/polyadenylationexon, are conserved as in the orthologue in human(codon phase I) and support a common evolutionarygenesis of protein domain architecture such as exonshuffling.32

Previously, MASP-1 mRNA expression was detectedby Northern blot analysis in rat and mouse liver.17 Inboth species, transcripts of 4.0 and 5.4 kb hybridised witha MASP-1 cDNA probe containing 50 sequence commonto MASP-1 and MASP-3 cDNA, while a probe corre-sponding to the 30 untranslated region of MASP-1detected the 5.4 kb transcript only. Our present resultsnow indicate that the 5.4 kb mRNA species is MASP-1-specific, while the 4.0 kb mRNA species described inKnittel et al17 is specific for the MASP-3 mRNA describedhere. The abundance of MASP-1 mRNA exceeds thatof MASP-3 mRNA by a factor of approximately 1.5 innormal liver as shown using real-time PCR analysis. Thisobservation is remarkable because both transcripts areproduced from one gene under the control of onepromoter and therefore suggests regulation at theprocessing level, for example, preferential splicing orpolyadenylation that favours the larger, MASP-1hnRNA. Cross-species comparison of primary structuresimportant for splice site selection indicated a potentiallymore efficient branchpoint sequence in the intronpreceding the MASP-3 serine protease domain encodingexon and longer polypyrimidine tracts prior to the 30

splice sites of this exon compared to prior to thecompeting 30 splice site of the MASP-1 serine proteasedomain. Branch sites are needed for stable binding ofthe U2 snRNP particle during splicing.33 This binding isaided by U2AF, a splicing factor, that binds to poly-pyrimidine tracts.33 Further, the intron preceding the firstMASP-1 serine protease domain exon contains a shortsequence (GGXTpurine) that separates the polypyrimi-dine tract from the 30 splice site (GGATGCAG in human,GGCTACAG in mouse and rat). Such sequences havebeen found to act negatively in other examples ofregulated splicing.34 These sequence criteria, together

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

379

Genes and Immunity

MR----W-LLLYYALCFSLSKASAHTVELNNMFGQIQSPGYPDSYPSDSEVTWNITVPDGFRIKLYFMHFNLESSYLCEYDYVK human QSPGYPDSYPSDSEVTWNITVPE-FRVQLYFMHFNLESSYLCEYDYVK rat

MRFLSFWRLLLYHALCLALPEVSAHTVELNEMFGQIQSPGYPDSYPSDSEVTWNITVPEGFRIKLYFMHFNLESSYLCEYDYVK mouse** * **** *** * ******** *************************** ** ********************

CUB1 | VETEDQVLATFCGRETTDTEQTPGQEVVLSPGSFMSITFRSDFSNEERFTGFDAHYMAVDVDECKEREDEELSCDHYCHNYIGG humanVETEDQVLATFCGRETTDTEQTPGQEVVLSPGSFMSVTFRSDFSNEERFTGFDAHYMAVDVDECKEREDEELSCDHYCHNYIGG ratVETEDQVLATFCGRETTDTEQTPGQEVVLSPGTFMSVTFRSDFSNEERFTGFDAHYMAVDVDECKEREDEELSCDHYCHNYIGG mouse******************************** *** *********************************************** EGF-like |

YYCSCRFGYILHTDNRTCRVECSDNLFTQRTGVITSPDFPNPYPKSSECLYTIELEEGFMVNLQFEDIFDIEDHPEVPCPYDYI humanYYCSCRFGYILHTDNRTCRVECSGNLFTQRTGTITSPDYPNPYPKSSECSYTIDLEEGFMVTLHFEDIFDIEDHPEVPCPYDYI ratYYCSCRFGYILHTDNRTCRVECSGNLFTQRTGTITSPDYPNPYPKSSECSYTIDLEEGFMVSLQFEDIFDIEDHPEVPCPYDYI mouse

PKSSECFYTIELEEGFMVSLQFEDLFDIEDHPEVPCPYDFI guinea pig DHPEVPCPYDYI rabbit

*********************** ******** ***** ********** *** ******* * *** ************** * CUB2 |

KIKVGPKVLGPFCGEKAPEPISTQSHSVLILFHSDNSGENRGWRLSYRAAGNECPELQPPVHGKIEPSQAKYFFKDQVLVSCDT humanKIKAGSKVWGPFCGEKSPEPISTQSHSIQILFRSDNSGENRGWRLSYRAAGNECPKLQPPVYGKIEPSQAVYSFKDQVLISCDT ratKIKAGSKVWGPFCGEKSPEPISTQTHSVQILFRSDNSGENRGWRLSYRAAGNECPKLQPPVYGKIEPSQAVYSFKDQVLVSCDT mouseKIKAGSNVFGPFCGEKAPELISTQSHSVQILFHSDNSGENRGWRLSYRAAGNECPGLRPPVHGKIEPLQTTYSFKDQVLVSCDT guinea pigKIKVGENVWGPYCGEKAPEAISTQSHSVQILFRSDNSGENRGWRLSYRAAGNECPELQPPDQGKIEPLQAKYSFKDQVLVSCDT rabbit*** * * ** **** ** **** ** *** ********************** * ** ***** * * ****** ****

CCP1 |GYKVLKDNVEMDTFQIECLKDGTWSNKIPTCKIVDCRAPGELEHGLITFSTRNNLTTYKSEIKYSCQEPYYKMLNNNTGIYTCS humanGYKVLKDNEVMDTFQIECLKDGAWSNKIPTCKIVDCGVPAVLKHGLVTFSTRNNLTTYKSEIRYSCQQPYYKMLHNTTGVYTCS ratGYKVLKDNGVMDTFQIECLKDGAWSNKIPTCKIVDCGAPAGLKHGLVTFSTRNNLTTYKSEIRYSCQQPYYKMLHNTTGVYTCS mouseGYKVLKDNVMMDTFQIECLKDGTWSNKIPTCQIVDCGAPTELEHGLVTFSSMNNLTTYKSEIRFSCQHPYYKMLHNITGIYTCS guinea pigGYKVLKDEVEMDTFQIECLKDGTWSNKIPTCKMVDCGVPAELEHGLLTFSSRSNLTTYASEVTYSCQQPYYRLLHNVSGVYTCS rabbit********************** ******** *** * * *** *** ***** ** *** *** * * * ****

CCP2 | linker AQGVWMNKVLGRSLPTCLPECGQPSRSLPSLVKRIIGGRNAEPGLFPWQALIVVEDTSRVPNDKWFGSGALLSASWILTAAHVL humanAHGTWTNEVLKRSLPTCLPVCGQPSRALPNLVKRIIGGRNAELGLFPWQALIVVEDTSRIPNDKWFGSGALLSESWILTAAHVL ratAHGTWTNEVLKRSLPTCLPVCGQPSRALPNLVKRIIGGRNAELGLFPWQALIVVEDTSRVPNDKWFGSGALLSESWILTAAHVL mouseAQGIWMNEVLGRSLPICLPVCGQPSRSLPNLVKRIIGGRNAEPGLFPWQALIVVEDTSRVPNDKWFGSGALLSESWILTAAHVL guinea pigAQGIWTNEVLGRSLPTCIPVCGQPSRSLPSLIKRIIGGRNAEPGLFPWQALIVVEDTSRVPNDKWFGSGA rabbit

CGRPSRPLPALMKRIVGGRSAEPGLFPWQVLLSVEDLSRVPKDRWFGSGALLSESWVLTAAHVL Tetraodon* * * * ** **** * * ** *** ** * *** *** ** ****** * *** ** * * ********* ** *******

serine protease domain RSQRRDTTVIPVSKEHVTVYLGLHDVRDKSGAVNSSAARVVLHPDFNIQNYNHDIALVQLQEPVPLGPHVMPVCLPRLEPEGPA humanRSQRRDNTVIPVSKDHVTVYLGLHDVRDKSGAVNSSAARVVLHPDFNIQNYNHDIALVQLQEPVPLGAHVMPICLPRPEPEGPA ratRSQRRDNTVIPVSKEHVTVYLGLHDVRDKSGAVNSSAARVILHPDFNIQNYNHDIALVQLQKPVPLGAHVMPICLPRPEPEGPA mouseRSQRRGNTVVPVSKEHITVYLGLHDVRDKSGAVNSSAARVVLHPNFNIQSYNHDIALVQLQEPVPLGPHVMPVCLPQPEPEGLA guinea pigRSQRRDASVIP Tetraodon***** * **** * *********************** *** **** *********** ***** **** *** **** *

methionine loop ser 195

PHMLGLVAGWGISNPNVTVDEIISSGTRTLSDVLQYVKLPVVPHAECKTSYESRSGNYSVTENMFCAGYYEGGKDTCLGDSGGA humanPHMLGLVAGWGISNPNVTVDEIIISGTRTLSDVLQYVKLPVVSHAECKASYESRSGNYSVTENMFCAGYYEGGKDTCLGDSGGA rat PHMLGLVAGWGISNPNVTVDEIILSGTRTLSDVLQYVKLPVVSHAECKASYESRSGNYSVTENMFCAGYYEGGKDTCLGDSGGA mousePHMLGLVAGWGISSPNVTVDDIISSGTRTLSDILQYVKLPVVPHAECKASYESRSGNYSVTENMFCAGYYEGGKDTCLGDSGGA guinea pig************* ****** ** ******** ********* ***** ***********************************

ser 214 tyr 225

FVIFDDLSQRWVVQGLVSWGGPEECGSKQVYGVYTKVSNYVDWVWEQMGLPQSVVEPQVER human FVIFDEMSQRWVAQGLVSWGGPEECGSKQVYGVYTKVSNYVDWLLEEMNSPRGVRELQVER rat FVIFDEMSQHWVAQGLVSWGGPEECGSKQVYGVYTKVSNYVDWLWEEMNSPRAVRDLQVER mouse FVILDDLSQHWVAQGLVSWAGPEECGSKQVYGVYT guinea pig WGGPEECGSKQVYGVYTKVSNYVDWVWEQMGLPQSVVEPQVER cow *** * ** ** ****** *********************** * * * * ****

Figure 5 Alignment of deduced amino acid sequences from human MASP-3 cDNA6 and from MASP-3 specific cDNA isolated or identified inthis work for rat, mouse, guinea pig, rabbit, Tetraodon nigroviridis, and cow. Putative signal peptides are indicated in italic for human and ratMASP-3 based on data obtained for human MASP-121. Conserved cysteines are boxed, the location of the methionine loop is indicated, andhistidine, aspartic acid, and serine residues (catalytic triad) are in bold. The predicted arginine/isoleucine cleavage site is marked with anarrow. Potential N-glycosylation sites are underlined. Conserved residues throughout the species studied are marked with an asterisk.Residues important for lineage designation of the gene products (see text) are indicated (Ser 195, Ser 214, Tyr 225; numbers denote positionsin chymotrypsinogen as of the start of the mature protein).

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

380

Genes and Immunity

with recognition of the polyadenylation signal in theMASP-3 hnRNA, would indicate (in the absence ofinhibitors) that there should be a bias towards MASP-3mRNA synthesis in liver.

The regulation of processing to favour the productionof fully processed MASP-1 mRNA may well dependcritically upon the following factors: suppression ofrecognition of the MASP-3-specific polyadenylationsignal in the hepatic environment and competition ofthe 30 splice sites of MASP-3 and MASP-1 serine proteasedomains for splicing to the 50 splice site of the exoncoding for the C-terminal portion of SCR2. It seems likelythat, in liver, repression of polyadenylation of the MASP-3 intermediate is accompanied by repression or deactiva-tion of its 30 splice site. It is consistent with the loss of useof the MASP-3 serine protease domain 30 splice site thatthe cDNA clones containing the MASP-3-specific exonisolated from a rat acute phase liver library retained theintron preceding this exon, although they had beenpolyadenylated at the 30 end of MASP-3 as usual. Anequal number of clones were isolated that representedfully processed MASP-1 mRNA. This further suggeststhat the recognition of the 30 splice site and thepolyadenylation site of MASP-3 may be regulatedseparately. This is interesting because previously, thepolypyrimidine tract, the 30 splice site, and the poly-adenylation initiation signal (and downstream elementsthereof) as a unit were shown to be significant for thecoupling of splicing and polyadenylation.35

A regulation at the post-transcriptional level, forexample, increased stability of MASP-1 mRNA relativeto that of the MASP-3 mRNA transcript to account forthe higher abundance of MASP-1 mRNA in liver, is anadditional possibility and cannot be exluded.

Recently, it was proposed that the genes encodingMASP-2, C1r, and C1s originate from a commonancestral MASP1/3-like gene.36 Analysis based on pri-mary structure motifs surrounding the active siteresidues as well as nonactive site evolutionary markers,however, reveals an evolutionary scenario according towhich trypsin is the parent enzyme for three distinctdaughter groups, of which two give rise to MASP-1 via aone single-marker transition and to the group encom-passing MASP-2, MASP-3, C1r, C1s, hamster CASP, andhuman thrombin via two separate single-marker transi-tions.37 Comparison of amino-acid sequences among thegroup MASP-2, MASP-3, C1r and C1s, and of MASP-2,the serine protease domain only of MASP-3, C1r, and C1svs MASP-1 reveals a similar level of identity (36–43 and34–42%) and may reflect a degree of convergentevolution. Comparison of MASP-2, MASP-3, C1r, C1s,and MASP-1 to human thrombin reveals a level ofidentity of 33–35%.

In this work, we also demonstrate a very high degreeof conservation for the MASP-3-specific serine proteasedomain between human, rat, mouse, rabbit, guinea pig,cow, and pufferfish (see Figure 5; rat vs pufferfish 77%,rat vs cow 79%, rat vs rabbit 90%, and rat vs guinea pig92% identical; see also Table 1). The divergence of mouseand rat is estimated at 33 million years, that of humanand rodent at 96 million years.38 Although divergenceof fish from a common ancestor is much older, theevolutionary rate is estimated to be significantly lowercompared to mammals and may have allowed identifica-tion of the transcript encoding this protein in the

pufferfish. Human, mouse, rabbit, guinea pig, and cowhave been shown previously to have a functional lectinpathway in vitro.11,39–43

The evolutionary relation of a diversity of serineproteases may be analysed using conserved residuesand their codons as markers. According to theseanalyses, MASP-1 belongs to the ‘Ser 195: TCN/Ser214:TCN/Tyr 225’ lineage. Applying the criteria to MASP-3,it belongs to the ‘Ser 195: AGY/Ser214: TCN/Tyr 225’lineage. Both, MASP-1 and MASP-3, are members of one‘evolutionary clan’ and have diverged separately fromthe ‘Ser 195: TCN/Ser214: TCN/Pro 2250 lineage repre-sented by trypsin and chymotrypsin.37 Ser 195 forms partof the catalytic triad that is composed of histidine,aspartic acid, and serine (bold in Figures 1 and 5) andis the characteristic, conserved residue by whose codonMASP-1- and MASP-3-specific protein sequences andthereby their homologues in other species can bedifferentiated. It is encoded by AGC/T in all MASP-3peptide sequences presented in this work. Ser 214 formsbonds with water molecules in the active site cleft and isalso referred to as the fourth member of the catalytictriad.44 Tyr 225 enables Na+ binding near the so-calledprimary specificity site, thus enhancing catalytic activity,and may have proven advantageous when proteaseswere secreted into extracellular fluid that had higher Na+

concentration than that found in the cytoplasm.45 Otherconserved residues include cysteins that are important inprotein folding. In particular, cysteins involved in theformation of a so-called methionine loop are conservedin all MASP-3 peptide sequences characterised in thiswork. The loop probably has a steric role in promotingefficient catalysis by bringing certain residues in closerproximity to each other.46

The conservation in both secondary and deducedtertiary structures among members of distant species(human; murid (rat, mouse) and nonmurid rodents(guinea pig); lagomorphs: rabbit; ruminants: cow; andacantomorphs: pufferfish) points to evolutionary pres-sures for conservation of the function of this domain. Apotentially inhibiting function of MASP-3 on the C4cleaving activity of MASP-2 was recently discussed,6 butmay not explain the evolutionary pressure leading to theexceptionally high degree of interspecies conservation.Notably, the degree of identity between MASP-3 homo-logues exceeds that between MASP-1 homologues.

Materials and methods

Isolation of MASP-1- and MASP-3-specific cDNAtranscripts from a rat liver cDNA library

cDNAs specific for full-length MASP-1 and the serineprotease domain of MASP-3 were obtained from humanliver by RT-PCR, using oligonucleotides derived fromthe corresponding sequences in the public databases(NM_001879 for MASP-1 cDNA and AF284421 forMASP-3 cDNA). These cDNAs were used to screen acDNA library constructed from acute-phase rat liver(Sprague–Dawley rats, Stratagene (Cambridge, UK),Catalog no. 936512) cloned in the phage vector lZAP.Hybridising phages were subjected to two more roundsof screening. Plasmid pBluescript SK� was rescued byin vivo excision and clones were analysed further bySouthern blot hybridisation and restriction analysis. A

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

381

Genes and Immunity

BALB/c mouse liver cDNA library (Clontech Labora-tories UK Ltd. (Basingstoke, UK), Catalog no. ML5007t)was screened with a mouse-specific probe generated byRT-PCR (see below). Again, hybridising phages weresubjected to two more rounds of screening. Then, insertswere amplified using lTriplEx specific amplimers(according to the manufacturers protocol) and subclonedin pGEMTeasy (Promega, Southampton, UK) afterSouthern blot analysis. All clones were sequenced onboth strands using Cy5/Cy5.5 Dye Terminator CycleSequencing Kit (Amersham-Bioscience, Uppsala, Swe-den) and Long Reader Tower from Visible Genetics Inc.(Toronto, Canada).

Cloning of partial cDNA transcripts of MASP-3homologues by reverse transcriptional amplificationRNA was prepared from livers explanted from guineapig, mouse (BALB/c), and rabbit. Total RNA (5 mg) wasused for first-strand synthesis. PCR amplification wasperformed using human MASP-3-specific primers(50 TCGAGAGTGCCAAATGACAAGTG 30, 50 TGTGTA-GACTCCATAGACCTGCTTGCTGCC 30). The followingtouchdown PCR programme was used: 941C 1 min, 941C15 s, 721C 30 s (�11C per cycle), 721C 1 min for 12 cycles,941C 15 s, 601C 30 s, 721C 1 min for 25 cycles, and 721C5 min. MASP-3-specific products were confirmed bySouthern blot analysis using a a- 32P-dCTP radiolabelledhuman MASP-3 cDNA product as a probe (generated byRT-PCR using the above primers, thus representingB-chain sequence). Products of the expected size ofapproximately 800 bp were subcloned into pGEMTeasyand sequenced on both strands.

In parallel, a 519 bp MASP-3 cDNA fragment wasobtained by nested and degenerated PCRs using mRNAprepared from the liver of a C57Bl6 mouse: For primaryPCR, the following primer set was chosen, 50 GATGCTT-CACAATACCACAG 30 and 50 GCNCCNCCRCTR-TCNCC 30. In a second round of PCR, 50 TNYTNACNGCNGCNCAYGT 30 and 50 GCNCCNCCRCTRTCNCC 30

were used. The fragment was subcloned into pGEM-Teasy and sequenced.

RACEInitiation of transcription of the rat MASP1/3 gene wasmapped by RACE technology. The MASP-1/3-mRNA-specific antisense oligonucleotide was 50 GGCTGGTGATTGTGCCTG 30 (bp pos. 693–710 of clone prm1); thenested MASP-1/3 mRNA-specific oligonucleotide was 50

GGACAGTAATATTCCAGGTC 30 (bp pos. 253–272 ofclone prm1). Distinct amplification products of 240 and270 bp were obtained, subcloned in pGEMTeasy vectorand analysed by double-stranded cycle sequencing.

To extend sequence obtained for MASP-3 homologues,both 50 and 30 RACE experiments were performed. Theoligonucleotides were chosen in conserved nucleotidestretches; for 50 RACE, the antisense oligonucleotide was50 CCAGGAYKCAGAGAGCAG 30, the nested oligonu-cleotide was 50 GCBCCRCTCCCRAACCACTTGTC 30.For 30 RACE, the MASP-3-specific oligonucleotide was 50

CAAGGCCTRGTST CYTGGGSR 30; the nested MASP-3-specific oligonucleotide was 50 GCBCCRCTCCCRAAC-CACTTGTC 30. Amplification products were analysed bySouthern blotting using a cDNA probe specific for ratMASP-3 serine protease domain coding sequence (seebelow), then cloned in pGEMTeasy and sequenced.

Characterisation of the genomic region responsible forthe transcription of two MASP1/3 gene products in ratand mouseGenomic DNA was prepared from Wistar rats and SV129mice. Long-range PCR was carried out using Extensorpolymerase (ABgene, Epsom, Surrey, UK). Oligonucleo-tide design was based on rat sequence available fromcDNA cloning, following the hypothesis that the murineMASP1/3 genes are organised, like the human MASP1/3gene, to include an alternative splice exon for the serineprotease domain of MASP-3 and its 30 untranslatedregion prior to the MASP-1 serine proteasedomain-specific coding sequence (SCRII_S1: 50 AGA-TAAGGTACTCCTGCCAGCA 30; SCRII_S2: 50 CACAGGTGTATATACATGTTCTG; BCF1: 50 CATAGTGGTGGAAGACACATCA 30; BCR1: 50 CTGTCACATTAGGGTTG-GAGA 30; M1_SP: 50 CATGGCAATCCATGGAGTGGTACC 30). A total of 35 cycles of amplification wereperformed: during the first 10 cycles, the annealingtemperature was reduced from 68 to 601C, and theextension step was 4 min at 681C; during the remaining25 cycles, the annealing temperature was held constant at601C and the extension time was increased by 10 s foreach cycle. PCR products were purified using SephadexG-50 MicroSpin columns (Amersham-Bioscience) andsequenced.

Northern blot analysisTotal RNA was isolated from the liver of healthy adultWistar rats. Approximately 2mg of Poly(A+)-purifiedRNA per lane was separated on a denaturing 0.8% (w/v)agarose gel and transferred to Hybond N membrane(Amersham-Bioscience). The blots were hybridised witha- 32P-dCTP labelled cDNA probes specific for sequencecoding for the serine protease domains of rat MASP-1 orMASP-3.

Real-time, quantitative PCRTotal RNA was prepared from the livers of adultSprague–Dawley rats using Trizol reagent (Life Technol-ogies, Paisley, UK). Approximately 10mg of each RNAsample was digested with RNase-free DnaseI, extractedonce with phenol : chloroform : isoamylalcohol (25 : 24 : 1),precipitated and washed with ethanol, then redissolvedin water. The concentrations of the RNA samples weredetermined photometrically (260 nm). Each RNA sample(2mg) was primed with oligo (dT)12–18 and single-stranded cDNAs were synthesised. Quantitative PCRwas performed in a LightCycler instrument (Roche),using the FastStart DNA Master SYBR Green I kit(Roche), and results were analysed using the Fit Pointsoption in the LDCA software supplied with theapparatus. A 135 bp fragment of rat MASP-1 wasamplified using primers SCRII_S2 (50 CACAGGTGTA-TATACATGTTCTG 30) and M1SP_R1 (50 GTCCATT-GAAGATCCTGGAGA 30). A 158 bp fragment ofMASP-3 was amplified using the same sense primerSCRII_S2 and a MASP-3-specific antisense primer,M3SP_R1 (50 GAGGCCAAGCTCAGCGTTC 30). Each20ml PCR reaction contained 1/40th of the original cDNAsynthesis reaction (corresponding to 50 ng of total RNA),3.5 mM MgCl2, 0.5 mM of each primer and 2 ml of themaster mix supplied with the kit. A total of 45 cycles ofamplification were performed: the annealing tempera-ture was reduced from 72 to 601C during the first

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

382

Genes and Immunity

15 cycles, and held constant at 601C thereafter. Thefluorescence signal was detected at the end of each cycle.Melting curve analysis was used to confirm thespecificity of the products.47 A standard curve wasproduced for MASP-1 and MASP-3 mRNA transcriptsusing serial dilutions of the corresponding cDNAscloned into pBluescriptSK- (rat MASP-1 and MASP-3cDNA clones isolated in this work).

Prokaryotic expression of recombinant rat MASP-3

Recombinant protein was produced using a T7-promo-ter-driven prokaryotic expression system using pRSETcloning vector (Invitrogen, Leek, The Netherlands).Primers including restriction sites were designed togenerate an expression cassette encompassing the serineprotease domain of rat MASP-3 (rM3 fwd BamHI: 50

ggatccAGCCCTCCCGCGCCCTG 30, rM3 rev EcoRI: 50

gaattcTCACCGTTCCACCTGGAGCT 30). The amplifica-tion product was cloned in frame with the pRSETN-terminal his-tag fusion protein. Expression and dena-turing protein purification were carried out according tothe manufacturer’s instruction.

Western blot analysis of rat MASP-3

Anti-MASP-3-specific antiserum was raised in rats byimmunising with a synthetic peptide representingthe first 20 amino acids of the human MASP-3B-chain, IIGGRNAEPGLFPWQALIVV.6 Anti-MASP-1-specific antiserum was raised in rabbits using asynthetic peptide representing the C-terminus of thehuman MASP-1-specific B-chain, GKKDRYGVY-SYIHHNKDWI.5

A lectin preparation was produced from rat plasma asfollows. Mannose was coupled to TSK-75 beads (Merck,Darmstadt, Germany) as previously described.48 Man-nose-TSK beads (500 ml) were mixed with 4 ml rat plasmaand 26.5 ml TBS/Tween/Ca2+ (10 mM Tris, 150 mM NaCl,0.1% NaN3, pH 7.4, 0.05% (v/v) Tween 20, 5 mM CaCl2)and incubated overnight at 41C. The beads were washedwith TBS/Tween/Ca2+ and then 500ml TBS with 20 mM

EDTA and 375ml SDS-PAGE loading buffer were addedto the beads.

An aliquot of recombinant rat MASP-3 serine proteasedomain preparation (20 mg) and aliquots of the rat lectinpreparation were run on SDS-PAGE (6–18% acrylamide)under reducing conditions (samples were boiled for3 min in 10% v/v DTT (0.6 M), 10% Iodoacetamide (1.4 M)and 25% v/v loading buffer) together with 10ml of aprestained molecular weight marker (BioRad proteinprecision marker). After electrophoresis, the gel wasblotted onto Hybond P membrane (Amersham-Bioscience). The membrane was blocked for 30 min inTBS/0.1% Tween 20 (v/v), then cut in strips for parallelanalysis of preimmune and immune sera. The followingbuffer was used to dilute primary (1 : 3000) andsecondary (rabbit anti-rat or goat anti-rabbit, HRPconjugated; DAKO, Glostrup, Denmark; 1 : 3000) anti-bodies: TBS, 0.05% Tween 20, 1 mM EDTA, 1 mg HSA/ml,100mg normal human IgG/ml, 100mg heat-aggregatedHSA/ml, 100mg heat-aggregated normal human IgG/ml, 0.35 M NaCl; pH 8. Blots were developed using theenhanced chemiluminescence technique (ECL kit, supersignal, Pierce Biotechnology Inc., Rockford, IL, USA).

Acknowledgements

We gratefully acknowledge the technical expertise andassistance of Silke Roscher, Yu Sato, and Takashi Yazawa.We thank Dr Jens Jensenius for valuable comments onthe manuscript.

This study was supported by the Wellcome Trust(grant number: 060574).

References

1 Holmskov UL. Collectins and collectin receptors in innateimmunity. APMIS 2000; 100(Suppl): 1–59.

2 Matsushita M, Endo Y, Fujita T. Cutting edge: complement-activating complex of ficolin and mannose-binding lectin-associated serine protease. J Immunol 2000; 164: 2281–2284.

3 Matsushita M, Fujita T. Ficolins and the lectin complementpathway. Immunol Rev. 2001; 180: 78–85.

4 Takayama Y, Takada F, Takahashi A, Kawakami M. A 100-kDaprotein in the C4-activating component of Ra-reactive factor isa new serine protease having module organization similar toC1r and C1s. J Immunol 1994; 152: 2308–2316.

5 Thiel S, Vorup-Jensen T, Stover CM et al. A second serineprotease associated with mannan-binding lectin that activatescomplement. Nature 1997; 386: 506–510.

6 Dahl MR, Thiel S, Matsushita M et al. MASP-3 and itsassociation with distinct complexes of the mannan-bindinglectin complement activation pathway. Immunity 2001; 15:127–135.

7 Matsushita M, Kuraya M, Hamasaki N et al. Activation of thelectin complement pathway by H-ficolin (Hakata antigen).J Immunol. 2002; 168: 3502–3506.

8 Matsushita M, Thiel S, Jensenius JC et al. Proteolytic activitiesof two types of mannose-binding lectin-associated serineprotease. J Immunol 2000; 165: 2637–2642.

9 Matsushita M, Fujita T. Cleavage of the third component ofcomplement (C3) by mannose-binding protein-associatedserine protease (MASP) with subsequent complement activa-tion. Immunobiology 1995; 194: 443–448.

10 Rossi V, Cseh S, Bally I et al. Substrate specificities ofrecombinant mannan-binding lectin-associated serine pro-tease-1 and -2. J Biol Chem 2001; 276: 40880–40887.

11 Celik I, Stover C, Botto M et al. Role of the classical pathway ofcomplement activation in experimentally induced polymicro-bial peritonitis. Infect Immunity 2001; 69: 7304–7309.

12 Stover CM, Thiel S, Thelen M et al. Two constituents of theinitiation complex of the mannan-binding lectin activationpathway of complement are encoded by a single structuralgene. J Immunol 1999; 162: 3481–3490.

13 Takahashi M, Endo Y, Fujita T, Matsushita M. A truncatedform of mannose-binding lectin-associated serine protease(MASP)-2 expressed by alternative polyadenylation is acomponent of the lectin complement pathway. Int Immunol1999; 11: 859–863.

14 Thiel S, Petersen SV, Vorup-Jensen T et al. Interaction of C1qand mannan-binding lectin (MBL) with C1r, C1s, MBL-associated serine proteases 1 and 2, and the MBL-associatedprotein MAp19. J Immunol 2000; 165: 878–887.

15 Stover CM, Thiel S, Lynch NJ, Schwaeble WJ. The rat andmouse homologues of MASP-2 and MAp19, components ofthe lectin activation pathway of complement. J Immunol 1999;163: 6848–6859.

16 Wallis R, Dodd RB. Interaction of mannose-binding proteinwith associated serine proteases: effects of naturally occurringmutations. J Biol Chem 2000; 275: 30962–30969.

17 Knittel T, Fellmer P, Neubauer K et al. The complement-activating protease P100 is expressed by hepatocytes and isinduced by Il-6 in vitro and during the acute phase reactionin vivo. Lab Invest 1997; 77: 221–230.

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

383

Genes and Immunity

18 Javahery R, Khachi A, Lo K, Zenzie-Gregory B, Smale ST.DNA sequence requirements for transcriptional initiatoractivity in mammalian cells. Mol Cell Biol 1994; 14: 116–127.

19 Endo Y, Takahashi M, Nakao M et al. Two lineages ofmannose-binding lectin-associated serine protease (MASP) invertebrates. J Immunol 1998; 161: 4924–4930.

20 Smillie LB, Hartley BS. Histidine sequences in the activecentres of some ‘serine’ proteinases. Biochem J 1966; 101:232–241.

21 Sato T, Endo Y, Matsushita M, Fujita T. Molecular character-ization of a novel serine protease involved in activation of thecomplement system by mannose-binding protein. Int Immunol1994; 6: 665–669.

22 Nonaka M, Miyazawa S. Evolution of the initiating enzymesof the complement system. Genome Biol 2002; 3: REVIEWS1001.

23 Reed R, Maniatis T. The role of the mammalian branchpointsequence in pre-mRNA splicing. Genes Dev 1988; 2: 1268–1276.

24 Lund M, Tange TO, Dyhr-Mikkelsen H, Hansen J, Kjems J.Characterization of human RNA splice signals by iterativefunctional selection of splice sites. RNA 2000; 6: 528–544.

25 Perona JJ, Craik CS. Structural basis of substrate specificity inthe serine proteases. Protein Sci 1995; 4: 337–360.

26 Guinto ER, Caccia S, Rose T, Futterer K, Waksman G, Di CeraE. Unexpected crucial role of residue 225 in serine proteases.Proc Natl Acad Sci USA 1999; 96: 1852–1857.

27 Takada F, Takayama Y, Hatsuse H, Kawakami M. A newmember of the C1s family of complement proteins found in abactericidal factor, Ra-reactive factor, in human serum.Biochem Biophys Res Commun 1993; 196: 1003–1009.

28 Takahashi A, Takayama Y, Hatsuse H, Kawakami M. Presenceof a serine protease in the complement-activating componentof the complement-dependent bactericidal factor, RaRF, inmouse serum. Biochem Biophys Res Commun 1993; 190: 681–687.

29 Stover C, Endo Y, Takahashi M et al. The human gene formannan-binding lectin-associated serine protease-2 (MASP-2),the effector component of the lectin route of complementactivation, is part of a tightly linked gene cluster onchromosome 1p36.2-3. Genes Immun 2001; 2: 119–127.

30 Takada F, Seki N, Matsuda YI, Takayama Y, Kawakami M.Localization of the genes for the 100-kDa complement-activating components of Ra-reactive factor (CRARF andCrarf) to human 3q27–q28 and mouse 16B2–B3. Genomics 1995;25: 757–759.

31 Lawson PR, Reid KBM. A novel PCR-based technique usingexpressed sequence tags and gene homology for murinegenetic mapping: localization of the complement genes. IntImmunol 2000; 12: 231–240.

32 Kolkman J, Stemmer P. Directed evolution of proteins by exonshuffling. Nat Biotechnol 2001; 19: 423–428.

33 Ruskin B, Zamore PD, Green MR. A factor, U2AF, is requiredfor U2 snRNP binding and splicing complex assembly. Cell1988; 52: 207–219.

34 Helfman DM, Roscigno RF, Mulligan GJ, Finn LA, Weber KS.Identification of two distinct intron elements involved inalternative splicing of beta-tropomyosin pre-mRNA. GenesDev 1990; 4: 98–110.

35 Cooke C, Hans H, Alwine JC. Utilization of splicing elementsand polyadenylation signal elements in the coupling ofpolyadenylation and last-intron removal. Mol Cell Biol 1999;19: 4971–4979.

36 Fujita T. Evolution of the lectin-complement pathway and itsrole in innate immunity. Nat Rev Immunol 2002; 2: 346–353.

37 Krem MM, Di Cerca E. Molecular markers of serine proteaseevolution. EMBO J 2001; 20: 3036–3045.

38 Nei M, Xu P, Glazko G. Estimation of divergence times frommultiprotein sequences for a few mammalian species andseveral distantly related organisms. Proc Natl Acad Sci USA2001; 98: 2497–2502.

39 Matsushita M, Fujita T. Activation of the classical complementpathway by mannose-binding protein in association with anovel C1s-like serine protease. J Exp Med 1992; 176: 1497–1502.

40 Takahashi M, Miura S, Ishii N et al. An essential role of MASP-1 in activation of the lectin pathway. Immunopharmacology2000; 49: 3 (Abstract 003).

41 Zhang Y, Suankratay C, Zhang X, Jones DR, Lint TF, GewurzH. Calcium-independent haemolysis via the lectin pathway ofcomplement activation in the guinea-pig and other species.Immunology 1999; 97: 686–692.

42 Kawai T, Suzuki Y, Eda S et al. Molecular and biologicalcharacterization of rabbit mannan-binding protein (MBP).Glycobiology 1998; 8: 237–244.

43 Holmskov U, Laursen SB, Malhotra R et al. Comparative studyof the structural and functional properties of a bovine plasmaC-type lectin, collectin-43, with other collectins. Biochem J 1995;305: 889–896.

44 Pletnev VZ, Zamolodchikova TS, Pangborn WA, Duax WL.Crystal structure of bovine duodenase, a serine protease, withdual trypsin and chymotrypsin-like specificities. Proteins 2000;41: 8–16.

45 Smith LC, Clow LA, Terwilliger DP. The ancestral complementsystem in sea urchins. Immunol Rev 2001; 180: 16–34.

46 Brown JR, Hartley BS. Location of disulphide bridges bydiagonal paper electrophoresis. The disulphide bridges ofbovine chymotrypsinogen A. Biochem J 1966; 101: 214–228.

47 Ririe KM, Rasmussen RP, Wittwer CT. Product differentiationby analysis of DNA melting curves during the polymerasechain reaction. Anal Biochem. 1997; 245: 154–160.

48 Fornstedt N, Porath J. Characterization studies on a new lectinfound in seeds of Vicia ervilia. FEBS Lett. 1975; 57: 187–191.

One gene encodes MASP-1 and -3 in mouse and ratCM Stover et al

384

Genes and Immunity