The Mammalian 2′-5′ Oligoadenylate Synthetase Gene Family: Evidence for Concerted Evolution of Paralogous Oas1 Genes in Rodentia and Artiodactyla

Andrey A. Perelygin,1 Andrey A. Zharkikh,2 Svetlana V. Scherbik,1 Margo A. Brinton1

1 Biology Department, Georgia State University, P.O. Box 4010, Atlanta, GA 30302-4010, USA
2 Bioinformatics Department, Myriad Genetics, Inc., Salt Lake City, UT 84108, USA

Received: 27 March 2006 / Accepted: 12 June 2006 [Reviewing Editor: Dr. Martin Kreitman]

Abstract. Multiple 2′-5′ oligoadenylate (2-5A) synthetases are important components of innate immunity in mammals. Gene families encoding these proteins have previously been studied mainly in humans and mice. To reconstruct the evolution of this gene family in mammals, a search for additional 2-5A synthetase genes was performed in rat, cattle, pig, and dog. Twelve 2′-5′ oligoadenylate synthetase (Oas) genes were identified in the rat genome, including eight Oas1 genes, two Oas1 pseudogenes, single copies of Oas2 and Oas3, and two Oas-like genes, Oasl1 and Oasl2. Four OAS genes were detected in the pig genome and five OAS genes were found in both the cattle and dog genomes. An OAS3 gene was not found in either the cattle or the pig genome. While two tandemly duplicated OAS-like (OASL) genes were identified in the dog genome, only a single OASL orthologue was found in both the cattle and the pig genomes. The bovine and porcine OASL genes contain premature stop codons and encode truncated proteins, which lack the typical C-terminal double ubiquitin domains. The cDNA sequences of the rat, cattle, pig, and dog OAS genes were amplified, sequenced and compared with each other and with those in the human, mouse, horse, and chicken genomes. Evidence of concerted evolution of paralogous 2′-5′ oligoadenylate synthetase 1 genes was obtained in rodents (Rodentia) and even-toed ungulates (Artiodactyla). Calculations using the nonparametric Kolmogorov-Smirnov test suggested that the homogenization of paralogous OAS1 sequences was due to gene conversion rather than stabilizing selection.

Key words: Concerted evolution; Gene conversion; Paralogous genes; Gene family; 2′-5′ Oligoadenylate synthetase

Introduction

2′-5′ Oligoadenylate (2-5A) synthetases are important components of an interferon-mediated antiviral pathway but have also been reported to be involved in other cellular processes such as apoptosis, cell growth and differentiation, gene regulation, DNA replication, and RNA splicing (Justesen et al. 2000). The functional 2′-5′A synthetases are activated by binding to double-stranded RNA (dsRNA) and utilize ATP for synthesis of 2-5A that has the general formula pppA(2′p5′A)n with n ≥ 1 (Hovanessian et al. 1977). The only known function of 2-5A is to bind to and activate a latent endoribonuclease (RNase L) that can degrade single-stranded RNA of either viral or cellular origin (Clemens and Williams 1978).

The sequences of most members of the mouse 2′-5′A synthetase gene family and all members of the human gene family have previously been reported (Justesen et al. 2000; Kakuta et al. 2002; Perelygin et al. 2002). However, analyses of this gene family in other species have been incomplete or nonexistent.

Correspondence to: A. A. Perelygin; email: [email protected]

J Mol Evol (2006) 63:562–576
DOI: 10.1007/s00239-006-0073-3

The murine (Mus musculus; Mm) 2-5A synthetase family is located on mouse chromosome 5 (MMU5) and includes eight small MmOas1 genes (MmOas1a–MmOas1h), a small MmOas1i pseudogene, a medium MmOas2 gene, and a large MmOas3 gene, as well as two 2-5A synthetase-like loci, MmOasl1 and MmOasl2, which contain two tandemly repeated ubiquitin-like sequences at their 3′-ends (Eskildsen et al. 2002; Ichii et al. 1986; Kakuta et al. 2002; Perelygin et al. 2002; Rutherford et al. 1991; Shibata et al. 2001; Smith et al. 2002; Tiefenthaler et al. 1999). Two recombinant MmOas1 proteins are functional 2-5A synthetases. Recombinant MmOas1a and MmOas1g proteins expressed in bacteria oligomerized ATP, while MmOas1c, MmOas1d, MmOas1e, and MmOas1h did not (Kakuta et al. 2002; Shibata et al. 2001). The presence of a premature stop codon in the MmOas1b gene results in translation of a truncated protein. The truncated MmOas1b gene product does not exhibit 2-5A synthetase activity (Kakuta et al. 2002). However, the full-length product specifically suppresses the replication of flaviviruses (Mashimo et al. 2002; Perelygin et al. 2002). MmOasl2 is a functional 2-5A synthetase, while MmOasl1 is not (Eskildsen et al. 2003).

The human (Homo sapiens; Hs) 2-5A synthetase gene family has been studied extensively. A cluster of three 2-5A synthetase genes (HsOAS1, HsOAS2, and HsOAS3) is located on human chromosome 12 (HSA12). The smallest gene, HsOAS1, encodes several isoforms, which are identical in their N-terminal 346 amino acids but have different C-termini generated by alternative splicing (Benech et al. 1985; Justesen et al. 2000). Similarly, HsOAS2 encodes two isoforms (69 and 71 kDa) that share a common N-terminal sequence of 683 amino acids but have different C-termini produced by alternative splicing (Marie and Hovanessian 1992). HsOAS3 is the largest of the human 2-5A synthetase genes and encodes a 100-kDa enzyme (Rebouillat et al. 1999). These three genes are closely linked and occupy about 103 kb of human genomic DNA (Kumar et al. 2000). Human 2-5A synthetase-like 1 (HsOASL1) mRNAs have also been identified in a separate region of HSA12 (Hartmann et al. 1998; Rebouillat et al. 1998). The 346 N-terminal amino acids of HsOAS1 or HsOASL1 constitute one domain, while HsOAS2 and HsOAS3 contain two and three domains, respectively, and also have unique C-termini (Hovnanian et al. 1998). HsOAS3 may function as a monomer (Rebouillat et al. 1999), while HsOAS2 and HsOAS1 have been reported to have enzymatic activity as a dimer and a tetramer, respectively (Ghosh et al. 1997; Marie et al. 1990).

In addition to the structural differences between the various human OAS enzymes, there are multiple functional differences suggesting that these proteins may have distinct roles in the cell. The HsOAS1 and HsOAS2 proteins require relatively high concentrations of dsRNA for activation and production of 2-5A oligomers, while the HsOAS3 protein can be activated by much lower concentrations of dsRNA and preferentially synthesizes short 2-5As, consisting mainly of dimers (Marie et al. 1997). These enzymes also differ in their subcellular localization; HsOAS1 is usually present in the ribosomal fraction, HsOAS2 is associated with intracellular membranes such as the nuclear envelope and the endoplasmic reticulum, and HsOAS3 is detected in the microsomal fraction (Chebath et al. 1987b). The genes encoding these three enzymes are differentially induced by interferons α, β, and γ in various tissues (Chebath et al. 1987a; Witt et al. 1993).

Because only a few additional mammalian 2-5A synthetase genes were previously reported, it was not known whether the gene families in other mammals were more like the one in mice or the one in humans. Two rat (Rattus norvegicus; Rn) Oas1 genes located on rat chromosome 12 (RNO12) were previously detected (Shimizu et al. 2003; Truve et al. 1993). Also, one pig (Sus scrofa; Ss) and one cattle (Bos taurus; Bt) OAS1 gene sequence were previously submitted to GenBank under accession numbers AJ225090 and AB104656, respectively. The first draft of the dog (Canis familiaris; Cf) genome was recently released. The NCBI database annotated two OAS-like genes, CfOASL1 and CfOASL2, on canine chromosome 26 (CFA26). The horse (Equus caballus; Ec) 2-5A synthetase gene family, which consists of single copies of the EcOAS1, EcOAS2, EcOAS3, and EcOASL genes, was described in a recent report (Perelygin et al. 2005). The structure of the horse family is more similar to that of humans than to that of mice.

The evolution of the murine Oas gene family has been discussed in several previous publications (Eskildsen et al. 2002; Kakuta et al. 2002; Kumar et al. 2000; Mashimo et al. 2003; Perelygin et al. 2002; Rogozin et al. 2003). Tandem gene duplication is considered to be the primary mechanism of mouse Oas evolution (Kumar et al. 2000). In this study, we analyzed the 2-5A synthetase gene families of cattle, dog, pig, and rat and obtained evidence supporting concerted evolution of paralogous Oas1 genes within the orders Rodentia and Artiodactyla.

Materials and Methods

Amplification of Oas Genes from Rat, Cow, and Pig DNAs

Full-length cDNAs from rat spleen, intestine, and uterus were purchased from Seegene USA (Del Mar, CA). Gene-specific primers (Table 1) were designed from rat mRNA sequences predicted by the bl2seq program (Tatusova and Madden 1999) and used for 5′- and 3′-RACE. The full-length sequences for the majority of the rat RnOas1 genes were obtained by RACE using the two universal primers, r3′RACE-1U and r5′RACE-1U. Gene-specific primers, r3′RACE-1b, r5′RACE-1b, and r3′RACE-1k, were designed for the RnOas1b and RnOas1k genes (Table 1). RACE products were cloned into the pCR-XL-TOPO vector (Invitrogen, Carlsbad, CA) and sequenced in both directions using three independent clones for each gene.

Bovine lung and spleen poly(A)+ RNA was purchased from BD Biosciences Clontech (Palo Alto, CA), mixed in equal proportions, and converted into first-strand cDNA by ThermoScript RNase H− Reverse Transcriptase (Invitrogen) using an oligo(dT) primer. The bovine BtOas1y cDNA was amplified using forward (5′-GCACGAGCACAGATTCAGGCA-3′) and reverse (5′-GCAGGTGCCCAATTAATATCTT-3′) PCR primers. Pooled porcine cDNA (from fetal and adult lungs as well as adult kidney) was kindly provided by Dr. Jonathan E. Beever (University of Illinois at Urbana–Champaign). Partial sequences obtained from the GenBank searches were extended from the bovine and porcine cDNAs, which were described above, and from commercial genomic DNAs of both species (Seegene USA) using a DNA Walking SpeedUp Kit (Seegene) according to the manufacturer's protocol.

The sequences of the dog CfOAS1, CfOAS2, CfOAS3, CfOASL1, and CfOASL2 genes were predicted from GenBank using a blastn search against human OAS sequences. Canine single-stranded cDNAs, derived from either kidney or small intestine of a male beagle (4 years old), were purchased from BioChain Institute, Inc. (Hayward, CA) and used to obtain five 2-5A synthetase sequences via PCR amplification with primers listed in Table 1. The CfOAS3 cDNA was amplified by nested PCR with two primer pairs (Table 1). The PCR products were sequenced directly on an ABI 3100 Genetic Analyzer using a BigDye terminator v1.1 Cycle Sequencing Kit according to the manufacturer's recommendations.

Bovine BAC clones were purchased from the Children's Hospital Oakland Research Institute and grown individually in 500 ml of LB medium. BAC DNA was isolated using the NucleoBond BAC Maxi Kit (BD Biosciences, Palo Alto, CA) and then sequenced directly using primers designed from the cDNA sequences previously obtained. Additionally, BAC clone inserts were removed by NotI digestion and the insert lengths were estimated by pulsed-field electrophoresis. The genomic region between the bovine Oas1y and Oas2 genes was amplified by PCR from the CH240_49I16 BAC clone with forward (5′-CACAGTGACCTGATGTTCCAG-3′) and reverse (5′-GACAGTCTTCAGAAGGCCTCA-3′) primers.
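A primer pair such as the one quoted for the bovine intergenic region can be sanity-checked in silico before use. The following Python sketch is illustrative only and is not part of the original protocol: the helper names (`reverse_complement`, `gc_fraction`, `find_amplicon`) are hypothetical, and it assumes exact primer matches on an unambiguous A/C/G/T template.

```python
# Illustrative helpers, assuming exact primer matches on an A/C/G/T template.
# None of these names come from the original study.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the template segment bounded by the two primers, or None.

    The forward primer is searched as written; the reverse primer anneals
    to the opposite strand, so its reverse complement is searched
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]
```

Applied to a real BAC sequence, `find_amplicon` would return the predicted PCR product, whose length could then be compared with the size observed on a gel.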

Analysis of Rat Oas Gene Expression Levels in Different Tissues

The tissue-specific expression of individual Oas genes was estimated by PCR using a rat Multiple Tissue cDNA panel (BD Biosciences) as template. This cDNA panel had been normalized by the manufacturer to the mRNA of the housekeeping gene, G3pdh. Gene-specific primers used for this study are listed in Table 2.

Table 1. PCR primers used to obtain full-length rat Oas1 cDNA sequences by RACE and to amplify dog OAS cDNAs

| Rat primer | Primer sequence (5′–3′) | Dog primer | Primer sequence (5′–3′) |
| --- | --- | --- | --- |
| Rn3′RACE-1U | CAGGAGGTGGAGTTTGATGTGC | CfOAS1-F | TGTTCCAAGCTGCCAGTTCTGCA |
| Rn5′RACE-1U | GACCGTGAGCAGCTCCAGGGC | CfOAS1-R | TGAACTCCGGAGAAGGGGTATGC |
| Rn3′RACE-1b | AAGGAGGTGAAGTTTGATGTGC | CfOAS2-F | TCTGGCAGGAGCAATGGGAATCT |
| Rn3′RACE-1k | CATGTGGTGAAGTTTGAGTGC | CfOAS2-R | TGAGTCCTAGATCACCTGCCTTAG |
| Rn5′RACE-1b | GATCGTGAGCAGCTCCAGGGC | CfOAS3-1F | AGCGTCCCGGCGCGCGGA |
| Rn3′RACE-2 | AGTGCTCTGCCTCTGTCATTCACA | CfOAS3-2F | CAGGCCATGGACGTGTACCGCA |
| Rn5′RACE-2 | GCAGAGTCTTTGCCAGATGTCTCT | CfOAS3-1R | GTCTCCTCACACCAGGGCTGGT |
| Rn3′RACE-3 | TCTAATCCTAGACCCTGCAGATCC | CfOAS3-2R | GTCCTTCTGCCCATCTGGTACA |
| Rn5′RACE-3 | CTTGACATCGAACCTCTGCTTCTG | CfOASL1-F | TACTCTGGCTCAGAGATGGCACA |
| Rn3′RACE-l1 | CATGGCAGTAGCCCAGGAGCTTTA | CfOASL1-R | CCATCCCTCCTTGTTCCCCTACT |
| Rn5′RACE-l1 | CCTACCTTGCCTTGGATAGGCTGA | CfOASL2-F | GGGACAGAAGCCGAGAGAATTTAC |
| Rn3′RACE-l2 | TTCTCCCAGCATTGCTGAGCAGAA | CfOASL2-R | GGACATCATGCAGGGAAGGATAGA |
| Rn5′RACE-l2 | GTATCCTTGTAGAGCACAAGGCCT | | |

Table 2. PCR primers used to analyze the tissue-specific expression of rat Oas genes

| Gene | Forward primer sequence (5′–3′) | Reverse primer sequence (5′–3′) |
| --- | --- | --- |
| RnOas1b | ACAGTACGCCCTGGAGCTGCTC | CAGGTTGTAAGGAAGGCTGTCCAT |
| RnOas1c | CTTCTGAAGTCAGGGCCATATGCT | CTTCGGAGTCTGACTCCCAAAGAA |
| RnOas1d | CAAACATGTGTCATCCTGTGAGCC | TTCAGAGTCCGACTCCCAGAGAAC |
| RnOas1f | TCGAGTTTCCACAGGAATGTGTCC | TGGTCACACATTGGACAGAACTCC |
| RnOas1g | ACAGTACGCCCTGGAGCTGCTCAC | CTGAATCTGTTGAGGAAGGCTGTC |
| RnOas1h | GCTCTGGAGTCAGGGGCATTCACT | CTGACTCCCAGAGAGCACTGTGGA |
| RnOas1i | ACAGTACGCCCTGGAGCTGCTCAC | ATCTCTTGTCTCCCAAGAGAGCAC |
| RnOas1k | CGGAGGTTCCTGTATGTTTCTAGC | TGACAGACATGGTTGGAACAGAACC |
| RnOasl1 | CGTCCTGGAAGACTGGTTTGACTT | AGGAACTTGATACCCTACCTTGCC |
| RnOasl2 | CTGCACACTCAATGCACACCAGAT | TCAAACTCAAGCTTGCCGGATGAC |
| RnOas2 | CCTATGATGCACTAGGTCAGCTGC | TAGAAGATGCCAACACGAGCGGTC |
| RnOas3 | TCGATGCTCTTGGTCAGCTGAAGT | CTGTTGGTACCAGTGTTTCACCAG |

Sequence Alignment and Phylogenetic Analysis

Genomic sequences were searched for 2-5A synthetase genes and pseudogenes using the Blast program (Altschul et al. 1997). The Dotmap program (Zharkikh et al. 1991) was used to visually identify additional exons and pseudogenes. Although there are MmOas1b sequences from many mouse strains in GenBank, only the full-length murine Flv-C3H.PRI-Flvr Oas1b cDNA sequence (AF328926) was used for evolutionary comparisons (Table 3). Multiple alignments were performed with the Align program (Zharkikh et al. 1991), which is based on the Needleman and Wunsch (1970) algorithm. Manual editing was used to synchronize the positions of gaps between pairwise alignments.
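For readers unfamiliar with the Needleman and Wunsch (1970) algorithm named above, the following Python sketch shows a textbook global pairwise alignment. It is not the Align program used in this study; the function name and the scoring values (match +1, mismatch −1, linear gap penalty −2) are assumptions chosen only for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (aligned_a, aligned_b, score).

    Scoring parameters are illustrative assumptions, not values from the paper.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

A multiple alignment built from such pairwise alignments still needs the gap positions reconciled across pairs, which is the manual editing step described in the text.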

The phylogenetic tree was constructed using the njtree program with the distances calculated by the method of Li (1993) and tree topology was inferred using the neighbor-joining algorithm (Saitou and Nei 1987). The confidence of each node was estimated using a generic bootstrap algorithm (Zharkikh and Li 1995) with 1000 replications. The Unix version of this program is available from Dr. Zharkikh on request. One-parameter (Jukes and Cantor 1969), two-parameter (Kimura 1980), and three-parameter (Kimura 1981) methods were used to validate the tree topology.
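The one-parameter and two-parameter distance corrections mentioned above have simple closed forms, sketched below in Python for a pair of aligned sequences. This is not the njtree program, and it does not implement the Li (1993) method used for the main tree; the function names are hypothetical, and alignment columns containing gaps or ambiguous characters are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def count_differences(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    sites = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in BASES or y not in BASES:
            continue  # skip gaps and ambiguous characters
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


def jukes_cantor(seq_a, seq_b):
    """One-parameter distance: d = -(3/4) ln(1 - 4p/3), p = proportion of differing sites."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    p = (ts + tv) / sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq_a, seq_b):
    """Two-parameter distance: d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    P, Q = ts / sites, tv / sites  # transition and transversion proportions
    return -0.5 * math.log((1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q))
```

A matrix of such pairwise distances is the input to neighbor-joining; comparing trees built from different corrections is one way to check that the topology does not depend on the substitution model.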

Results and Discussion

Identification of a Second Murine Oas1 Pseudogene

Annotations of the mouse and rat genomes are currently available in both the NCBI and the Celera databases. Although the mouse Oas family has been extensively studied (Eskildsen et al. 2002; Ichii et al. 1986; Kakuta et al. 2002; Perelygin et al. 2002; Rutherford et al. 1991; Shibata et al. 2001; Smith et al. 2002; Tiefenthaler et al. 1999), a detailed analysis of the murine Oas genomic region was not previously performed. The identification of 10 MmOas genes, including 8 small genes (MmOas1a through MmOas1h), in the MmDtx1 to MmRph3a region of the murine genome was reported previously (Perelygin et al. 2002). The locations of individual MmOas genes in this region were determined using the bl2seq program and the modal distance between two adjacent MmOas genes was found to be 3 to 5 kb. However, in two instances unusually large intergenic regions were detected. One region was located in the MmOas1d–MmRph3a interval (~20 kb) that contained the MmOas1i pseudogene and a retrovirus-like element described previously (Kakuta et al. 2002). The other large intergenic region was located in the mouse MmOas1b–MmOas1f interval (~23 kb). A search of this region using the blastx program identified an additional 2-5A synthetase pseudogene, designated MmOas1j (Fig. 1), as well as an adjacent partial retrovirus-like element. The GenBank sequence XM_144550 corresponds to the murine MmOas1j pseudogene. In the human genome, gene duplications occur frequently in retroelement-rich chromosomal regions (Jurka et al. 2004). For example, retroelements have been associated with duplications of mammalian major histocompatibility complex and histone deacetylase genes (Khier et al. 1999; Kulski et al. 1999; Yang and Yu 2000). The identification of retrovirus-like elements within the mouse Oas gene cluster suggests that retroelement integration led to gene duplication in this region.

Rat 2-5A Synthetase Gene Family

To identify the rat Oas genomic region, the NCBI Rattus norvegicus database was searched using sequences from two rat 2-5A synthetases, NM_138913 and NM_144752, previously deposited in GenBank. These sequences aligned perfectly with regions of the rat chromosome 12 WGS supercontig NW_047376.

Table 3. Sequences of mammalian 2-5A synthetase genes used for the phylogenetic analysis

| Gene (rodents) | Mouse | Rat | Human | Horse | Cattle | Pig | Dog | Gene (other species) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Oas1a | NM_145211 | none | NM_016816 | AY321355 | NM_178108 (1x) | NM_214303 (1x) | AY863104 | OAS1 |
| Oas1b | AF328926 | NM_144752 | | | AY243505 (1y) | AY550259 (1y) | | |
| Oas1c | NM_033541 | NM_001009492 | | | NM_001029846 (1z) | | | |
| Oas1d | NM_133893 | NM_001009379 | | | | | | |
| Oas1e | NM_145210 | NA | | | | | | |
| Oas1f | NM_145153 | NM_001009490 | | | | | | |
| Oas1g | NM_011852 | AY221507 | | | | | | |
| Oas1h | NM_145228 | NM_001009491 | | | | | | |
| Oas1i | XM_144553 | NM_001009680 | | | | | | |
| Oas1j | XM_144550 | NA | | | | | | |
| Oas1k | NA | AY196696 | | | | | | |
| Oas2 | NM_145227 | NM_001009715 | NM_016817 | AY425674 | NM_001024557 | NM_001031796 | AY906957 | OAS2 |
| Oas3 | NM_145226 | NM_001009493 | NM_006187 | AY569128 | NA | NA | AY916055 | OAS3 |
| Oasl1 | NM_145209 | NM_001009681 | NM_003733 | AY463162 | AY271906 | AY594645 | AY906958 | OASL1 |
| Oasl2 | NM_011854 | NM_001009682 | NA | NA | NA | NA | AY906959 | OASL2 |

Fig. 1. Structures of mouse and rat 2-5A synthetase gene families. Functional genes and pseudogenes are indicated as black and gray arrows, respectively.

Using the bl2seq program, the sequence of this supercontig was aligned with known murine 2-5A synthetases and cDNA sequences for several additional rat 2-5A synthetase genes were predicted. A set of gene-specific primers (Table 1) was designed for each predicted gene and utilized to obtain full-length cDNA sequences by RACE.

The rat cDNA sequences obtained were used to search the GenBank rat database and several matches were found. The NM_144752 mRNA deposited previously in GenBank (Shimizu et al. 2003) was completely identical to the RnOas1b sequence obtained in the RACE experiments (Table 3). Another sequence obtained by RACE, RnOas1g (AY221507), was very similar but contained an additional exon of 27 bp compared to the NM_138913 mRNA submitted previously (Truve et al. 1993). The RnOas1d, RnOas1i, RnOas2, RnOasl1, and RnOasl2 sequences generated in this study partially matched the sequences XM_222193, XM_222192, XM_344120, XM_222230, and XM_222236, respectively, annotated in GenBank by automated computational analysis. The 5′ end of the RnOas3 sequence (AY250706) obtained by RACE was very similar to the MmOas3 (NM_145226) and identical to the beginning of the short rat sequence XM_344115 annotated in GenBank and mistakenly designated as rat Deltex1 (RnDtx1).

The majority of the rat cDNA sequences obtained aligned perfectly with the rat chromosome 12 WGS supercontig NW_047376, with the exception of the RnOas2 cDNA that aligned to supercontig NW_047389. Although the position of the RnOas2 gene could not be deduced with certainty from the NCBI database, this gene is most likely located between the RnDtx1 and RnOas3 loci (Fig. 1). Although RnOas2 is not included in the rat chromosome 12 WGS supercontig NW_047376, which contains the 3′ end of the RnDtx1 and the 5′ end of the RnOas3 loci, a complete copy of this gene is present in the other supercontig NW_047389, which also includes the 5′ end of RnDtx1, suggesting that there may be a deletion in this region in supercontig NW_047376. All of the identified RnOas genes, including RnOas2, were present within the single Celera rat genomic scaffold CH473973. The sequences of two pseudogenes, designated RnOas1j and RnOas1l, were also identified in this scaffold. Using the rat gene order predicted by the Celera database, the relationships between murine and rat Oas and Oasl gene clusters were established according to maximal sequence similarities (Fig. 1). In both mouse and rat genomes, there are 10 Oas genes and 2 Oas-like genes as well as 2 Oas pseudogenes. While most of the murine genes have orthologues in the rat genome, the MmOas1a and MmOas1e genes are unique to mouse, and the RnOas1k gene and RnOas1l pseudogene are unique to rat. The Oas1j pseudogene is present in both the mouse and the rat genomes. In contrast, the orthologue of the rat RnOas1i gene is the MmOas1i pseudogene in mouse.

The exon/intron structures of each of the rat Oas genes were determined by alignment of the Celera rat genomic sequence CH473973 with the various individual rat cDNA sequences obtained. Comparison of the structures of the mouse and rat Oas orthologues showed that the total number of exons is conserved (Table 4). There are 6 or, rarely, 7 exons in the rodent Oas1 and Oas-like genes. The rodent Oas2 and Oas3 genes have 12 and 16 exons, respectively. Although the lengths of corresponding exons are usually similar in mouse and rat orthologous Oas genes, there are several exceptions. The 12-bp deletion that was previously found in the MmOas1b gene (Perelygin et al. 2002) was not observed in the second exon of the RnOas1b gene (Table 4). However, a similar deletion is present in the second exon of the RnOas1f gene. Due to this deletion, both the mouse MmOas1b and the rat RnOas1f proteins lack four amino acids in a helical turn that interacts with the ATP substrate according to predictions made from a crystal structure of a porcine OAS1 protein (Hartmann et al. 2003). The MmOas1b locus was previously identified as the murine flavivirus resistance (Flv) gene and the deletion in the second exon was proposed as an important feature of this gene (Brinton and Perelygin 2003; Perelygin et al. 2002). The lengths of the fifth exon of Oas1b, the first exon of Oas1h, both the fifth and the sixth exons of Oasl2, and both the first and the second exons of Oas2 differ between the mouse and rat genomes (Table 4).
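The exon-by-exon comparison described above can be reproduced directly from the lengths listed in Table 4. The Python sketch below is only an illustration: the dictionary holds the first six exon lengths of four genes copied from Table 4, and the helper names `differing_exons` and `codons_removed` are hypothetical.

```python
# Lengths (bp) of exons 1-6, copied from Table 4 for four rodent genes.
EXON_LENGTHS = {
    "MmOas1b": [180, 277, 185, 233, 240, 102],
    "RnOas1b": [180, 289, 185, 233, 154, 99],
    "MmOas1f": [183, 289, 197, 233, 154, 39],
    "RnOas1f": [180, 277, 197, 233, 154, 39],
}


def differing_exons(lengths_a, lengths_b):
    """Return (exon number, length in a, length in b) for exons that differ."""
    return [
        (number, len_a, len_b)
        for number, (len_a, len_b) in enumerate(zip(lengths_a, lengths_b), start=1)
        if len_a != len_b
    ]


def codons_removed(reference_bp, observed_bp):
    """Return how many whole codons an in-frame deletion removes, else None."""
    lost = reference_bp - observed_bp
    if lost <= 0 or lost % 3 != 0:
        return None  # not a deletion, or not a multiple of three bases
    return lost // 3


if __name__ == "__main__":
    for mouse, rat in (("MmOas1b", "RnOas1b"), ("MmOas1f", "RnOas1f")):
        print(mouse, rat, differing_exons(EXON_LENGTHS[mouse], EXON_LENGTHS[rat]))
```

With these values, the second exon is 277 bp in MmOas1b and RnOas1f against 289 bp in their counterparts, a 12-bp difference that corresponds to the four missing amino acids noted in the text.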

Tissue Expression Patterns of Rat 2-5A Synthetase Genes

As another means of validating the predicted orthologous relationships between the murine and the rat Oas loci, the tissue expression pattern of each of the RnOas1 and RnOasl genes was assayed by PCR using a normalized rat multiple tissue cDNA panel as described under Materials and Methods (Fig. 2). The tissue expression patterns of the mouse MmOas1 and MmOasl genes were reported previously (Kakuta et al. 2002; Mashimo et al. 2003) and the expression patterns of orthologous rat and mouse 2-5A synthetase mRNAs were compared (Table 5). In general, the expression levels of the individual Oas genes were very similar in mouse and rat with the exception of the Oas1h gene, which is expressed at detectable levels in many rat tissues but in mouse is expressed only in testis. Variation observed in the expression levels of the Oas1b, Oas1d, Oas1g, and Oasl1 genes in various tissues between the two rodent species could be due to actual differences as well as differences in PCR amplification efficiencies.

Bovine and Porcine 2-5A Synthetase Gene Families

The bovine sequence, AB104656 (designated BtOAS1X), recently released in GenBank was utilized to identify additional bovine OAS genes. A search of the GenBank bovine EST database with BtOAS1X detected several similar but not identical sequences. Two nonoverlapping bovine EST sequences, BM363371 and BE750543, aligned well with the equine EcOAS1 full-length cDNA sequence AY321355 obtained previously (Perelygin et al. 2005). Based on this alignment, these EST sequences were predicted to represent the 5′ and 3′ parts of a second bovine OAS1 gene designated BtOAS1Y. Forward and reverse PCR primers designed from the two bovine EST sequences were used to amplify and sequence full-length BtOAS1Y cDNA (AY243505). Two additional bovine ESTs, BE722088 and CK772576, contained exons 4 to 6 as well as the 3′ part of intron 5 of the BtOAS1Z gene, while the BE752306 EST sequence corresponded to the 3′ end of the BtOAS2 gene. A partial sequence of a BtOASL gene was obtained by sequencing the cDNA clones AW357641 and BE476320. The 5′ sequences of the BtOAS1Z, BtOAS2, and BtOASL genes were extended using a DNA Walking SpeedUp Kit. The full-length sequences obtained were submitted to GenBank under accession numbers AY650038, AY599197, and AY271906, respectively (Table 3).

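Table 4. Length (bp) of exons within ORFs of rodent 2-5A synthetase genes

| Gene name / exon number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MmOas3 | 174 | 283 | 176 | 236 | 154 | 507 | 280 | 176 | 251 | 145 | 174 | 286 | 176 | 239 | 148 | 12 |
| RnOas3 | 174 | 283 | 176 | 236 | 154 | 504 | 280 | 176 | 251 | 145 | 174 | 286 | 176 | 239 | 148 | 12 |
| MmOas2 | 315 | 247 | 179 | 239 | 145 | 174 | 283 | 188 | 239 | 154 | 93/0ᵃ | 0/66ᵃ | | | | |
| RnOas2 | 282 | 255 | 179 | 239 | 142 | 174 | 283 | 191 | 239 | 154 | 93/0ᵃ | 0/63ᵃ | | | | |
| MmOas1a | 183 | 289 | 185 | 233 | 154 | 27 | 33 | | | | | | | | | |
| MmOas1b | 180 | 277 | 185 | 233 | 240 | 102 | 0 | | | | | | | | | |
| RnOas1b | 180 | 289 | 185 | 233 | 154 | 99 | 0 | | | | | | | | | |
| MmOas1c | 183 | 289 | 197 | 236 | 154 | 30 | 0 | | | | | | | | | |
| RnOas1c | 183 | 289 | 197 | 233 | 154 | 30 | 0 | | | | | | | | | |
| MmOas1d | 183 | 289 | 197 | 233 | 154 | 30 | 0 | | | | | | | | | |
| RnOas1d | 183 | 289 | 197 | 236 | 154 | 30 | 0 | | | | | | | | | |
| MmOas1e | 183 | 289 | 197 | 233 | 154 | 15/105ᵃ | 0 | | | | | | | | | |
| MmOas1f | 183 | 289 | 197 | 233 | 154 | 39 | 0 | | | | | | | | | |
| RnOas1f | 180 | 277 | 197 | 233 | 154 | 39 | 0 | | | | | | | | | |
| MmOas1g | 183 | 289 | 185 | 233 | 154 | 27 | 33 | | | | | | | | | |
| RnOas1g | 183 | 289 | 185 | 233 | 154 | 27 | 33 | | | | | | | | | |
| MmOas1h | 204 | 289 | 197 | 236 | 154 | 30 | 0 | | | | | | | | | |
| RnOas1h | 183 | 289 | 197 | 233 | 154 | 30 | 0 | | | | | | | | | |
| RnOas1i | 183 | 289 | 185 | 233 | 154 | 105 | 0 | | | | | | | | | |
| RnOas1k | 183 | 289 | 185 | 233 | 154 | 24 | 0 | | | | | | | | | |
| MmOasl1 | 198 | 283 | 164 | 242 | 148 | 501 | 0 | | | | | | | | | |
| RnOasl1 | 198 | 283 | 167 | 242 | 148 | 501 | 0 | | | | | | | | | |
| MmOasl2 | 198 | 289 | 176 | 245 | 142 | 372 | 0 | | | | | | | | | |
| RnOasl2 | 198 | 289 | 176 | 242 | 157 | 474 | 0 | | | | | | | | | |

ᵃ Alternatively spliced transcripts.

Fig. 2. Tissue-specific expression of rat 2-5A synthetase genes.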

The bovine cDNA sequences obtained were used to search the GenBank bovine BAC ends database. The sequence adjacent to the SP6 promoter of the bovine BAC clone CH240_433I24 (CC549202) was found to represent the 3′ end of the BtOAS2 gene (Fig. 3). The sequence located close to the SP6 promoter in an additional bovine BAC clone, CH240_49I16 (BZ931675), was previously identified (Larkin et al. 2003) as a part of the BtRPH3A gene (NM_174454) that is closely linked to the 2-5A synthetase genes in mouse, rat, chimpanzee, and human (www.ensembl.org). The CH240_49I16 BAC clone was partially sequenced with the same bovine primers utilized for cDNA amplification. In addition to the BtRPH3A gene, this clone contained the bovine BtOAS1X, BtOAS1Y, BtOAS1Z, and BtOAS2 genes.

A sequence derived from intron 3 of the BtOAS1Z gene was identical to the sequence (in the opposite orientation) located adjacent to the T7 promoter in the bovine BAC clone CH240_117E17. Sequence analysis showed that this clone contained only the BtRPH3A and BtOAS1Z genes (Fig. 3). When an intergenic region of 5.5 kb located between the BtOAS1Y and the BtOAS2 genes was amplified from the CH240_49I16 clone and sequenced directly (AY780549), a BtOAS3 gene was not detected. Based on these data, the most likely bovine gene order is BtRPH3A–BtOAS1Z–BtOAS1X–BtOAS1Y–BtOAS2 (Fig. 3).

The sequence of the porcine SsOAS1X mRNA (AJ225090) was previously deposited in GenBank (Hartmann et al. 1998). A blast search of the pig EST GenBank database with full-length human 2-5A synthetase cDNA sequences revealed several matches. Two overlapping sequences, BP160622 and BG383212, contained the complete cDNA sequence of a second porcine OAS1 gene designated SsOAS1Y. A pair of PCR primers was designed from these EST sequences and utilized to amplify and sequence SsOAS1Y cDNA (AY550259). Two additional overlapping cDNA clones, BF199347 and BX914596, were also identified, purchased, and sequenced to obtain a partial SsOAS2 sequence. Three EST sequences detected by blast search, CB285790, CF722880, and CJ009960, corresponded to the 5′ end of the porcine SsOASL cDNA. Partial SsOAS2 and SsOASL sequences were extended using a DNA Walking SpeedUp Kit and sequenced. The full-length sequences obtained were submitted to GenBank under accession numbers AY288913 and AY594645, respectively (Table 3).

Fig. 3. BAC clone alignment, exon/intron structures, and proposed order of the bovine BtRPH3A, BtOAS1X, BtOAS1Y, BtOAS1Z, and BtOAS2 genes.

Table 5. Tissue-specific expression levels of rodent 2-5A synthetase genes

Gene   Species   Brain  Heart  Kidney  Liver  Lung   Muscle  Spleen  Testis
Oas1a  Mouse(a)  −      +      +       +      ++     +       ++      +
       Rat       NA(c)  NA(c)  NA(c)   NA(c)  NA(c)  NA(c)   NA(c)   NA(c)
Oas1b  Mouse(a)  +      +      +       ++     +      −       ++      −
       Rat       ++     ++     +       +++    +      +       +++     −
Oas1c  Mouse(a)  +      −      −       −      +      −       +       +
       Rat       +      −      −       −      −      −       +       +
Oas1d  Mouse(a)  −      −      +       −      +      −       ++      −
       Rat       −      −      −       −      −      −       +       +
Oas1e  Mouse(a)  −      −      −       +      +      −       ++      −
       Rat       NA(c)  NA(c)  NA(c)   NA(c)  NA(c)  NA(c)   NA(c)   NA(c)
Oas1f  Mouse(b)  +      ?      +       +      +      ?       ?       +
       Rat       +      +      +       ++     +      +       +       +
Oas1g  Mouse(a)  −      +      ++      ++     ++     +       ++      −
       Rat       +      +      +       ++     +      +       ++      −
Oas1h  Mouse(a)  −      −      −       −      −      −       −       +
       Rat       +      +      −       +      ++     +       ++      +
Oas1i  Mouse     NA(c)  NA(c)  NA(c)   NA(c)  NA(c)  NA(c)   NA(c)   NA(c)
       Rat       +      +      +       +      +      +       ++      +
Oas1k  Mouse     NA(c)  NA(c)  NA(c)   NA(c)  NA(c)  NA(c)   NA(c)   NA(c)
       Rat       −      +      +       +      +      +       ++      +
Oasl1  Mouse(a)  −      +      +       ++     +      +       ++      +
       Rat       −      −      −       ++     +      +       ++      ++
Oasl2  Mouse(a)  +      +      +       +      ++     +       ++      +
       Rat       +      +      +       ++     ++     +       ++      +

(a) Data from Kakuta et al. 2002.
(b) Data from Mashimo et al. 2003. The question marks indicate no data available.
(c) No gene present.


Although the lengths of the coding regions of bovine and porcine OAS1 and OAS2 genes are generally similar to those of orthologous genes in humans and rodents, the bovine BtOASL and porcine SsOASL genes contain premature stop codons and encode truncated proteins that do not contain C-terminal ubiquitin domains. For each of these species, mRNA and genomic DNA derived from two or more unrelated animals of various breeds were sequenced. Two independent cDNA clones, AW357641 (MARC 3BOV library) and BE476320 (BARC 5BOV library), lung and spleen poly(A)+ RNA (BD Biosciences), a spleen cDNA (Seegene), and genomic DNA (Seegene) were used to confirm the premature stop codon in the bovine OASL sequence. The same premature stop codon is also present in the independently obtained bovine genomic contig NW_929228. Pooled cDNA (see Materials and Methods) and independently obtained genomic DNA (Seegene) were utilized to confirm the premature stop codon in the porcine OASL gene. An analogous stop codon was also found in five independent porcine cDNA clones, BF712605, BQ598446, CB468964, CO942090, and CO992433. The 3′ ends sequenced for both the bovine and the porcine OASL genes were derived by primer walking on bulk cDNAs, suggesting that the majority of the OASL mRNAs contain the premature stop codon. If any alternative transcripts are present in these cDNA populations, they are present at a very low concentration since they were not detectable either by primer walking or by sequencing several clones obtained from independent cDNA libraries. However, it is possible that some individuals or breeds might not contain a premature stop codon in their OASL gene sequence. A premature stop codon in the MmOas1b gene is a common feature of the majority of laboratory inbred mouse strains, which were derived from a small number of progenitors. Mice in wild populations exhibit a high frequency of the allele that encodes the full-length MmOas1b protein (Brinton and Perelygin 2003).

The identification of multiple OAS1 genes in two species of the order Artiodactyla demonstrates that the order Rodentia is not the only mammalian lineage in which Oas1 gene duplication has occurred. Although the pig and cattle genomes contain single copies of the OAS2 and OASL genes, no traces of an OAS3 gene have yet been detected.

Equine and Canine 2-5A Synthetase Gene Families

The structure of the equine 2-5A synthetase gene family was described elsewhere (Perelygin et al. 2005). To date only a single copy of each of the EcOAS1, EcOAS2, EcOAS3, and EcOASL genes has been detected, suggesting that the horse OAS gene family is similar to the gene family in humans.

Recently, two independent assemblies of the dog whole genome were released in GenBank (NCBI) and Genome Browser (UCSC). A search of the UCSC Genome Browser with human OAS coding sequences identified two OAS gene-containing regions in the dog genome. One was the CfDtx1–CfRph3a region on chromosome 26 (CFA26; 13,500 to 13,700 kb in UCSC Genome Browser) that contained the genes CfRPH3A–CfOAS1–CfOAS3–CfOAS2–CfDTX1 in the direct transcription orientation. Another region on CFA26 (119,850 to 120,000 kb in UCSC Genome Browser) contained the CfTCF1 gene in the direct transcription orientation and the CfOASL1 and CfOASL2 genes in the opposite transcription orientation. The number and location of the CfOAS genes and the flanking genes are identical to those in the human genome with the exception of the human OASL2 gene, which has become a pseudogene. The NCBI assembly also predicted the CfOASL1 and CfOASL2 genes on CFA26. Gene-specific primer pairs (Table 1) were designed from corresponding canine genomic sequences and were utilized to amplify and sequence the CfOAS1, CfOAS2, CfOAS3, CfOASL1, and CfOASL2 cDNAs. The sequences obtained were submitted to GenBank under the accession numbers listed in Table 3. Multiple forms of 2-5A synthetases were previously detected in dog serum by Western blotting (Iwata et al. 2004).

The canine 2-5A synthetase gene family is similar to the human gene family in having single copies of the OAS1, OAS2, and OAS3 genes. However, similarly to rodents, the dog genome contains two OASL genes. In contrast to the OASL genes of even-toed ungulates, which appear to encode truncated OASL proteins, neither the equine nor the canine OASL genes have a premature stop codon, and each encodes a full-length protein with two ubiquitin domains at the C-terminus.

Phylogeny of Rodent Oas1 Genes

The 5′ portions of all rodent Oas1 genes are similar to each other and consist of five exons (Table 4). Many of the Oas1 genes have only one additional 3′ exon that varies in size. However, only the rodent Oas1a and Oas1g genes have two additional short exons, 6 and 7. The alignment of rodent Oas1 sequences (data not shown) revealed complex relationships between their 3′-terminal exons. Exon 6 of the RnOas1k gene corresponds to exon 6 of the MmOas1a gene. In the rodent genes, Oas1c, Oas1d, Oas1e, Oas1f, and Oas1h, exon 6 corresponds to exon 7 of the MmOas1a gene. In the rodent genes, Oas1b and Oas1i, the long exon 6 is not similar to the 3′ regions of the other Oas1 genes. The mouse MmOas1i pseudogene consists of partial exon 4 and complete exons 5 and 6, the rat RnOas1l pseudogene is represented by only a partial exon 6, while the pseudogene Oas1j consists of exons 4 and 5 in both rat and mouse. The RnOas1k genomic sequence contains an additional copy of exon 4, which is probably a pseudoexon because of mutations in its splice sites and a premature stop codon in frame when this pseudoexon is included in the mRNA.

The coding regions of eight mouse MmOas1 genes and eight rat RnOas1 genes were aligned and the alignment obtained was used to build a phylogenetic tree as described under Materials and Methods. Although several additional distance-based methods that differ in the number of parameters analyzed were used to infer the tree topology, only the one constructed using the njtree program is shown in Fig. 4. Cattle, dog, human, horse, and pig OAS1 cDNA sequences were used as outgroups. The mouse and rat cDNA sequences of the Oas1b, Oas1f, and Oas1h genes paired with each other in the phylogenetic tree generated, which supports their orthologous relationships. The rat RnOas1k gene appears to have no mouse counterpart. A search for a corresponding mouse gene between MmOas1e and MmOas3 in the mouse Oas gene cluster (Fig. 1) did not reveal any homologous sequences. The unique position of the RnOas1k gene on the phylogenetic tree and the absence of an orthologue in the corresponding mouse genomic sequence suggest that this gene was deleted in the mouse genome. In contrast, the mouse gene cluster contains two extra genes, MmOas1a and MmOas1e, that have recently arisen in the mouse genome. The MmOas1c and MmOas1e genes are orthologues of RnOas1c, while both MmOas1a and MmOas1g are orthologues of RnOas1g.

However, two parts of the tree are in discordance with the map of the rodent Oas gene clusters (Fig. 1). First, the rodent Oas1c and Oas1d genes are separated from each other on the map by six other genes and show a good orthologous correspondence between mouse and rat. However, comparison of their sequences showed that the distances between paralogous rodent Oas1c and Oas1d genes in each species are smaller than the distances between the corresponding orthologues. The mouse gene MmOas1e can be considered a recent duplication of the ancestral rodent Oas1c gene. Second, a similar relationship is found for the rat RnOas1g and RnOas1i genes. The mouse genome contains only a partial sequence of the MmOas1i pseudogene, which is unsuitable for phylogenetic comparisons. On one hand, the phylogenetic tree shows (Fig. 4) that the RnOas1g gene is more similar to RnOas1i than to MmOas1g. On the other hand, the RnOas1i gene has an exon structure similar to that of the rodent Oas1b genes (Table 4) and does not contain exon 7, which is present in rodent Oas1g. All these data suggest a common origin for the RnOas1b and RnOas1i genes but not for the RnOas1g and RnOas1i genes. Therefore, in many cases the similarities between rodent Oas1 orthologous cDNA sequences are lower than those between Oas1 paralogous sequences within each species. The observed deviation from the principle of divergent evolution, which assumes that more recent splits between genes produce more similar sequences, supports concerted evolution of paralogous Oas1 genes in rodents.
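The comparison described above, in which paralogues within a species turn out closer than orthologues between species, can be illustrated with a minimal Python sketch. It uses a simple uncorrected p-distance rather than the distance methods named in Materials and Methods, and the species labels, gene labels, and sequences are hypothetical toy values, not data from this study.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ("-") in either sequence are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def paralogs_closer_than_orthologs(alignment, species_a, species_b, gene_x, gene_y):
    """True when both within-species (paralogous) distances are smaller
    than both between-species (orthologous) distances."""
    paralog = [p_distance(alignment[(sp, gene_x)], alignment[(sp, gene_y)])
               for sp in (species_a, species_b)]
    ortholog = [p_distance(alignment[(species_a, g)], alignment[(species_b, g)])
                for g in (gene_x, gene_y)]
    return max(paralog) < min(ortholog)


# Hypothetical toy alignment: (species, gene) -> aligned sequence.
toy_alignment = {
    ("sp1", "geneX"): "ATGGCCAAGT",
    ("sp1", "geneY"): "ATGGCCAAGA",
    ("sp2", "geneX"): "ATGACTAAGT",
    ("sp2", "geneY"): "ATGACTAAGA",
}

print(paralogs_closer_than_orthologs(toy_alignment, "sp1", "sp2", "geneX", "geneY"))  # True
```

In the toy data each paralogous pair differs at one site in ten, while each orthologous pair differs at two, which is the pattern expected when interlocus exchange homogenizes paralogues after speciation.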

With an increasing number of parameters the statistical significance of the tree decreases for the data set. The difference in the substitution rates between transitions and transversions as well as the differences between synonymous and nonsynonymous substitutions are major factors affecting the topology of the tree. Other factors are minor and usually do not change the tree topology. The topologies obtained using several methods varied only in minor details. These differences were observed primarily where the distance-based methods show low bootstrap values (data not shown). Therefore, the utilization of other methods did not significantly change the tree topology.

In the trees built from the intron sequences (Supplemental Figs. 1–3), the sequences were clustered according to species instead of by the actual orthologous relationship. Since the existing phylogenetic methods are all based on the hypothesis of divergent evolution of sequences, none of these methods is able to infer the correct tree if the degree of gene conversion is high enough to destroy the divergent pattern of gene evolution. Only by using additional information from the chromosomal gene order (Fig. 1) could the actual phylogenetic order be suggested.

Fig. 4. Phylogenetic tree of mammalian 2-5A synthetase 1 genes.

Concerted Evolution of Paralogous Oas1 Genes in Rodentia and Artiodactyla

An independent comparison of intron sequences provided even stronger evidence for concerted evolution of paralogous Oas1 genes. In the intron phylogenetic trees (Supplemental Figs. 1–3), two large mouse Oas1 clusters, MmOas1a-related genes (MmOas1a, MmOas1g, and MmOas1b) and MmOas1c-related genes (MmOas1c, MmOas1e, MmOas1d, MmOas1h, and MmOas1f), are completely separated from two corresponding groups of rat sequences, RnOas1g-related genes (RnOas1g, RnOas1b, and RnOas1i) and RnOas1c-related genes (RnOas1c, RnOas1d, RnOas1h, and RnOas1f). Sequence similarities within each of these groups of paralogous genes were higher than those between the corresponding groups of orthologous genes in these two species.

Both exon and intron sequences showed multiple interlocus exchanges between various genes. These exchanges may distort the divergent pattern of Oas1 gene evolution to the extent that identification of orthologous genes becomes incorrect. The genomic location of Oas1 genes appears to be more conserved and allows the establishment of the orthologous correspondence between mouse and rat genes. This correspondence can be used to estimate the number of convergent substitutions between paralogous genes. The total numbers of convergent changes between different mouse branches and between different rat branches are 151 and 159, respectively. However, between the mouse and the rat branches there are only 111 changes, which is far fewer than expected from a uniform distribution of convergent changes among branches. The frequency of exchanges is related to the evolutionary distance between loci. For example, between the Oas1c/Oas1e lineage and the Oas1d, Oas1h, and Oas1f lineages, there are 30, 17, and 11 convergent changes, respectively; between the Oas1d lineage and the Oas1h and Oas1f lineages, there are 22 and 14 convergent changes, respectively. In intron sequences, which are mostly not under natural selection, the interlocus exchanges are even more frequent. In addition to single nucleotide substitutions, there is an abundance of deletions and insertions. Unlike the nucleotide substitutions, the probability of an independent occurrence of insertions and deletions of exactly the same size and position is quite low. If identical deletions or insertions are found in different genes, they most probably have a common origin. Considering only the most reliable parts of the intron alignment, we counted 82 insertions and deletions shared by two or more genes. Among them, 41 unique gaps supported the correct tree topology. Eleven gaps in mouse lineages and 30 gaps in rat lineages were shared by unrelated tree branches and therefore most probably originated by interlocus exchanges. The largest number of gaps was shared by the RnOas1c and RnOas1d genes (seven gaps), by the RnOas1c, RnOas1d, and RnOas1h genes (seven gaps), by the RnOas1d, RnOas1h, and RnOas1f genes (five gaps), and by the MmOas1c and MmOas1d genes (three gaps).
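The counting of insertions and deletions shared by two or more genes can be sketched as follows. This is only an illustration under simplifying assumptions: a gap is treated as "identical" when it has exactly the same start column and length in the alignment, and the gene names and sequences are hypothetical toy values, not the intron alignment analyzed here.

```python
from collections import defaultdict


def gap_intervals(aligned_seq, gap_char="-"):
    """Return (start, length) for every run of gap characters in one aligned sequence."""
    gaps = []
    start = None
    for i, ch in enumerate(aligned_seq):
        if ch == gap_char:
            if start is None:
                start = i
        elif start is not None:
            gaps.append((start, i - start))
            start = None
    if start is not None:  # gap run reaching the end of the sequence
        gaps.append((start, len(aligned_seq) - start))
    return gaps


def shared_gaps(alignment):
    """Map each gap (start, length) to the sorted genes carrying it,
    keeping only gaps found in two or more genes."""
    carriers = defaultdict(set)
    for gene, seq in alignment.items():
        for gap in gap_intervals(seq):
            carriers[gap].add(gene)
    return {gap: sorted(genes) for gap, genes in carriers.items() if len(genes) >= 2}


# Hypothetical toy alignment: gene name -> aligned sequence.
toy_alignment = {
    "geneA": "ACGT---ACGTAC",
    "geneB": "ACGT---ACGTAC",
    "geneC": "ACGTTTTAC--AC",
    "geneD": "ACGTTTTACGTAC",
}

print(shared_gaps(toy_alignment))  # {(4, 3): ['geneA', 'geneB']}
```

Each shared gap would then be checked against the gene tree: a gap carried by genes from unrelated branches is a candidate product of interlocus exchange. Note that an insertion in a single gene appears in an alignment as a gap shared by all the other genes, so such cases need separate handling.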

Three OAS1 genes (BtOAS1X, BtOAS1Y, and BtOAS1Z) were identified in the bovine genome and two OAS1 genes (SsOAS1X and SsOAS1Y) were found in the porcine genome. The map positions of these porcine genes have not yet been determined. Therefore, we cannot establish the orthologous relationships between the bovine and the porcine OAS1 genes. In the phylogenetic tree (Fig. 4), 2-5A synthetase 1 genes from different species form two independent clusters. Human, canine, and equine genes are in one cluster, while the OAS1 genes of even-toed ungulates are in a second cluster. The number of convergent substitutions in exons between the bovine and the porcine branches does not exceed the random background level (two between SsOAS1X and BtOAS1X, five between SsOAS1Y and BtOAS1Y, one between SsOAS1X and BtOAS1Y, and one between SsOAS1Y and BtOAS1X). Although the convergence between corresponding genes X/X and Y/Y (seven substitutions) exceeds the convergence between different genes X/Y (two substitutions), the difference is not large enough to reject the hypothesis that independent duplications of OAS1 genes occurred in the bovine and porcine genomes. However, comparison of the substitution distributions along the nucleotide sequence in different evolutionary lineages can unambiguously demonstrate the presence of gene conversion. A simple approach was developed for such a comparison to allow quantification of the differences between distributions of substitutions in probabilistic terms.

The main assumption of our approach is that the substitution rate at each particular nucleotide position does not change among the genes and species being considered. However, different nucleotide positions may have different substitution rates. For each pair of genes, the cumulative distribution of differences can be built by assigning to each position i the number of differences N_i located to the left of this position normalized by the total number of differences N observed between the two genes, i.e., F_i = N_i/N. The distribution starts with F = 0 at the first position and reaches F = 1 at the end of the sequences. The distributions for two pairs of sequences are then compared by a nonparametric Kolmogorov-Smirnov (KS) test, which estimates the maximum difference between the two distributions in terms of probability. The distributions are expected to be similar when there are no evolutionary events or restrictions that differentially affect the occurrence of sequence changes in different lineages.
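The construction of F_i = N_i/N and the KS comparison can be sketched in Python as below. The text does not state how the KS probability was computed, so the p-value here is an assumption: it uses the standard asymptotic two-sample approximation with the numbers of differences in each pair as the sample sizes. Function names are illustrative only.

```python
import math


def difference_positions(seq1, seq2):
    """Return 0-based columns where two aligned sequences differ (gapped columns skipped)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq1, seq2))
            if a != "-" and b != "-" and a != b]


def cumulative_distribution(diff_positions, length):
    """Return F with length + 1 entries, where F[i] = N_i / N.

    N_i is the number of differences in columns to the left of column i,
    so F[0] = 0 and F[length] = 1.
    """
    total = len(diff_positions)
    if total == 0:
        raise ValueError("the pair has no differences")
    diff_set = set(diff_positions)
    F, seen = [0.0], 0
    for column in range(length):
        if column in diff_set:
            seen += 1
        F.append(seen / total)
    return F


def ks_compare(F1, F2, n1, n2):
    """Return (D, p): the maximum |F1 - F2| and an asymptotic two-sample KS p-value.

    n1 and n2 are the total numbers of differences behind F1 and F2
    (assumed here to play the role of sample sizes).
    """
    D = max(abs(a - b) for a, b in zip(F1, F2))
    n_eff = math.sqrt(n1 * n2 / (n1 + n2))
    lam = (n_eff + 0.12 + 0.11 / n_eff) * D
    if lam < 0.2:  # the alternating series converges slowly here; p is ~1
        return D, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return D, min(1.0, max(0.0, p))


# Toy example: differences concentrated at opposite ends of a 10-column alignment.
F_first = cumulative_distribution([0, 1], 10)
F_second = cumulative_distribution([8, 9], 10)
D, p = ks_compare(F_first, F_second, 2, 2)
print(D)  # 1.0
```

With real data, `difference_positions` would be applied to each aligned gene pair and the two resulting distributions compared over the same alignment columns.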

If a large portion of the sequence is exchanged between different loci, the distribution of differences in this portion will deviate locally and the exchange can be detected by comparing the distributions for different pairs of sequences. The comparison of two distributions, F1 and F2, can be shown graphically as a curve in coordinates (F1, F2). If the distributions are similar, the curve will stay close to the diagonal. The three curves shown in Fig. 5A represent the BtOAS1Y/BtOAS1Z, BtOAS1X/BtOAS1Z, and BtOAS1X/BtOAS1Y pairs. For the first pair, deviation from the diagonal is not significant since the KS test probability is p = 0.512. The second pair shows a larger deviation (p = 0.069) and a higher substitution rate in the proximal half of the gene compared to the distal half. For the third pair of genes, deviation from the diagonal is statistically highly significant (p = 4.9 × 10⁻⁹). This pair has a low substitution rate in the first half of the sequence and a significantly higher rate in the second half. For the second and third pairs of sequences, the rate switch point is located approximately in the same region. The best estimate predicted the largest difference (0.47) between the two distributions to be at position 500 of the BtOAS1X coding sequence. Therefore, BtOAS1X cDNA is more similar to BtOAS1Y cDNA before position 500, but the BtOAS1X sequence is more similar to BtOAS1Z after this position. The simplest model explaining this observation would assume that BtOAS1X originated by unequal crossing-over between genes BtOAS1Z and BtOAS1Y and then accumulated further differences.
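The way a switch point is read off as the position of the largest deviation between two cumulative distributions can be illustrated with a small simulation. The sequences below are hypothetical toy strings, not OAS1 data: a chimera is assembled from two "parent" sequences at a known column, and the sketch checks that the largest deviation falls at that column. The parent pair stands in for the reference pair; this is an assumption made for illustration.

```python
def cumulative_differences(seq1, seq2):
    """F[i] = fraction of all differences between two aligned sequences
    that lie to the left of column i (F[0] = 0, F[len] = 1)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    flags = [a != b for a, b in zip(seq1, seq2)]
    total = sum(flags)
    if total == 0:
        raise ValueError("the pair has no differences")
    F, seen = [0.0], 0
    for flag in flags:
        seen += flag
        F.append(seen / total)
    return F


def largest_deviation(F1, F2):
    """Return (position, deviation) for the first column where |F1 - F2| is largest."""
    position = max(range(len(F1)), key=lambda i: abs(F1[i] - F2[i]))
    return position, abs(F1[position] - F2[position])


# Hypothetical toy sequences.
parent_y = "A" * 20
parent_z = "AC" * 10                     # differs from parent_y at every odd column
chimera = parent_y[:10] + parent_z[10:]  # simulated exchange at column 10

position, deviation = largest_deviation(
    cumulative_differences(chimera, parent_y),   # differences only after column 10
    cumulative_differences(parent_y, parent_z),  # differences spread evenly
)
print(position, deviation)  # 10 0.5
```

Because the chimera matches one parent up to the exchange point and the other parent afterwards, its differences from either parent are confined to one side of that point, which is what pulls the curve away from the diagonal.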

A similar analysis was performed for the porcine genes, SsOAS1X and SsOAS1Y (Fig. 5B). The distribution of differences between the two porcine genes deviates by 0.47 at position 940 (p = 1.4 × 10⁻⁹), which can be explained by a recent transfer of a large portion of the sequence from one gene to another. The distribution of the differences between SsOAS1X and BtOAS1Z does not differ from that between EcOAS1 and BtOAS1Y (Figs. 5A and B).

The presence of interlocus transfers between duplicated copies of genes violates the major principle of phylogenetic inference, i.e., the principle of divergence of related sequences. The resulting phylogenetic tree will be biased toward joining the paralogous genes. In cases where the amount of divergence is sufficient to correctly join the orthologous genes, the exchanges between paralogous genes may result in decreased confidence values as reflected by low bootstrap values for these joins.

Gene conversion is the major mechanism by which interlocus exchanges occur. This mechanism has been reported to occur in many multigene families (Annilo et al. 2003; Archibald and Roger 2002a,b; Bettencourt and Feder 2002; Desjardins et al. 2002; Drouin 2002a,b; Higgs et al. 1984; Israel et al. 2002; Lazzaro and Clark 2001; Nagawa et al. 2002; Zhao et al. 1998). For successful exchange to occur between paralogous genes, the genes should be located close to each other in the genome. Genes located on different chromosomes or separated by more than one megabase on the same chromosome are prevented from undergoing gene conversion (Lazzaro and Clark 2001; Rooney et al. 2002). The opposite orientation of genes does not prevent gene conversion. As seen within the rodent Oas1 gene cluster, there was an exchange between the rodent Oas1c and Oas1d genes, which are separated by several other genes and have opposite orientations within the cluster.

Fig. 5. Distribution of exchanges between (A) bovine BtOAS1X, BtOAS1Y, and BtOAS1Z genes and between (B) porcine SsOAS1X and SsOAS1Y genes. The horizontal axis represents the distribution values for the sequence pair (A) EcOAS1/SsOAS1X or (B) EcOAS1/BtOAS1Y and the vertical axis shows the distribution values for a sequence pair.

Evolution of 2-5A Synthetase Domains

A phylogenetic tree (Fig. 6) was next constructed from aligned domains of all types of 2-5A synthetase genes. The tree shows a major split between OAS1-related and OASL-related domains. The OAS1-related domains include OAS1 genes and C-terminal domains of the OAS2 and OAS3 genes, while the OASL-related domains include OASL genes as well as the N-terminal domains of OAS2 and OAS3 genes and the middle (M) domains of OAS3 genes. This observation is consistent with the evolutionary model suggested by Kumar and coworkers (2000) with the following exception. In the tree (Fig. 6), the OAS2N domain branches from the OAS1-related subtree rather than from the OASL-related subtree. Unlike the other domains, the OAS2N domain is highly variable; the substitution rate in the OAS2N cluster is about 2.5 times higher than in the other branches. Correspondingly, the convergence rate in other domains is higher, resulting in low bootstrap values and the biased tree position of the OAS2N domain. Also, the alignment of the OAS2N domains was problematic. In some regions, the OAS2N domains are similar to the OAS3N/M domains, while in other places they are similar to the OAS1 domains. Because these events are ancient, it could not be determined whether gene conversion or functional selection accounts for the similarity between the OAS1 domains and regions of the OAS2N domains.

Fig. 6. Phylogenetic tree of vertebrate 2-5A synthetase and 2-5A synthetase-like genes.

The locations of the other OAS domains in the tree are consistent with the model shown in Fig. 7. According to this model, an ancient 2-5A synthetase gene was tandemly duplicated producing the ancestors of the major domain groups, OAS1-related and OASL-related (Fig. 7A). The next event was a duplication of the chromosomal segment containing these two genes (Fig. 7B). One copy of the OAS1-related gene then evolved into the mammalian OAS1 ancestral gene, while the other OAS1-related copy evolved into the C-terminal domain of a two-domain ancestral gene (Fig. 7C). In a similar manner, one copy of the OASL-related gene was the ancestor of the modern OASL genes (Fig. 7E), while the second OASL-related copy became the N-terminus of a two-domain ancestral gene (Fig. 7C). This two-domain gene was then duplicated (Fig. 7D) and one of its copies served as a common ancestor of the OAS2 gene, while the N-terminal domain of the second copy was subsequently duplicated (Fig. 7F) to generate the modern OAS3 gene.

This scenario explains why modern OAS1 genes are more closely related to the C-terminal domains of the OAS2 and OAS3 genes, while the OASL genes are more closely related to the N-terminal domains. However, the detection of only one chicken OAS gene (NM_205041) may put this scenario in doubt. A BLAST search of the chicken GenBank databases using mammalian OAS sequences did not reveal any additional OAS sequences in the chicken genome. The initial ATG start codon is missing in all of the chicken OASL sequences available in GenBank (AB002585, AB002586, AB037592, AB037593, and NM_205041). The missing part of the GgOASL1 sequence appears to be present in both the BI067058 and the BI390511 chicken ESTs. A partial mRNA sequence, AB002585, supports this suggestion. Comparison of the NW_096287 genomic sequence with the mRNA sequences, AB002585 and NM_205041, revealed six exons in the chicken GgOASL1 gene. The lack of GgOAS1, GgOAS2, and GgOAS3 genes in the chicken genome suggests that they have been lost or silenced during evolution. Data from the ongoing chicken genome project should resolve the issue of whether there are additional GgOAS genes or pseudogenes in the chicken genome. The CK308414 sequence of a small songbird, the zebra finch, has been recently deposited into GenBank by David F. Clayton (University of Illinois). This sequence contains a part of a gene that is orthologous to the chicken GgOASL1 gene. To our knowledge, no traces of OAS or OASL genes have yet been detected in other birds, fish, amphibians, or reptiles.

Fig. 7. Model of the evolution of 2-5A synthetase domains. A Duplication of the ancestral OAS gene and divergence of OAS1- and OASL-related domains. B Duplication and divergence of the OAS1- and OASL-related domains. C Fusion of the OAS1- and OASL-related domains into a two-domain OAS gene. D Duplication of the two-domain OAS gene. E Duplication of an OASL-related domain and divergence of OASL1 and OASL2 genes. F Duplication of the N-terminal domain in a two-domain OAS gene and divergence of the OAS3 and OAS2 genes.


Conclusion

The 2-5A synthetase gene families have been studied in the rat, cattle, pig, and dog genomes and compared with those in humans, mice, and horses. Gene duplication is a general evolutionary mechanism in these gene families. While the four major types of these genes (OAS1, OAS2, OAS3, and OASL) are conserved, some of the OAS genes have been lost or transformed to pseudogenes during mammalian evolution. The results of the phylogenetic and comparative sequence analyses presented provide considerable support for concerted evolution by gene conversion in the 2-5A synthetase gene families of rodent and even-toed ungulate lineages.

Acknowledgments. We would like to thank Jonathan Beever for providing porcine cDNA and Ping Jiang for assistance in DNA sequencing. This work was supported by Public Health Service Research Grant AI045135 from the National Institute of Allergy and Infectious Diseases, National Institutes of Health; by Grant CI000216 from the National Center for Infectious Diseases, Centers for Disease Control and Prevention; and by a grant from the Southeastern Center for Emerging Biologic Threats through Grant/Cooperative Agreement U38/CCU423095 from CDC. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of CDC or the Southeastern Center for Emerging Biologic Threats.

References

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller

W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new

generation of protein database search programs. Nucleic Acids

Res 25:3389–3402

Annilo T, Chen ZQ, Shulenin S, Dean M (2003) Evolutionary

analysis of a cluster of ATP-binding cassette (ABC) genes.

Mamm Genome 14:7–20

Archibald JM, Roger AJ (2002a) Gene conversion and the evolu-

tion of euryarchaeal chaperonins: a maximum likelihood-based

method for detecting conflicting phylogenetic signals. J Mol

Evol 55:232–245

Archibald JM, Roger AJ (2002b) Gene duplication and gene con-

version shape the evolution of archaeal chaperonins. J Mol Biol

316:1041–1050

Benech P, Mory Y, Revel M, Chebath J (1985) Structure of two

forms of the interferon-induced (2¢-5¢) oligo A synthetase of

human cells based on cDNAs and gene sequences. EMBO J

4:2249–2256

Bettencourt BR, Feder ME (2002) Rapid concerted evolution via

gene conversion at the Drosophila hsp70 genes. J Mol Evol

54:569–586

Brinton MA, Perelygin AA (2003) Genetic resistance to flavivi-

ruses. Adv Virus Res 60:43–85

Chebath J, Benech P, Hovanessian A, Galabru J, Revel M (1987a)

Four different forms of interferon-induced 2¢,5¢-oligo(A) syn-

thetase identified by immunoblotting in human cells. J Biol

Chem 262:3852–3857

Chebath J, Benech P, Revel M, Vigneron M (1987b) Constitutive

expression of (2¢-5¢) oligo A synthetase confers resistance to

picornavirus infection. Nature 330:587–588

Clemens MJ, Williams BR (1978) Inhibition of cell-free protein

synthesis by pppA2¢p5¢A2¢p5¢A: a novel oligonucleotide syn-

thesized by interferon-treated L cell extracts. Cell 13:565–572

Desjardins PR, Burkman JM, Shrager JB, Allmond LA, Stedman

HH (2002) Evolutionary implications of three novel members of

the human sarcomeric myosin heavy chain gene family. Mol

Biol Evol 19:375–393

Drouin G (2002a) Characterization of the gene conversions be-

tween the multigene family members of the yeast genome. J Mol

Evol 55:14–23

Drouin G (2002b) Testing claims of gene conversion between

multigene family members: examples from echinoderm actin

genes. J Mol Evol 54:138–139

Eskildsen S, Hartmann R, Kjeldgaard NO, Justesen J (2002) Gene

structure of the murine 2¢-5¢-oligoadenylate synthetase family.

Cell Mol Life Sci 59:1212–1222

Eskildsen S, Justesen J, Schierup MH, Hartmann R (2003) Char-

acterization of the 2¢-5¢-oligoadenylate synthetase ubiquitin-likefamily. Nucleic Acids Res 31:3166–3173

Ghosh A, Sarkar SN, Guo W, Bandyopadhyay S, Sen GC (1997)

Enzymatic activity of 2¢-5¢-oligoadenylate synthetase is im-

paired by specific mutations that affect oligomerization of the

protein. J Biol Chem 272:33220–33226

Hartmann R, Olsen HS, Widder S, Jorgensen R, Justesen J (1998)

p59OASL, a 2¢-5¢ oligoadenylate synthetase like protein: a novelhuman gene related to the 2¢-5¢ oligoadenylate synthetase

family. Nucleic Acids Res 26:4121–4128

Hartmann R, Justesen J, Sarkar SN, Sen GC, Yee VC (2003)

Crystal structure of the 2¢-specific and double-stranded RNA-

activated interferon-induced antiviral protein 2¢-5¢-oligoadeny-late synthetase. Mol Cell 12:1173–1185

Higgs DR, Hill AV, Bowden DK, Weatherall DJ, Clegg JB (1984)

Independent recombination events between the duplicated hu-

man alpha globin genes; implications for their concerted evo-

lution. Nucleic Acids Res 12:6965–6977

Hovanessian AG, Brown RE, Kerr IM (1977) Synthesis of low

molecular weight inhibitor of protein synthesis with enzyme

from interferon-treated cells. Nature 268:537–540

Hovnanian A, Rebouillat D, Mattei MG, Levy ER, Marie I,

Monaco AP, Hovanessian AG (1998) The human 2¢,5¢-oligoa-denylate synthetase locus is composed of three distinct genes

clustered on chromosome 12q24.2 encoding the 100-, 69-, and

40-kDa forms. Genomics 52:267–277

Ichii Y, Fukunaga R, Shiojiri S, Sokawa Y (1986) Mouse 2-5A

synthetase cDNA: nucleotide sequence and comparison to hu-

man 2-5A synthetase. Nucleic Acids Res 14:10117

Israel RL, Kosakovsky Pond SL, Muse SV, Katz LA (2002)

Evolution of duplicated alpha-tubulin genes in ciliates. Evol Int

J Org Evol 56:1110–1122

Iwata A, Yamamoto A, Fujino M, Sato I, Hosokawa-Kanai T,

Tuchiya K, Ishihama A, Sokawa Y (2004) High level activity of

2¢,5¢-oligoadenylate synthetase in dog serum. J Vet Med Sci

66:721–724

Jukes TH, Cantor CR (1969) Evolution of protein molecules. In:

Munro NH (ed) Mammalian protein metabolism. Academic

Press, New York, pp 21–123

Jurka J, Kohany O, Pavlicek A, Kapitonov VV, Jurka MV (2004)

Duplication, coclustering, and selection of human Alu retro-

transposons. Proc Natl Acad Sci USA 101:1268–1272

Justesen J, Hartmann R, Kjeldgaard NO (2000) Gene structure and

function of the 2¢-5¢-oligoadenylate synthetase family. Cell Mol

Life Sci 57:1593–1612

Kakuta S, Shibata S, Iwakura Y (2002) Genomic structure of the

mouse 2¢,5¢-oligoadenylate synthetase gene family. J Interferon

Cytokine Res 22:981–993

Khier H, Bartl S, Schuettengruber B, Seiser C (1999) Molecular

cloning and characterization of the mouse histone deacetylase 1

575

gene: integration of a retrovirus in 129SV mice. Biochim Bio-

phys Acta 1489:365–373

Kimura M (1980) A simple method for estimating evolutionary

rates of base substitutions through comparative studies of

nucleotide sequences. J Mol Evol 16:111–120

Kimura M (1981) Estimation of evolutionary differences between

homologous nucleotide sequences. Proc Natl Acad Sci USA

78:454–458

Kulski JK, Gaudieri S, Martin A, Dawkins RL (1999) Coevolution

of PERB11 (MIC) and HLA class I genes with HERV-16 and

retroelements by extended genomic duplication. J Mol Evol

49:84–97

Kumar S, Mitnik C, Valente G, Floyd-Smith G (2000) Expansion

and molecular evolution of the interferon-induced 2¢-5¢ oligoa-denylate synthetase gene family. Mol Biol Evol 17:738–750

Larkin DM, der Everts-van Wind A, Rebeiz M, Schweitzer PA,

Bachman S, Green C, Wright CL, Campos EJ, Benson LD,

Edwards J, Liu L, Osoegawa K, Womack JE, de Jong PJ, Lewin

HA (2003) A cattle-human comparative map built with cattle

BAC-ends and human genome sequence. Genome Res 13:1966–1972

Lazzaro BP, Clark AG (2001) Evidence for recurrent paralogous

gene conversion and exceptional allelic divergence in the

Attacin genes of Drosophila melanogaster. Genetics 159:659–671

Li WH (1993) Unbiased estimation of the rates of synonymous and

nonsynonymous substitution. J Mol Evol 36:96–99

Marie I, Hovanessian AG (1992) The 69-kDa 2-5A synthetase is

composed of two homologous and adjacent functional domains.

J Biol Chem 267:9933–9939

Marie I, Svab J, Robert N, Galabru J, Hovanessian AG (1990)

Differential expression and distinct structure of 69- and 100-kDa

forms of 2-5A synthetase in human cells treated with

interferon. J Biol Chem 265:18601–18607

Marie I, Blanco J, Rebouillat D, Hovanessian AG (1997) 69-kDa

and 100-kDa isoforms of interferon-induced (2′-5′)oligoadenylate synthetase exhibit differential catalytic parameters. Eur J

Biochem 248:558–566

Mashimo T, Lucas M, Simon-Chazottes D, Frenkiel MP,

Montagutelli X, Ceccaldi PE, Deubel V, Guenet JL, Despres P

(2002) A nonsense mutation in the gene encoding 2′-5′-oligoadenylate synthetase/L1 isoform is associated with West Nile

virus susceptibility in laboratory mice. Proc Natl Acad Sci USA

99:11311–11316

Mashimo T, Glaser P, Lucas M, Simon-Chazottes D, Ceccaldi PE,

Montagutelli X, Despres P, Guenet JL (2003) Structural and

functional genomics and evolutionary relationships in the

cluster of genes encoding murine 2′,5′-oligoadenylate synthetases.

Genomics 82:537–552

Nagawa F, Yoshihara S, Tsuboi A, Serizawa S, Itoh K, Sakano H

(2002) Genomic analysis of the murine odorant receptor

MOR28 cluster: a possible role of gene conversion in

maintaining the olfactory map. Gene 292:73–80

Needleman SB, Wunsch CD (1970) A general method applicable to

the search for similarities in the amino acid sequence of two

proteins. J Mol Biol 48:443–453

Perelygin AA, Scherbik SV, Zhulin IB, Stockman BM, Li Y,

Brinton MA (2002) Positional cloning of the murine flavivirus

resistance gene. Proc Natl Acad Sci USA 99:9322–9327

Perelygin AA, Lear TL, Zharkikh AA, Brinton MA (2005)

Structure of equine 2′-5′ oligoadenylate synthetase (Oas) gene family

and FISH mapping of Oas genes to ECA8p15-p14 and

BTA17q24-25. Cytogenet Genome Res 111:51–56

Rebouillat D, Marie I, Hovanessian AG (1998) Molecular cloning

and characterization of two related and interferon-induced 56-kDa

and 30-kDa proteins highly similar to 2′-5′ oligoadenylate synthetase. Eur J Biochem 257:319–330

Rebouillat D, Hovnanian A, Marie I, Hovanessian AG (1999) The

100-kDa 2′,5′-oligoadenylate synthetase catalyzing preferentially the synthesis of dimeric pppA2′p5′A molecules is composed of

three homologous domains. J Biol Chem 274:1557–1565

Rogozin IB, Aravind L, Koonin EV (2003) Differential action of

natural selection on the N and C-terminal domains of 2′-5′ oligoadenylate synthetases and the potential nuclease function

of the C-terminal domain. J Mol Biol 326:1449–1461

Rooney AP, Piontkivska H, Nei M (2002) Molecular evolution of

the nontandemly repeated genes of the histone 3 multigene

family. Mol Biol Evol 19:68–75

Rutherford MN, Kumar A, Nissim A, Chebath J, Williams BR

(1991) The murine 2-5A synthetase locus: three distinct

transcripts from two linked genes. Nucleic Acids Res

19:1917–1924

Saitou N, Nei M (1987) The neighbor-joining method: a new

method for reconstructing phylogenetic trees. Mol Biol Evol

4:406–425

Shibata S, Kakuta S, Hamada K, Sokawa Y, Iwakura Y (2001)

Cloning of a novel 2′,5′-oligoadenylate synthetase-like molecule,

Oasl5 in mice. Gene 271:261–271

Shimizu T, Kawakita S, Li QH, Fukuhara S, Fujisawa J (2003)

Human T-cell leukemia virus type 1 Tax protein stimulates the

interferon-responsive enhancer element via NF-kappaB activity.

FEBS Lett 539:73–77

Smith JB, Nguyen TT, Hughes HJ, Herschman HR, Widney DP,

Bui KC, Rovai LE (2002) Glucocorticoid-attenuated response

genes induced in the lung during endotoxemia. Am J Physiol

Lung Cell Mol Physiol 283:L636–L647

Tatusova TA, Madden TL (1999) Blast 2 sequences—a new tool

for comparing protein and nucleotide sequences. FEMS

Microbiol Lett 174:247–250

Tiefenthaler M, Marksteiner R, Neyer S, Koch F, Hofer S, Schuler

G, Nussenzweig M, Schneider R, Heufler C (1999) M1204, a

novel 2′,5′ oligoadenylate synthetase with a ubiquitin-like

extension, is induced during maturation of murine dendritic

cells. J Immunol 163:760–765

Truve E, Aaspollu A, Honkanen J, Puska R, Mehto M, Hassi A,

Teeri TH, Kelve M, Seppanen P, Saarma M (1993) Transgenic

potato plants expressing mammalian 2′-5′ oligoadenylate synthetase

are protected from potato virus X infection under field

conditions. Biotechnology (NY) 11:1048–1052

Witt PL, Marie I, Robert N, Irizarry A, Borden EC, Hovanessian

AG (1993) Isoforms p69 and p100 of 2′,5′-oligoadenylate synthetase

induced differentially by interferons in vivo and in vitro.

J Interferon Res 13:17–23

Yang Z, Yu CY (2000) Organizations and gene duplications of the

human and mouse MHC complement gene clusters. Exp Clin

Immunogenet 17:1–17

Zhao Z, Hewett-Emmett D, Li WH (1998) Frequent gene conversion

between human red and green opsin genes. J Mol Evol

46:494–496

Zharkikh A, Li WH (1995) Estimation of confidence in phylogeny:

the complete-and-partial bootstrap technique. Mol Phylogenet

Evol 4:44–63

Zharkikh AA, Rzhetsky A, Morosov PS, Sitnikova TL, Krushkal

JS (1991) VOSTORG: a package of microcomputer programs

for sequence analysis and construction of phylogenetic trees.

Gene 101:251–254
