Genome-wide expression profiles of endogenous retroviruses in lymphoid tissues and their biological...

11
Genome-wide expression profiles of endogenous retroviruses in lymphoid tissues and their biological properties Young-Kwan Lee 1 , Alex Chew 1 , Ho Phan, David G. Greenhalgh, Kiho Cho Burn Research, Shriners Hospitals for Children and Department of Surgery, University of California, Davis, 2425 Stockton Blvd., Sacramento, CA 95817, USA Received 14 July 2007; returned to author for revision 21 August 2007; accepted 30 October 2007 Available online 9 January 2008 Abstract Endogenous retroviruses (ERVs) constitute approximately 810% of the human and mouse genome. Some autoimmune diseases are attributed to the altered expression of ERVs. In this study, we examined the ERV expression profiles in lymphoid tissues and analyzed their biological properties. Tissues (spleen, thymus, and lymph nodes [axillary, inguinal, and mesenteric]) from C57BL/6J mice were analyzed for differential murine ERV (MuERV) expression by RT-PCR examination of polymorphic U3 sequences. Each tissue had a unique profile of MuERVexpression. A genomic map identifying 60 putative MuERVs was established using 22 unique U3s as probes and their biological properties (primer binding site, coding potential, transcription regulatory element, tropism, recombination event, and integration age) were characterized. Interestingly, 12 putative MuERVs retained intact coding potentials for all three polypeptides essential for virus assembly and replication. We suggest that MuERV expression is differentially regulated in conjunction with the transcriptional environment of individual lymphoid tissues. © 2007 Elsevier Inc. All rights reserved. Keywords: Endogenous retrovirus; Spleen; Thymus; Lymph node; Coding potential; Transcription; Genome Introduction Infection of germline cells with retroviruses leads to their permanent colonization into the germline genome. These germ- line-integrated retroviruses are transmitted vertically to the offspring in a Mendelian fashion and are called endogenous retroviruses (ERVs), in contrast to exogenous retroviruses, which are acquired from external surroundings and transmitted hori- zontally. ERVs represent collections of retroviral proviruses in- troduced into the germline and accumulated throughout an entire set of generations. ERVs and other forms of transposable repetitive elements make up 45% of the human and mouse genomes (Urnovitz and Murphy, 1996). Among them, 8% of the human genome and 10% of the mouse genome consist of ERVs (Griffiths, 2001; Lander et al., 2001; Waterston et al., 2002). Although the majority of ERVs are defective primarily due to deletions and point mutations leading to a disruption of the coding potential of the gag, pol, and/or env genes, some are characterized to be biologically active. Most ERVs have the potential to modulate the expression of host genes near the integration site, for example, the transcrip- tional control of the amylase gene in the salivary gland (Samuelson et al., 1996). The HTDV (human teratocarcinoma- derived virus), a member of the human ERV (HERV)-K family, produces type C retroviral particles and retains the coding potential for retroviral polypeptides essential for virus assembly and replication (Knossl et al., 1999; Lower et al., 1993). The HTDV U3 promoter is highly specific for the transcriptional environment in testicular tumor cells (Lower et al., 1996). Furthermore, the pathogenic processes of certain autoim- mune diseases, such as systemic lupus erythematosus, insulin- dependent diabetes mellitus, and multiple sclerosis, have been associated with altered expression of ERVs (Conrad et al., 1997; Deas et al., 1998; Wang et al., 2001). In particular, expression of the HERV-W encoded envelope glycoprotein, called syncytin, was increased in the brain of multiple sclerosis patients (Antony et al., 2004). Syncytin expression in astrocytes was responsible for neuroinflammation, resulting in demyelination and the death Available online at www.sciencedirect.com Virology 373 (2008) 263 273 www.elsevier.com/locate/yviro Corresponding author. Fax: +1 916 453 2288. E-mail address: [email protected] (K. Cho). 1 Young-Kwan Lee and Alex Chew contributed to this project equally. 0042-6822/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.virol.2007.10.043

Transcript of Genome-wide expression profiles of endogenous retroviruses in lymphoid tissues and their biological...

Available online at www.sciencedirect.com

8) 263–273www.elsevier.com/locate/yviro

Virology 373 (200

Genome-wide expression profiles of endogenous retroviruses in lymphoidtissues and their biological properties

Young-Kwan Lee 1, Alex Chew 1, Ho Phan, David G. Greenhalgh, Kiho Cho ⁎

Burn Research, Shriners Hospitals for Children and Department of Surgery, University of California, Davis, 2425 Stockton Blvd., Sacramento, CA 95817, USA

Received 14 July 2007; returned to author for revision 21 August 2007; accepted 30 October 2007Available online 9 January 2008

Abstract

Endogenous retroviruses (ERVs) constitute approximately 8–10% of the human and mouse genome. Some autoimmune diseases are attributedto the altered expression of ERVs. In this study, we examined the ERV expression profiles in lymphoid tissues and analyzed their biologicalproperties. Tissues (spleen, thymus, and lymph nodes [axillary, inguinal, and mesenteric]) from C57BL/6J mice were analyzed for differentialmurine ERV (MuERV) expression by RT-PCR examination of polymorphic U3 sequences. Each tissue had a unique profile of MuERVexpression.A genomic map identifying 60 putative MuERVs was established using 22 unique U3s as probes and their biological properties (primer bindingsite, coding potential, transcription regulatory element, tropism, recombination event, and integration age) were characterized. Interestingly, 12putative MuERVs retained intact coding potentials for all three polypeptides essential for virus assembly and replication. We suggest that MuERVexpression is differentially regulated in conjunction with the transcriptional environment of individual lymphoid tissues.© 2007 Elsevier Inc. All rights reserved.

Keywords: Endogenous retrovirus; Spleen; Thymus; Lymph node; Coding potential; Transcription; Genome

Introduction

Infection of germline cells with retroviruses leads to theirpermanent colonization into the germline genome. These germ-line-integrated retroviruses are transmitted vertically to theoffspring in a Mendelian fashion and are called endogenousretroviruses (ERVs), in contrast to exogenous retroviruses, whichare acquired from external surroundings and transmitted hori-zontally. ERVs represent collections of retroviral proviruses in-troduced into the germline and accumulated throughout an entireset of generations.

ERVs and other forms of transposable repetitive elementsmake up ∼45% of the human and mouse genomes (Urnovitzand Murphy, 1996). Among them, ∼8% of the human genomeand ∼10% of the mouse genome consist of ERVs (Griffiths,2001; Lander et al., 2001; Waterston et al., 2002). Although themajority of ERVs are defective primarily due to deletions and

⁎ Corresponding author. Fax: +1 916 453 2288.E-mail address: [email protected] (K. Cho).

1 Young-Kwan Lee and Alex Chew contributed to this project equally.

0042-6822/$ - see front matter © 2007 Elsevier Inc. All rights reserved.doi:10.1016/j.virol.2007.10.043

point mutations leading to a disruption of the coding potential ofthe gag, pol, and/or env genes, some are characterized to bebiologically active.

Most ERVs have the potential to modulate the expression ofhost genes near the integration site, for example, the transcrip-tional control of the amylase gene in the salivary gland(Samuelson et al., 1996). The HTDV (human teratocarcinoma-derived virus), a member of the human ERV (HERV)-K family,produces type C retroviral particles and retains the codingpotential for retroviral polypeptides essential for virus assemblyand replication (Knossl et al., 1999; Lower et al., 1993). TheHTDV U3 promoter is highly specific for the transcriptionalenvironment in testicular tumor cells (Lower et al., 1996).

Furthermore, the pathogenic processes of certain autoim-mune diseases, such as systemic lupus erythematosus, insulin-dependent diabetes mellitus, and multiple sclerosis, have beenassociated with altered expression of ERVs (Conrad et al., 1997;Deas et al., 1998; Wang et al., 2001). In particular, expression ofthe HERV-W encoded envelope glycoprotein, called syncytin,was increased in the brain of multiple sclerosis patients (Antonyet al., 2004). Syncytin expression in astrocytes was responsiblefor neuroinflammation, resulting in demyelination and the death

Fig. 1. RT-PCR analyses of the MuERVexpression profiles in various lymphoidtissues. Five different lymphoid tissues (spleen, thymus, lymph nodes [axillary,inguinal, and mesenteric]) from C57BL/6J mice were subjected to RT-PCRanalysis of MuERV expression using the primer set (ERV-U1 and ERV-U2).Tissue-specific MuERV expression profile is evident. Four different fragmentsfrom each tissue, which were subjected to further analyses, are indicated (I, II,III, and IV). The genomic MuERV U3 profile serves as a reference. Bottompanel depicts locations of primers on a schematic MuERV.

264 Y.-K. Lee et al. / Virology 373 (2008) 263–273

of oligodendrocytes. The proinflammatory properties of syn-cytin demonstrated a unique role of an ERV protein in a path-ologic process in humans. In addition, recent studies from ourlaboratory provide evidence that burn-elicited stress signals canalter the transcriptional activities of certain murine ERVs(MuERVs) in distant organs of mice (Aziz et al., 1989; Choet al., 2002; Cho and Greenhalgh, 2003). Interestingly, some ofthese burn-elicited MuERVs are structurally similar to theMAIDS (murine acquired immunodeficiency syndrome)-inducing virus.

It is likely that the unique transcriptional environment in eachcell or tissue type directly influences the MuERV expressionprofile. The genome-wide distribution of MuERVs and theirdifferential expression in various types of tissues and cells maybe directly networked to a vast range of signaling events con-trolling normal physiologic as well as pathologic processes. Twomain types of MuERVs, murine leukemia virus (MuLV) andmouse mammary tumor virus (MMTV), are reported to beinvolved in various pathophysiologic processes involving im-mune organs (Choi et al., 1987; King and Corley, 1990). In thisstudy, we examined the MuERV expression profiles in variouslymphoid tissues of mice and characterized their biologicalproperties.

Results

Differential expression of MuERVs among various lymphoidtissues

We postulate that MuERV expression profiles vary depend-ing on tissue type mainly due to the unique composition of thetranscriptional environment within the diverse cell populationscomprising each tissue. In this study, the MuERV expressionprofile in five different lymphoid tissues (spleen, thymus, andlymph nodes [axillary, inguinal, and mesenteric]) were exam-ined in female C57BL/6J mice. The NCBI (National Center forBiotechnology Information) mouse genome database, which isderived from the C57BL/6J strain, was used for comparativeanalysis. RT-PCR analyses of the MuERV expression profileswere performed by amplifying the polymorphic U3 regionwithin the 3′ long terminal repeats (LTRs). Electrophoreticanalysis of amplified U3 products revealed a unique MuERVexpression profile within each lymphoid tissue examined(Fig. 1). There were variations in the length (ranging from∼470 bp to ∼750 bp) as well as in the intensity of amplifiedMuERV U3s. Interestingly, the genomic MuERV profile (sizeand intensity of amplified U3s) was substantially different fromthe expression profiles of all lymphoid tissues examined. Onlyfour different MuERV U3 fragments (labeled as I, II, III, andIV), which were present in all five tissues, were selected for adirect comparison of differential expression levels followed bydownstream characterization of biological properties. Furtherinvestigation into the other MuERV U3 fragments, which werepresumed to be derived from certain MuERVs expressed only insome tissues examined, may provide additional information inregard to differential expression of MuERVs in lymphoidtissues.

Multiple alignment and phylogenetic analyses of differentiallyexpressed MuERV U3 sequences

Cloning and sequencing analyses of four different U3 frag-ments (I, II, III, and IV) from all five lymphoid tissues yielded43 MuERV U3 sequences which represent at least two se-quences from each fragment (Fig. 2A). These U3 sequenceswere subjected to multiple alignment and phylogeneticanalyses. Since the amplified U3 regions include ∼117 bpupstream and downstream of the exact U3 sequence, theseadditional nucleotides were trimmed prior to the alignmentanalyses. The initial multiple alignment analysis yielded 22unique U3s (24 U3s are shown to include at least one sequencefrom each fragment) (Fig. 2B). The U3 sequences wereapparently grouped into six main branches coinciding withthe differences in their sizes (ranging from 346 bp to 601 bp)(Fig. 2A). In addition, the Th(IV)-1 U3 sequence formed aunique branch, which is consistent with its unique size of 392 bpand absence of a direct repeat (1/1⁎) (Fig. 2B), although it hadhigh sequence similarities with the branches of Th(III)-1 andmLN(IV)-2. Multiple alignment of the 24 unique U3 sequencesrevealed that both the 5′-end and 3′-end were well-conservedand the middle region, which includes a single 190 bp insertion,was hypervariable (Fig. 2B). It has been established that the

265Y.-K. Lee et al. / Virology 373 (2008) 263–273

cellular tropism of MuERVs can be determined by surveyingspecific sequence features (e.g., direct repeat, unique region)within the U3 promoter (Tomonaga and Coffin, 1998; Tomonagaand Coffin, 1999). Putative cellular tropisms of the unique U3sequences were determined by the examination of four directrepeat regions (1/1⁎, 4/4⁎, 5/5⁎, and 6/6⁎), a single 190 bpinsertion, and two unique regions (2 and 3) (Fig. 2B and Table 1).There were 15 xenotropic and nine polytropic/modified poly-tropic U3 sequences.

Profile of transcription regulatory elements on individualMuERV U3 promoters

It is expected that the 3′ U3 sequences are almost identical tothe 5′ U3 sequences, the latter which serve as promoters. Toexamine the transcription potential of the 22 unique MuERV U3

Fig. 2. A: Phylogenetic analysis of differentially expressed MuERV U3 sequenceestablished. The confidence values at the branch nodes represent the percentage suppotissues. Sp (spleen), Th (thymus), aLN (axillary lymph node), iLN (inguinal lymdifferentially expressed MuERV U3 sequences. Differentially expressed MuERV U3U3 sequences. Yellow (100% homology), white (low homology), blue (conserved s(1/1⁎, 4/4⁎, 5/5⁎, and 6/6⁎), unique regions (2 and 3), and TATA box are indicate

sequences (3′ U3s), the putative transcription regulatory ele-ments were mapped on each U3 (Table 2). Only the tran-scription regulatory elements with a core similarity (comparedto conserved sequences) of greater than 90% were selected. Thismapping study yielded 71 transcription regulatory elements,which included binding sites for the glucocorticoid receptor andNF-κB. Although some elements were shared by the majority ofthe U3 sequences examined, other elements were unique tocertain U3 promoters. For instance, a glucocorticoid responseelement (GRE), binding site for glucocorticoid receptor (high-lighted with ⁎), was present only in the iLN(II)-2 U3 promoter.A binding site for signal transducers and activators of tran-scription 3 (STAT3) (highlighted with ⁎) was mapped only inthe Th(IV)-1 U3 promoter. It will be of interest to test whetherthe putative GRE responds to stimulation by glucocorticoids invitro and in vivo. On the other hand, a binding site for NF-κB

s. A phylogenetic tree of differentially expressed MuERV U3 sequences wasrt of a particular branching. Various colored shapes represent different lymphoidph node), mLN (mesenteric lymph node). B: Multiple alignment analysis ofsequences were subjected to a multiple alignment analysis and yielded 22 uniqueequences), and dash (absence of sequences). The locations of the direct repeatsd in dotted boxes. An insertion of 190 bp is identified in the middle.

Fig. 2 (continued ).

266 Y.-K. Lee et al. / Virology 373 (2008) 263–273

(highlighted with ⁎), a key transcription factor involved ininflammatory response, was identified in five different U3promoters (Th(IV)-1, Th(I)-2, iLN(I)-1, Sp(I)-2, and aLN(I)-1).

The unique profiles of transcription regulatory elements on eachU3 promoter suggest their differential transcription potentials ina given transcriptional environment.

Table 1Tropism analysis of 22 unique MuERV U3 sequences isolated from five different lymphoid tissues

U3s Direct repeat/unique region Tropism

1/1⁎ 2 3 4/4⁎ 5/5⁎ 6/6⁎

iLN(V)-1 X-II, X-III X-I, X-II, X-IV, P-II, P-III, P-V d X-III d X-I, X-II, X-III, X-IV, Poly X-IIISp(V)-1 X-II, X-III X-I, X-II, X-IV, P-II, P-III, P-V d X-III d X-I, X-II, X-III, X-IV, Poly X-IIIaLN(V)-1 X-II, X-III X-I, X-II, X-IV, P-II, P-III, P-V d X-III d X-I, X-II, X-III, X-IV, Poly X-IIImLN(V)-1 X-II, X-III . d X-III d X-I, X-II, X-III, X-IV, Poly X-IIITh(V)-1 X-II, X-III X-I, X-II, X-IV, P-II, P-III, P-V d X-III d X-I, X-II, X-III, X-IV, Poly X-IIITh(III)-1 X-II, X-III X-III, P-I, P-IV d X-II, X-IV X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III X-IIiLN(III)-1 X-II, X-III X-III, P-I, P-IV d X-II, X-IV X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III X-IITh(III)-2 X-II, X-III X-III, P-I, P-IV d X-II, X-IV X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III X-IImLN(IV)-2 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IISp(IV)-1 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IImLN(III)-1 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IImLN(IV)-1 X-III · Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IImLN(III)-2 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IIaLN(IV)-1 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IIiLN(IV)-2 X-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly X-II X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly X-IITh(IV)-1 · X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly P-IV X-II, X-III, X-IV, P-I, P-IV X-I, X-II, X-III, X-IV, Poly P-IVTh(I)-2 P-II, P-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly P-II P-I, P-II, P-III P-II P-IIiLN(I)-1 P-II, P-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly P-II P-I, P-II, P-III P-II P-IISp(I)-2 P-II, P-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly P-II P-I, P-II, P-III P-II P-IIaLN(I)-1 P-II, P-III X-I, X-II, X-IV, P-II, P-III, P-V Xeno/Poly/mPoly P-II P-I, P-II, P-III P-II P-IITh(II)-1 P-I X-III, P-I, P-IV Xeno/Poly/mPoly P-I P-I, P-II, P-III X-I, X-II, X-III, X-IV, Poly P-IiLN(IV)-2 P-I X-III, P-I, P-IV Xeno/Poly/mPoly P-I P-I, P-II, P-III X-I, X-II, X-III, X-IV, Poly P-ImLN(I)-2 P-I X-III, P-I, P-IV Xeno/Poly/mPoly P-I P-I, P-II, P-III X-I, X-II, X-III, X-IV, Poly P-IaLN(II)-1 P-I X-III, P-I, P-IV Xeno/Poly/mPoly P-I P-I, P-II, P-III X-I, X-II, X-III, X-IV, Poly P-I

Dot indicates absence of a homology to reference sequences. Underscore represents the reference type closest to the individual U3 sequences examined. ⁎Direct repeatregion.

267Y.-K. Lee et al. / Virology 373 (2008) 263–273

Genomic localization of putative MuERVs using the U3sequences as a probe and characterization of their biologicalproperties

Genomic localization of putative MuERVsIn order to investigate the genome-wide distribution of

MuERVs harboring the U3 sequences identified in various lym-phoid tissues, theNCBImouse genome databasewas probedwithindividual U3 sequences. Subsequently, the putative MuERVswith a homology of greater than 98% relative to the U3 probe andapproximately 5 kb to 9 kb in size were selected for furtheranalyses (Table 3). The defective MuERVs with substantialdeletions in pol and env genes tend to be approximately 5 kb andthe full-length MuERVs are about 9 kb. Genomic probing usingthe 22 unique U3 sequences revealed 60 defective or full-lengthMuERVs dispersed throughout the entire genome in both strandsexcept for on chromosomes 17 and Y. Chromosomal location,including proviral size and strand orientation, of these putativeMuERVs are summarized in Table 3.

Open reading frame (ORF)The structures of the putativeMuERVs identified in this study

were determined by comparison to full-length murine leukemiavirus (MuLV) reference sequences (GenBank accession nos.AF033811 and DQ241301). ORF analyses revealed that 12 ofthe 60 putative MuERVs were full-length and had intact codingpotentials for all three retroviral polypeptides (gag, pol, and env)(highlighted in gray, Table 3). Six putativeMuERVs (highlightedin yellow) had an insertion of a nucleotide (cytosine) within

various poly C regions resulting in a defective gag polypeptidedue to an insertional frame shift. The locations (nucleotidenumber from the 5′-end of the provirus) of the insertions withinthese MuERVs are indicated in parentheses: Th(I)-2a 99%(1,700), Th(I)-2b 98% (1,701), aLN(II)-1a 99% (1,738), Th(II)-1a 100% (1,657), aLN(II)-1p 99% (1,738), and aLN(II)-1h 99%(1,657). On the other hand, putative MuERV Th(I)-2g 99% hada deletion of a cytosine at 1,657 resulting in a frameshift of thegag polypeptide. It is possible that the insertion or deletion of asingle cytosine in the poly C regions may be attributableto a sequencing error in the NCBI database. In addition,“ATG→GTG” mutations were observed in 26 putativeMuERVs (highlighted in blue, Table 3) at the first start codon(ATG) of the reverse transcriptase (RT) region within the polgene, resulting in a loss of three amino acids from the start ofthe RT protein. The putative MuERV Th(II)-1r 99% contains agreen box under the gag column because the gag ORF wasdeemed complete, despite the one amino acid deletion withinthe p12 region.

Primer binding site (PBS)Analyses of PBSs for all 60 putative MuERVs revealed that

59 proviruses contained a glutamine (Q) tRNA binding site,while the Th(IV)-1a 100% putative MuERV contained a proline(P) tRNA binding site (Table 3).

Recombination event and integration ageOnly 6 out of 60 putative MuERVs had LTR mutations

(5′ LTR sequence compared to 3′ LTR sequence) ranging from

Table 2Transcription regulatory elements in 22 unique MuERV U3 promoter sequences isolated from five different lymphoid tissues

The number in the box indicates frequency of a transcriptional regulatory element within a given U3 promoter sequence, and the gray box indicates no occurrence of the specific transcriptional regulatory element.Some transcription regulatory elements discussed in the results section are highlighted with ⁎ and bold-face.

268Y.-K

.Lee

etal.

/Virology

373(2008)

263–273

Table 3In silico cloning and biological properties of putative MuERVs differentially expressed in lymphoid tissues

Gray indicates putative full-length MuERVs. Two putative MuERVs mapped by only one U3 probe are highlighted in purple. Absence of direct repeat sequences isindicated with “/” between two sequences. Yellow indicates putative MuERVs with a cytosine insertion. Blue indicates putative MuERVs with ATG→GTG mutationat the RT start site. Green indicates that the gag was considered intact despite one amino acid deletion within p12. PBS (primer binding site): P (tRNAproline) andQ (tRNAglutamine). ND (not determined). Coding potential: intact (+), partial (P), and defective (−). NA (lacking a proper start codon).

269Y.-K. Lee et al. / Virology 373 (2008) 263–273

270 Y.-K. Lee et al. / Virology 373 (2008) 263–273

0.1349% to 0.2869% (Table 3). The putative iLN(III)-1a 98%MuERV had a deletion of 39 nucleotides in the middle of the5′ LTR in comparison to its 3′ counterpart and the mutation ratewas not calculated. The integration age of these MuERVs wascalculated (ranging from 1.038 million years [MYr] to 2.2072MYr) based on a formula of “0.13% mutation rate between twoflankingLTRs/oneMYr” (Sverdlov, 2000). To determinewhetherthere were genetic rearrangements by recombination during thelifespan of the putative MuERVs, we examined the presence orabsence of a direct repeat sequence flanking both ends of theproviral sequences. Direct repeats are formed during initial inte-gration of proviruses and any downstream recombination eventsresult in two unique sequences instead. There were 10 putativeMuERVs with unique sequences flanking the integration sites,suggesting recombination events, and the rest had direct repeatsof four nucleotides (49 putative MuERVs) or 12 nucleotides (oneputative MuERV: aLN(II)-1h 99%) (Table 3).

Tropism analysis by restriction fragment length polymorphism(RFLP)

Cellular tropisms of 12 full-length putative MuERVs withintact coding potentials were determined by in silico RFLPusing three restriction enzymes (BamHI, EcoRI, and HindIII)(Stoye and Coffin, 1988). It revealed six polytropic and sixmodified polytropic putative MuERVs (Fig. 3). It needs to benoted that although the putative tropism traits described inthis study are likely to be correct, it may be necessary todetermine each MuERV's tropism by an in vitro infectionstudy.

Fig. 3. RFLP tropism analysis of the putative full-length MuERVs with intact codingby in silico RFLP analysis using three restriction enzymes (BamHI, EcoRI, and Hin

Discussion

The findings from this study demonstrated that certainMuERVs were actively transcribed and their expression profileswere specific for each lymphoid tissue examined. Furtherstudies may be necessary to determine the MuERV expressionprofile in various subsets of cells within each lymphoid tissue.Individual lymphoid tissues and cell types are presumed to haveunique characteristics specific to their transcriptional environ-ment, such as transcription factor pool (Schon et al., 2001; vanOpijnen et al., 2004). Considering the tissue and cell type-specific transcription factor pool, in conjunction with the uniqueset of transcription regulatory elements within each MuERV U3promoter, the findings of differential MuERV expression invarious lymphoid tissues in this study are readily predictable.Moreover, the transcription factor profile within each tissue orcell type is affected by its own pathophysiologic status and thesurrounding environment, such as systemic immune modulationand carcinogenesis (Boral et al., 1989).

Due to the polymorphic nature of the MuERV U3 sequences,each U3 promoter is likely to harbor a unique transcriptionpotential, primarily determined by its profile of binding sites fortranscription factors. It suggests that changes in the transcriptionenvironment due to stress (e.g., injury, infection) will lead todifferential expression of certain MuERVs in different tissues.The viral gene products and viral replication itself associatedwith the altered MuERV expression may participate in a rangeof pathophysiologic activities in a tissue- and cell type-specificmanner.

potentials. Cellular tropisms of 12 putative full-length MuERVs were determineddIII). Each restriction site is indicated in parenthesis on schematic MuERVs.

271Y.-K. Lee et al. / Virology 373 (2008) 263–273

In silico cloning of the putativeMuERVs in the genome of theC57BL/6J strain using the U3 sequences as a probe allowed us tocharacterize their biological properties such as coding potential,replication competency, and tropism. The results from this studydemonstrated that some of the putative MuERVs are full-lengthwith intact ORFs for the gag, pol, and env genes and are pre-sumably replication-competent. Further functional characteriza-tion of these putativeMuERVs is essential to understanding theirroles in normal physiology and pathology of individual lym-phoid tissues and cells. In particular, the putative MuERVsretaining intact coding potentials for gag, pol, and/or env maybe cloned from the C57BL/6J genome into an expression vectorfor functional analyses in vitro (e.g., cell tropism, viral geneexpression, cytokine production, cytotoxicity) and in vivo (e.g.,infection of virions into mice followed by examination of theireffects on the immune system). In addition, the biological pro-perties of individual gene products, primarily gag and envproteins, from each putative MuERV may be investigated invitro using overexpression as well as knock-out protocols.

Retroviral U3 promoters are capable of controlling the ex-pression of host genes adjacent to the integration sites throughcertain transcription regulatory elements, such as enhancers andnegative regulatory elements (Trusko et al., 1989; Yu et al.,2005). It will be of interest to investigate whether the U3 pro-moters of putative MuERVs identified in this study play a role inthe transcriptional activities of neighboring host genes.

It may be reasonable to postulate that MuERVs, distributedthroughout the genome, play crucial and differential roles innormal physiologic and pathologic processes in various lym-phoid tissues and cells. The specificity of MuERVexpression isprimarily dependent upon the polymorphic U3 promoter se-quences and the specific transcriptional environment in eachtissue/cell type. Understanding the biological characteristics oftissue-specific MuERVs and their relationship with neighboringgenes will broaden insight into their roles in a range of patho-physiologic events pertaining to individual lymphoid tissuesand cells.

Materials and methods

Animals

Female C57BL/6J mice from Jackson Laboratories (BarHarbor, ME) were housed according to the guidelines of theNational Institutes of Health. The Animal Use and Care Ad-ministrative Advisory Committee of the University of Califor-nia, Davis, approved the experimental protocol. Threemiceweresacrificed by cervical dislocation for lymphoid tissue collectionwithout any pretreatments.

RT-PCR analysis

Total RNA isolation and cDNA synthesis were performedbased on protocols described previously (Cho et al., 2000).Briefly, total RNA was extracted using a RNeasy kit (Qiagen,Valencia, CA). Total RNA (100 ng) from each tissue samplewas subjected to reverse transcription. The sequence of the

oligo-dT primer was as follows: 5′-GGC CAC GCG TCG ACTAGTACT TTT TTT TTT TTT TTT T-3′. Primers, ERV-U1 (5′-CGG GCG ACT CAG TCTATC GG-3′) and ERV-U2 (5′-CAGTAT CAC CAA CTC AAATC-3′) were used to amplify the U3regions of nonecotropic MuERVs. The primers for β-actinamplification were 5′-CCA ACT GGG ACG TGG AA-3′ and5′-GTA GAT GGG CAC AGT GTG GG-3′. The comparabilitybetween samples was determined by both the electrophoresisof an equal amount (500 ng) of total RNA and the RT-PCRamplification of β-actin from each sample.

Cloning of MuERV U3 sequences

PCR products were gel purified using the QIAquick GelExtraction kit (Qiagen) and cloned into the pGEMT-Easy vector(Promega,Madison,WI). PlasmidDNAs for sequencing analysiswere prepared using a miniplasmid kit from Qiagen. Sequencingwas performed at Davis Sequencing Inc. (Davis, CA).

Multiple alignment and phylogenetic tree analysis

The resulting 43 MuERV U3 sequences were aligned usingVector NTI Advance 10 (Invitrogen, Carlsbad, CA) program toidentify unique U3 sequences. A phylogentic tree was obtainedusing the neighbor-joining protocol within the MEGA3 pro-gram (Kumar et al., 2004; Saitou and Nei, 1987). Bootstrapevaluation of the branching pattern was performed with 100replications.

Tropism analysis

The putative tropism of unique U3 sequences was deter-mined by comparison to the reference sequences (direct repeats,unique region) first reported by Tomonaga et al. (Tomonagaand Coffin, 1998; Tomonaga and Coffin, 1999). A total of fourdirect repeats (1/1⁎, 4/4⁎, 5/5⁎, and 6/6⁎), a single 190 bpinsertion, and two unique sequences (2 and 3) were utilized forthe tropism analysis.

Analysis of transcription regulatory elements

The 22 unique U3 sequences were analyzed for transcriptionregulatory elements using the MatInspector program (Genoma-tix, Munich, Germany). The core similarity was set to 0.9 andthe matrix similarity was optimized within the vertebrate matrixgroup. The U3 sequences are organized in Table 2 to match theorder within the multiple alignment and phylogenetic tree.

In silico mapping and cloning of putative MuERVs

Putative MuERV sequences were identified by probing theNCBI mouse genome database using the 22 unique U3 se-quences with the NCBI Megablast program. The key referencefor identifying viral sequences of interest was a 5–9 kb se-quence region flanked by LTRs, which harbor U3 sequences.The following parameters were used with the Megablast pro-gram: in the options section “NC_000067:NC_000087” was

272 Y.-K. Lee et al. / Virology 373 (2008) 263–273

entered into the “limited by entrez query” field. The filters wereset to none while the percent identity, match, mismatch scoreswere set to 98, 1, and −3, respectively.

MuERV provirus nomenclature

The putative MuERV proviruses were named after the U3probes/sequences used in Megablast (e.g., Th(I)-2), followed bythe percentage homology of the blast hits (e.g., Th(I)-2 99%).The letters following the U3 probe/sequence (e.g., Th(I)-2a99%) are to differentiate multiple hits found using the U3 probewith similar blast percentage hits.

ORF analysis

The ORFs of each putative MuERV were analyzed usingthe ORF search feature within Vector NTI (Invitrogen). Theparameterswere set at aminimumof 50 codonswith “ATG” as thestart codon, and each candidate ORF was translated. The aminoacid translations were then compared using Vector NTI AlignX(Invitrogen) to the following MuLV references retrieved fromNCBI (GenBank accession nos. M17327, AY219567.2,AA037285, DQ241301, and AF033811). The criteria for defin-ing the intactness of each proviral gene depended on the p12region of gag, RT (reverse transcriptase) of pol and SU (surfacedomain) of env. Proviral genes were deemed intact (+) if theaforementioned sequences were intact and the remaining aminoacid sequence of each respective gene matched one of the refer-ence sequences, while allowing for missense mutations. Proviralgenes were classified as partial (p) if the defining sequences wereintact but the remaining proviral gene sequences were defec-tive. Defective (−) proviral gene sequences contained amino aciddeletions and premature stop codons in addition to improper startcodons leading to defective defining coding sequences.

Integration age and recombination event

5′ and 3′ LTR sequences of the putative MuERVs werecompared using the Vector NTI AlignX program (Invitrogen).The integration age was calculated based on a formula of “0.13%mutation rate between two flanking LTRs per one MYr”. In casethere is only a single nucleotide difference between two flankingLTRs, its integration age is recorded as less than the estimatedage in consideration of potential error during cloning andsequencing. To examine the genomic rearrangement betweenMuERVs as well as with other parts of the genome, a shortstretch of sequences (4 bp to 12 bp) flanking each MuERV wassurveyed for a direct repeat, which is formed during the initialproviral integration.

PBS analysis

A stretch of 18 bp, located immediately downstream of the5′ U5 region, was examined to determine PBSs for the putativeMuERVs. The conserved PBS sequences for tRNAProline (P) andtRNAGlutamine (Q) were used as references (Harada et al., 1979;Nikbakht et al., 1985).

RFLP tropism analysis of full-length MuERVs

Tropism of the putative full-length MuERVs was determinedby in silico RFLP analysis using three restriction enzymes,BamHI, EcoRI, and HindIII, using Vector NTI (Invitrogen).The RFLP data of each putative MuERV were compared to thereference profiles for each tropism (ecotropic, xenotropic, poly-tropic, and modified polytropic) reported previously (Stoye andCoffin, 1988).

Acknowledgments

This study was supported by grants from Shriners of NorthAmerica (No. 8680 to KC) and National Institutes of Health(R01 GM071360 to KC).

References

Antony, J.M., van Marle, G., Opii, W., Butterfield, D.A., Mallet, F., Yong, V.W.,Wallace, J.L., Deacon, R.M., Warren, K., Power, C., 2004. Human endo-genous retrovirus glycoprotein-mediated induction of redox reactants causesoligodendrocyte death and demyelination. Nat. Neurosci. 7 (10), 1088–1095.

Aziz, D.C., Hanna, Z., Jolicoeur, P., 1989. Severe immunodeficiency diseaseinduced by a defectivemurine leukaemia virus. Nature 338 (6215), 505–508.

Boral, A.L., Okenquist, S.A., Lenz, J., 1989. Identification of the SL3-3 virusenhancer core as a T-lymphoma cell-specific element. J. Virol. 63 (1),76–84.

Cho, K., Greenhalgh, D., 2003. Injury-associated induction of two novel andreplication-defective murine retroviral RNAs in the liver of mice. Virus Res.93 (2), 189–198.

Cho, K., Zipkin, R.I., Adamson, L.K., McMurtry, A.L., Griffey, S.M., Green-halgh, D.G., 2000. Differential regulation of c-jun expression in liver andlung of mice after thermal injury. Shock 14 (2), 182–186.

Cho, K., Adamson, L.K., Greenhalgh, D.G., 2002. Induction of murine AIDSvirus-related sequences after burn injury. J. Surg. Res. 104 (1), 53–62.

Choi, Y.W., Henrard, D., Lee, I., Ross, S.R., 1987. The mouse mammary tumorvirus long terminal repeat directs expression in epithelial and lymphoid cellsof different tissues in transgenic mice. J. Virol. 61 (10), 3013–3019.

Conrad, B.,Weissmahr, R.N., Boni, J., Arcari, R., Schupbach, J.,Mach, B., 1997.A human endogenous retroviral superantigen as candidate autoimmune genein type I diabetes. Cell 90 (2), 303–313.

Deas, J.E., Liu, L.G., Thompson, J.J., Sander,D.M., Soble, S.S.,Garry,R.F.,Gallaher,W.R., 1998. Reactivity of sera from systemic lupus erythematosus and Sjogren'ssyndrome patients with peptides derived from human immunodeficiencyvirus p24 capsid antigen. Clin. Diagn. Lab. Immunol. 5 (2), 181–1815.

Griffiths, D.J., 2001. Endogenous retroviruses in the human genome sequence.Genome Biol. 2 (6) REVIEWS1017.

Harada, F., Peters, G.G., Dahlberg, J.E., 1979. The primer tRNA for Moloneymurine leukemia virus DNA synthesis. Nucleotide sequence and aminoa-cylation of tRNAPro. J. Biol. Chem. 254 (21), 10979–10985.

King, L.B., Corley, R.B., 1990. Lipopolysaccharide and dexamethasone inducemouse mammary tumor proviral gene expression and differentiation in Blymphocytes through distinct regulatory pathways. Mol. Cell. Biol. 10 (8),4211–4220.

Knossl, M., Lower, R., Lower, J., 1999. Expression of the human endogenousretrovirus HTDV/HERV-K is enhanced by cellular transcription factor YY1.J. Virol. 73 (2), 1254–1261.

Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: integrated software for mole-cular evolutionary genetics analysis and sequence alignment. Brief Bioin-form. 5 (2), 150–163.

Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody,M.C., Baldwin, J., et al.,2001. Initial sequencing and analysis of the human genome. Nature 409 (6822),860–921.

Lower, R., Boller, K., Hasenmaier, B., Korbmacher, C., Muller-Lantzsch, N.,Lower, J., Kurth, R., 1993. Identification of human endogenous retroviruses

273Y.-K. Lee et al. / Virology 373 (2008) 263–273

with complex mRNA expression and particle formation. Proc. Natl. Acad.Sci. U. S. A. 90 (10), 4480–4884.

Lower, R., Lower, J., Kurth, R., 1996. The viruses in all of us: characteristics andbiological significance of human endogenous retrovirus sequences. Proc.Natl. Acad. Sci. U. S. A. 93 (11), 5177–5184.

Nikbakht, K.N., Ou, C.Y., Boone, L.R., Glover, P.L., Yang, W.K., 1985. Nucleo-tide sequence analysis of endogenous murine leukemia virus-related proviralclones reveals primer-binding sites for glutamine tRNA. J. Virol. 54 (3),889–893.

Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method forreconstructing phylogenetic trees. Mol. Biol. Evol. 4 (4), 406–425.

Samuelson, L.C., Phillips, R.S., Swanberg, L.J., 1996. Amylase gene structuresin primates: retroposon insertions and promoter evolution. Mol. Biol. Evol.13 (6), 767–779.

Schon, U., Seifarth, W., Baust, C., Hohenadl, C., Erfle, V., Leib-Mosch, C.,2001. Cell type-specific expression and promoter activity of human endo-genous retroviral long terminal repeats. Virology 279 (1), 280–291.

Stoye, J.P., Coffin, J.M., 1988. Polymorphism of murine endogenous provirusesrevealed by using virus class-specific oligonucleotide probes. J. Virol. 62(1), 168–175.

Sverdlov, E.D., 2000. Retroviruses and primate evolution. Bioessays 22 (2),161–171.

Tomonaga, K., Coffin, J.M., 1998. Structure and distribution of endogenousnonecotropicmurine leukemia viruses inwildmice. J. Virol. 72 (10), 8289–8300.

Tomonaga, K., Coffin, J.M., 1999. Structures of endogenous nonecotropicmurine leukemia virus (MLV) long terminal repeats in wild mice: impli-cation for evolution of MLVs. J. Virol. 73 (5), 4327–4340.

Trusko, S.P., Hoffman, E.K., George, D.L., 1989. Transcriptional activation ofcKi-ras proto-oncogene resulting from retroviral promoter insertion. NucleicAcids Res. 17 (22), 9259–9265.

Urnovitz, H.B., Murphy, W.H., 1996. Human endogenous retroviruses: nature,occurrence, and clinical implications in human disease. Clin. Microbiol.Rev. 9 (1), 72–99.

van Opijnen, T., Jeeninga, R.E., Boerlijst, M.C., Pollakis, G.P., Zetterberg, V.,Salminen, M., Berkhout, B., 2004. Human immunodeficiency virus type 1subtypes have a distinct long terminal repeat that determines thereplication rate in a host–cell-specific manner. J. Virol. 78 (7),3675–3683.

Wang, Y., Pelisson, I., Melana, S.M., Go, V., Holland, J.F., Pogo, B.G., 2001.MMTV-like env gene sequences in human breast cancer. Arch. Virol. 146(1), 171–180.

Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal,P., et al., 2002. Initial sequencing and comparative analysis of the mousegenome. Nature 420 (6915), 520–562.

Yu, X., Zhu, X., Pi, W., Ling, J., Ko, L., Takeda, Y., Tuan, D., 2005. The longterminal repeat (LTR) of ERV-9 human endogenous retrovirus binds to NF-Yin the assembly of an active LTR enhancer complex NF-Y/MZF1/GATA-2.J. Biol. Chem. 280 (42), 35184–35194.