A highly conserved human gene encoding a novel member of WD-repeat family of proteins (WDR13)



Bhupendra N. Singh, Amritha Suresh, Gogineni UmaPrasad, Subbaya Subramanian, Mehar Sultana, Sandeep Goel, Satish Kumar, and Lalji Singh*

Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad 500 007, India

Received 15 November 2002; accepted 6 December 2002

Abstract

We have identified and characterized a novel member of the WD-repeat motif gene family, WDR13, which contains 9 exons and 8 introns. The gene has been mapped to the genomic locus Xp11.23 by fluorescent in situ hybridization and in silico mapping. Sequence analysis has revealed a continuous open reading frame (ORF) encoding 485 amino acids with six WD motifs. The expression of this gene has been detected in all the tissues analyzed, with significantly varied expression levels among the tissues studied. Analysis of EST clones from various tissues, showing significant homology to WDR13, has identified two spliced variants. The transcription start point has been mapped. Promoter analysis has identified high activity in the 5′ UTR, which interestingly showed a testis-specific activity in the transgenic animals studied. The subcellular localization of the WDR13 protein in the nucleus suggests that it may also have a regulatory role in nuclear function along with protein-protein interaction like other members of the WD family of proteins.

© 2003 Elsevier Science (USA). All rights reserved.

Keywords: WDR13; WD motif; Alternate splicing; Regulatory protein; X chromosome

Introduction

Biomolecular interactions are critical to various cellular processes. Among these processes, protein-protein interactions are vital and have been implicated in a host of cellular functions that control physiology, growth, and differentiation of various cell types. These interactions are mediated by a variety of proteins that are widely present across the taxa and markedly possess structural motifs, which often exist in multiple copies and coordinate these interactions in a stringent manner. Various repeat motifs like tetratricopeptide repeat (TPR) [1], EH [2], and WW [3] have been found to be present in multiple copies in a number of different proteins and have been shown to facilitate specific interactions with their partner protein(s).

WD-repeat proteins constitute another superfamily of proteins that are found in the majority of eukaryotes studied, ranging from yeast to humans [4,5]. The WD repeat, to which the family owes its name, is a conserved motif of nearly 40 amino acids that often ends with the dipeptide Trp-Asp (WD). It was first identified in the β-subunit of heterotrimeric G protein, Gβ, that transduces signals from transmembrane receptors to a variety of second messenger generating effectors [6,7,8]. Gβ protein is also the only WD-repeat protein whose crystal structure is known so far [9,10]. The seven copies of WD repeats in the Gβ protein together form a propeller-like highly symmetrical structure, which is believed to mediate protein-protein interactions [11]. WD proteins are known to have 4–16 repeats [7,12], barring a few exceptions like a Drosophila protein DMX with 30 repeats and its human homologue DMXL1 (with 28 repeats) [13,14]. Most of the WD-repeat family proteins are functionally heterogeneous and are implicated in a variety of cellular functions that encompass signal transduction, transcriptional regulation, pre-mRNA splicing, cell-cycle regulation, cytoskeletal organization, and vesicular fusion [12,4]. In the present study, we have described the identification of a novel human gene, designated as WDR13 (nomenclature approved by Human Gene Nomenclature Committee), which encodes a novel member of the WD-repeat family of proteins. The gene has been completely sequenced, its structural organization unveiled, chromosomal location determined, and promoter region identified in the 5′ UTR, which shows testis-specific activity in the transgenic animals studied. Significantly high levels of sequence similarity across species testify to the evolutionary functional significance of this gene.

* Corresponding author. Fax: 27160252. E-mail address: [email protected] (L. Singh).

Available online at www.sciencedirect.com

Genomics 81 (2003) 315–328 www.elsevier.com/locate/ygeno

0888-7543/03/$ – see front matter © 2003 Elsevier Science (USA). All rights reserved. doi:10.1016/S0888-7543(02)00036-8

Results and discussion

Identification of the cDNA clone

In order to identify genes predominantly expressed in testis and their possible implication in testis-determining function, we screened a human testis cDNA library using a cosmid subclone that was positive to Bkm (Banded Krait Minor satellite DNA), which is preferentially associated with a sex-determining chromosome [15]. One of the positive clones showed a hybridization pattern that suggested its presence in both male and female human DNA and was conserved across many taxa. This clone was predominantly expressed in mouse testis. The fact that crocodilian (Crocodylus palustris) eggs incubated at a higher temperature (32.5°C) all develop as males, whereas those incubated at a lower temperature (30°C) develop as females [16], testifies that the major genes involved in gonadogenesis (testes and/or ovary) must be present in both sexes but expressed predominantly in a sex-specific manner. The organization of gonads being apparently similar in the animal kingdom, the gene(s) involved in gonadogenesis may also be highly conserved. Therefore, we chose to characterize this gene further.

Chromosomal localization of the WDR13 gene

Fluorescence in situ hybridization mapping of WDR13 revealed that the gene is located on the p arm of the X chromosome (Fig. 1). In silico mapping helped in assigning the physical position of the gene to Xp11.23. An EST database search identified several ESTs that assemble into a single UniGene cluster, Hs.12142, which features an STS marker, A007K03, that has been mapped to chromosome Xp11.23 at interval DXS1061-DXS1039 using radiation hybrid panel Genebridge 4. Electronic PCR, using the STS marker, identified a genomic contig of 113 kb (AF196969), which also mapped to the X chromosome at the same physical position. The genomic contig features several STS markers that were uniquely assigned to the same physical position on the X chromosome based on independent mapping assignment using different radiation hybrid panels.

Structural organization of the WDR13 gene

A total genomic clone comprising a 20 kb DNA fragment was completely sequenced. Enough redundancy was generated to ensure the correctness of the sequence. The complete sequence was checked against the database of vector sequences to rule out the possibility of any vector sequence contamination. The genomic sequence was aligned with the cDNA sequence and the exonic breakpoints were identified. The total length of the transcribed portion of this gene was found to be 7.3 kb (Fig. 2A). This contained nine exons and eight introns, the boundaries of which were marked by the presence of highly conserved signature sequences GT and AG at the 5′ and 3′ ends, respectively (Table 1). A canonical hexameric polyadenylation signal (AATAAA) was found at the 3′ end of the cDNA sequence in its last exon (Fig. 3). Interestingly, a similar signature was found to be present 285 bases downstream in the genomic sequence (AF149817) flanking the last exon of this gene (not shown). The identified cDNA sequence was devoid of a poly (A) tail. All exonic sequences were devoid of any repeat elements.
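The boundary rule and the size bookkeeping described above can be checked with a few lines of Python. This is only an illustrative sketch: the exon sizes, intron sizes, and boundary strings are transcribed from Table 1, while the variable and function names are hypothetical and not part of the original analysis.

```python
# Exon sizes (bp) for exons 1-9, transcribed from Table 1.
EXON_SIZES = [229, 241, 110, 131, 308, 181, 142, 119, 344]

# Introns 1-8 from Table 1: (size in bp, first six intron bases, last five intron bases).
INTRONS = [
    (679, "GTGAGG", "TGCAG"),
    (395, "GTGAGC", "TGCAG"),
    (123, "GTATGC", "CCCAG"),
    (601, "GTGAGC", "TGCAG"),
    (1163, "GTCAGG", "CACAG"),
    (100, "GTAGGC", "TGCAG"),
    (2065, "GTGGGT", "CTCAG"),
    (454, "GTGAGT", "GACAG"),
]


def follows_gt_ag_rule(donor: str, acceptor: str) -> bool:
    """Return True if an intron starts with GT and ends with AG."""
    return donor.upper().startswith("GT") and acceptor.upper().endswith("AG")


# Every listed intron should satisfy the GT...AG rule.
all_canonical = all(follows_gt_ag_rule(d, a) for _, d, a in INTRONS)

# Size bookkeeping: spliced length, intronic length, and their sum.
exon_total = sum(EXON_SIZES)
intron_total = sum(size for size, _, _ in INTRONS)
transcribed_total = exon_total + intron_total

print(all_canonical, exon_total, intron_total, transcribed_total)
```

Summing the Table 1 entries gives roughly 1.8 kb of exonic sequence and about 7.4 kb for exons plus introns, which is close to the 7.3 kb transcribed length quoted in the text; the small difference may simply reflect rounding or how the transcript ends were defined.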

Multitissue expression array and the Northern blot analysis of WDR13 and its related transcripts

Analysis of the multiple tissue expression (MTE) array with a gene-specific probe showed a basal level of expression in both the adult and fetal tissues studied (Fig. 4C). However, an increased level of expression was observed in adult tissues like the apex of the heart and associated tissues, brain (putamen, caudate nucleus, and pituitary gland), kidney, placenta, skeletal muscle, and pancreas. Among the fetal tissues, the lung, heart and kidney showed a comparatively higher level of expression. Reprobing of the blot with actin established the integrity of various RNA samples used and authenticated our hybridization conditions.

Fig. 1. Fluorescence in situ mapping of the WDR13 sequences on the human metaphase: the gene maps on the p arm of the X chromosome at Xp11.23. Inset: enlarged view of the X chromosome with the fluorescence signal.

To study the transcripts that were produced from this gene, Northern blots of multiple human tissues were hybridized with DNA probes spanning the complete cDNA region. Consistent with the finding of MTE array profiling, heart, kidney, and pancreas showed an increased level of this transcript. The blot revealed the presence of two transcripts of ~3.0 kb and 2.0 kb (Fig. 4A). The 2.0 kb transcript appears to be the primary product of this gene because a basal level of it was observed in all tissue types analyzed. The level of the 3.0 kb transcript quantitatively paralleled that of the 2.0 kb transcript in the testis, ovary, small intestine and large intestine; however, a marginal increase in its expression was observed in the prostate, spleen, and ovary. Rehybridization of blots with a constitutively expressed actin probe (Fig. 4B) established the quantity as well as the integrity of the RNA in each lane. The testis cDNA clone (2.4 kb), identified earlier, does not correspond with either of the transcript sizes, and may represent an incomplete cDNA. It is also possible that the two transcripts that are observed in testis and other human tissue types represent two distinct products from the same locus being produced as a result of alternative splicing.

Quantitative heterogeneity of the transcripts, identified by expression analysis, does not find support from in silico analysis, as EST-based expression profiling is limited in its scope to reveal anything regarding the level of a given transcript. In the present study, the RNA samples of a given tissue were collected from a group of individuals with ages spanning 10 to 75 years (information was obtained from the manufacturers). Therefore, it would be inappropriate to correlate conclusively the increased level of the WDR13 transcripts with its increased expression in identified tissues on a general basis. A more conclusive analysis of the expression of the gene with respect to age, as well as the physiological and pathophysiological state of an individual, requires large-scale expression profiling with an increased number of samples from the same individual and different sets of tissues from individuals of varying age groups to study the temporal regulation of its expression.

Fig. 2. Genomic organization and alternate splicing in the WDR13. (A) The organization of the exons (E1–8) and introns (I1–8) in the gene is indicated. The comparison between the various cDNA clones obtained from RZPD and the alternate splicing of intron 1 sequence is indicated (intron 1 is shown as an open box). The different ESTs from pancreas, lungs, cerebellum, pituitary, and pineal gland showing homology to the WDR13 sequence are also indicated. (B) Evidence for alternate splicing in WDR13 as shown by the RT-PCR profile of human and mouse testes RNA with the primers encompassing exons 1–3. The PCR using the same primers in the testis cDNA clone gave a 1.5 kb product due to the presence of intron 1. (C) The PCR profile of the different EST clones with the intron primers and the E3R primer, indicating the presence of the intron in them. The retinal cDNA clone showed the presence of the entire intron 1 amplified with E1–3 primers.

In silico expression profiling, evaluation of the integrity of WDR13 cDNA sequence, and identification of variant spliced products

A sequence homology search against the EST database revealed the presence of several ESTs from human and other organisms that showed a significantly high level of similarity with the queried testis cDNA sequence. The cDNA clones producing the ESTs possessed DNA fragments ranging between 1.8–2.0 kb, similar to the size of a major transcript identified through Northern hybridization. A UniGene database search identified a cluster of more than 100 ESTs (UniGene ID # Hs.12142) derived from several human cDNA libraries of various embryonic, adult human tissues, and different cancerous tissues corresponding to this cDNA sequence. Viewing the heterogeneity of RNA samples and the presence of different sized transcripts in the Northern blot analysis, we speculated the presence of spliced variants of this gene. The hypothetical UniGene cluster (Hs.12142) aligned with the testis cDNA sequence from exon six onwards to its 3′ end, while the 5′ ESTs aligned only from exon 2 onwards. The removal of a 679 bp intervening intronic sequence, identified based on the presence of 5′ GT and AG signatures, resulted in a proper alignment and redefining of the WDR13 sequence (AF329819). The originally identified 2.4 kb testis cDNA (AF158978) showed only eight exons due to the presence of an unspliced intron one sequence. The 3′ end sequences of a few clones possessing poly (A) tails (A1025224, NM017883), which extended downstream of the last exon, corroborated with the genomic sequence and were incorporated into the WDR13 cDNA sequence submitted in the database. The resulting WDR13 cDNA sequence measured up to 1.8 kb in size, and transcriptional start point analysis (discussed later) revealed the 5′ end of the cDNA and established the size of the transcript at 2.1 kb, which corroborates with the major transcript in the Northern blot analyses.

Redefining the structure of testis cDNA sequence as WDR13 does not exclude the possibility of the presence of the original sequence in its own structural format. Sequence analysis of EST clones from different human tissues, namely, neuroepithelial cells (AA074569), eye (AA046947), brain (R87503, H39107), stomach (A1801038), and uterus (AI699021), which showed similarity with WDR13 cDNA sequence (Human Genome Resource Centre, Germany; RZPD), revealed the presence of the complete intron one in eye, and its partial sequence in the other clones. The intron was absent in the neuroepithelial clone (Fig. 2A). All cDNAs were matured at their 3′ end with the characteristic poly A tail, and other introns from 2–8 were spliced off. RT-PCR analysis of human testes RNA samples using different exonic primers revealed the presence of an amplicon (Fig. 2B), the size and sequence analyses of which confirmed the integrity of WDR13 cDNA sequence. The earlier identified testis cDNA sequence could also subscribe to a variant-spliced product of the transcript. However, in support of the presence of some spliced variants in other tissues, a human retina cDNA clone (MG21) was identified from the EST database, the 5′ end of which starts 276 bases upstream of exon one of WDR13. From there onwards its sequence is collinear with exon one, intron one, exon two, intron two, and exon three of WDR13. Colinearity breaks at the 5′ end of the third intron of WDR13, but the 3′ end of the putative MG21 intron extends farther by 12 bases into the fourth exon of WDR13 (Fig. 2A and Table 1). The retinal cDNA is incomplete at both its 5′ and 3′ ends. We need to decipher its entire length to unveil its structural detail and functional significance.

Table 1

S.No.      Exon size (bp)  5′ Splice donor    Intron size (bp)  3′ Splice acceptor
1          229             cgcgag/GTGAGG–––   679               ––TGCAG/gtac
2          241             atggag/GTGAGC–––   395               ––TGCAG/gact
3          110             ggacag/GTATGC–––   123               ––CCCAG/gcccCTGGCAG
4          131             aggcag/GTGAGC–––   601               ––TGCAG/tccc
5          308             actgtg/GTCAGG–––   1163              ––CACAG/gtgg
6          181             ccacag/GTAGGC–––   100               ––TGCAG/ggaa
7          142             ctacag/GTGGGT–––   2065              ––CTCAG/ggtg
8          119             gcgtgg/GTGAGT–––   454               ––GACAG/tgac
9          344
Consensus                  naag/GTAAGT–––                       ––TNCAG/gnnn
                           ct GGAA CC t

Note. Introns are invariably flanked by GT at the 5′ end and AG toward the 3′ end. Alternate nucleotides given in the consensus are based on their occurrence in the maximum of boundary sequences studied. Sequences of introns and exons are shown in uppercase and lowercase, respectively. Note another AG signature in the exon (shown in uppercase) that flanks the third intron of WDR13, which is used during the splicing of the MG21 intron. The AG that marks the 3′ end of the WDR13 third intron is bypassed in this case.

Fig. 3. Complete sequence of human WDR13 cDNA and its putative protein. The 1819 nucleotides of cDNA contain an open reading frame that encodes a putative protein of 485 amino acids. Initiation codon ATG and stop codon TAG are in boldface and underlined. Amino acid residues that mark the WD repeats are shown in boldface, while the nucleotide sequence in bold lowercase represents the polyadenylation signature. Nucleotides shown in lowercase are untranslated regions. Primers corresponding to different exons that were used in RT-PCR are underlined and marked with F for forward and R for reverse. The arrow points to the orientation of primers from the 5′ to the 3′ end. Nucleotide and amino acid residue numbers are indicated to their right side.

Nearly 35% of human genes show variably spliced products produced from a single gene [17]. Variation in transcript structure and size is introduced in numerous ways [18,19]: exons can be spliced into the mRNA or skipped, introns can be retained, and the 5′ or 3′ splice sites and the transcriptional start site can be altered. The identification of the testis cDNA clone, and a few cDNA clones identified from the database, which showed retention of the intronic sequence, suggests alternate splicing of the WDR13 transcript. In addition, the human retinal cDNA clone (MG21) is a variant-spliced product obtained by the usage of another splice site AG, which marks the 3′ end of its putative intron. The 3′ end of the WDR13 cDNA sequence contains a highly conserved hexameric signature for polyadenylation preceding the poly (A) tail in the last exon of the transcript and a second one 285 bases downstream of the end point of the WDR13 cDNA sequence. ESTs sequenced from cDNA 3′ ends are expected to provide multiple examples of alternate polyadenylation in human mRNAs [20], and the UniGene build (Hs.12142) features 3′ ESTs that harbor the same polyadenylation signature found in the last exon of the WDR13 cDNA sequence; hence, we believe that the first hexameric signature is used for polyadenylation of all messages corresponding to the WDR13 transcript. Nevertheless, we do not exclude the possibility of the usage of the second hexameric signature for polyadenylation of a variant-spliced product that may be produced from this locus in some specialized condition. These results, when considered in combination with the Northern blot analysis that identified two variant transcripts, suggest the occurrence of varied transcripts in different tissues.
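Locating candidate polyadenylation hexamers of the kind discussed above amounts to a simple motif scan. The sketch below is a minimal illustration, not the procedure used in the study: the function name is hypothetical and the toy sequence is made up purely to show two AATAAA occurrences, so it does not represent the real WDR13 3′ region.

```python
def find_polya_signals(seq: str, signal: str = "AATAAA") -> list[int]:
    """Return 0-based start positions of every occurrence of the hexamer."""
    seq = seq.upper()
    signal = signal.upper()
    positions = []
    start = seq.find(signal)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping matches are not missed.
        start = seq.find(signal, start + 1)
    return positions


# Made-up example sequence with two signals separated by a 20-base spacer.
toy_sequence = "gcgt" + "aataaa" + "c" * 20 + "aataaa" + "tt"
signal_positions = find_polya_signals(toy_sequence)
print(signal_positions)
```

Applied to a genomic sequence that extends past the last exon, a scan like this would report both the hexamer inside the exon and any further candidates downstream, which can then be compared against the 3′ ends of ESTs.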

WDR13 coding sequences are highly conserved across the phylogeny

We chose to characterize this gene because it revealed positive hybridization to DNA samples of different organisms in a Southern hybridization analysis. Database searches to identify its orthologs in other organisms resulted in the identification of several ESTs from different organisms that include pigs, cattle, mice, zebra fish, and Xenopus laevis. Similar to humans, these ESTs belong to different tissue types and various stages of development. All of them showed a significantly high level of sequence similarity with the corresponding regions of WDR13 coding sequences. Importantly, we have identified a mouse testis cDNA clone (AF220146) that revealed 88% sequence similarity at DNA level with a complete coding sequence of WDR13. This suggests that the complete WDR13 gene is highly conserved across the phylogeny. It is noteworthy here that we did not obtain any sequence bearing similarity to the intronic regions of WDR13 in any of these organisms.

Fig. 4. Expression analysis of WDR13. (A) Northern hybridization of human multiple tissue RNA blot with the complete cDNA probe that revealed the wide distribution of WDR13 transcripts in different human tissues such as spleen, thymus, prostate, testis, ovary, small intestine, colon, peripheral blood leucocytes, heart, brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas. Note an increased level of 2.0 kb transcript in heart, skeletal muscle, kidney, and pancreas, as well as the notable absence of 3.0 kb transcript in heart, brain, liver, and skeletal muscle. (B) Rehybridization with an actin probe shows, by and large, an equal amount of the RNA of these tissues except for the liver sample. (C) Human multitissue expression (MTE™) array with the WDR13 cDNA probe. Arrowheads point to the increased level of transcripts in that particular tissue type. Note the absence of the signal in column 12 that contained nonspecific DNA and RNA samples from other organisms.

Analysis of open reading frames in cDNA sequence and identification of WD repeats in the predicted protein sequence

Since the orientation of WDR13 cDNA was established with identification of a canonical polyadenylation signature in the cDNA sequence and GT-AG boundaries of introns, we considered ORFs only from positive reading frames during analysis. The longest ORF was predicted in the third frame (Fig. 3) and the presence of a Kozak translation initiation sequence [21] flanking the ATG indicated that it was the correct translation initiation site. The ATG is present in the first exon of the cDNA sequence, leaving 500 bp of UTR toward its 5′ end. The deduced protein sequence comprises 485 amino acids with an estimated molecular mass of 53 kDa. The ORF analysis of the 2.4 kb transcript showed the longest ORF in the first frame. A 5′ UTR of 1.5 kb was observed in this case and the encoded protein was 393 amino acids (43 kDa). The latter sequence differed only with respect to the deletion of the initial 92 amino acids. The 3′ UTR is 161 bases long in both cases and ends with a perfect polyadenylation signal preceding 94 bases of poly (A) tail. Primary sequence analysis of the putative protein predicted this to be a basic protein with a pI value of 9.4. A database search with the predicted protein sequence did not identify any protein with a significant similarity at the primary sequence level. A PROSITE database search identified regions bearing a significantly high level of homology with the class of structural repeat called WD repeat, which is based on the profile that is generated after the alignment of different WD repeats of various WD repeat-containing proteins listed on the PSA server at the Biomolecular Engineering Research Center (http://bmercwww.bu.edu). Six such potential WD repeats with significantly high profile probabilities are found to be present in the putative protein (Fig. 3). BLASTP searches with the protein sequence did not identify any significant match except for the WD repeat-encoding region. This suggests that the putative protein is a novel member of the WD-repeat family of proteins. Furthermore, the PROSITE scan identified patterns that correspond to a cAMP- and cGMP-dependent protein kinase phosphorylation site, a protein kinase C phosphorylation site, and several casein kinase II phosphorylation sites. Notably, all signatures indicative of post-translational modification of the protein residues were predicted in the amino terminal domain of the putative protein. It was also interesting to note that the first 92 amino acid motif was highly hydrophilic.
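The relationship between the two predicted proteins can be made concrete with a little arithmetic. The sketch below is a rough consistency check only: it assumes the common rule of thumb of about 110 Da per amino acid residue (an assumption, not a value taken from the paper), and all names in it are hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a figure from the paper.
AVERAGE_RESIDUE_MASS_DA = 110


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from its length in residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


full_length_residues = 485  # protein from the longest ORF (third frame)
short_form_residues = 393   # protein predicted from the 2.4 kb transcript

# Residues missing from the shorter form at the amino terminus.
n_terminal_difference = full_length_residues - short_form_residues

# Coding nucleotides for the full-length ORF, counting the stop codon.
full_length_orf_nt = (full_length_residues + 1) * 3

print(n_terminal_difference)
print(round(approx_mass_kda(full_length_residues)), round(approx_mass_kda(short_form_residues)))
print(full_length_orf_nt)
```

Under this approximation the two lengths round to about 53 kDa and 43 kDa, in line with the masses quoted above, and the length difference matches the 92-residue amino-terminal deletion.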

Transcriptional start point mapping

The full length of the cDNA was ascertained by RT-PCR using upstream primers corresponding to the genomic sequence. The transcriptional start point of the human testes transcript was then mapped by the 5′ RACE technique. Nested amplifications done by gene-specific primers antisense to the cDNA at about 80–100 bp of the first exon gave a product of 400 bp (Fig. 5). Sequencing of the product revealed the start point at about 317 bp upstream of the existing first exon of the cDNA clone initially isolated, which indicates the size of the cDNA as 2.1 kb, corroborating the transcript size obtained in the Northern analysis. RT-PCR analysis using intronic primers also indicated that the transcript was devoid of the intron and is hence representative of the smaller transcript. The sequence obtained was found to align with the cDNA sequence of UniGene (Hs12142) and the brain clone (R36037).

Overexpression of WDR13 protein

Overexpression of the entire ORF of the 2.4 kb transcript resulted in an expressed product of 43 kDa (Fig. 6A). The expression was about 30% and it was expressed as an insoluble fraction. The overexpressed product was extracted from SDS-PAGE gel and used in raising the antibody against this protein. Unfortunately, attempts to raise the antibody for the purified overexpressed protein (43 kDa) were not successful. Repeated doses of the protein into two rabbits for a period of 10 months did not show any significant titer of antibodies. Similar attempts in chicken and guinea pigs for a period of 3–4 months also produced a similar result. Earlier attempts in raising the antibody for a protein with WD motifs in a different laboratory have shown that they could raise the antibody only after 8 months [22]. We suspect that, because the gene is conserved across taxa, the purified protein is not being recognized as an antigen. The protein sequence analysis has revealed that the first 92 amino acids region is highly hydrophilic and thus might be more antigenic.

Fig. 5. 5′ Rapid amplification of cDNA ends. This figure shows the 400 bp amplified product of 5′ RACE, the subsequent sequencing of which indicated the transcriptional start point (+1) of the mRNA transcript.

Subcellular localization of WDR13

The GFP-tagged WDR13 protein was used to ascertain the subcellular localization of the protein encoded by both of the alternatively spliced transcripts. The ORFs of both the alternatively spliced transcripts were cloned in frame with the GFP cDNA in the pEGFPC1 (full length, 53 kDa) and pEGFPC2 (from alternative ATG, 43 kDa) vectors (Clontech). The transfection analysis was done in the CHO cell line and the localization was visualized by confocal microscopy. The 53 kDa protein was found to localize to the nucleus while the 43 kDa protein was present in both the nucleus and cytoplasm (Fig. 6B). This indicates the presence of a Nuclear Localization Signal (NLS) and a Nuclear Export Signal (NES) in the protein. However, the search for NLS/NES signals in the protein sequence did not find any such signals, indicating the presence of a novel NLS/NES. There is growing evidence that gene transcription can be controlled by the nuclear import of transcription factors. The nuclear localization of WDR13 signifies that the WD family of proteins mainly constitutes members that are regulatory in function.

Analysis of the putative regulatory regions

In silico analysis of the putative sequences indicated the presence of various transcription factor-binding sites like that of AP-1, SP-1, GATA-1, and GC box. In an effort to study the 5′ upstream and the UTR regions in the regulation of the gene, a series of fragments were cloned into the promoterless pGL3-Basic vector coding for the luciferase reporter gene, and transfected into HeLa and HepG2 cell lines after ascertaining the expression of the gene in the same. Luciferase assay analysis in both the cell lines indicated activity of 50-fold for the fragment (−415/+1533) that encompasses about 400 bp of the 5′ upstream and the entire 1.5 kb of the 5′ UTR in comparison with the basic vector. The region �415/�390 showed a 10-fold activity while the construct, including the UTR �316/�1533, showed 40-fold activity. Deletions of the regions extending from �415/�274, �415/�100 reduced the activity to basal levels. Similarly, a deletion of the region �316/�656 containing the GATA-1 site also reduced activity to basal levels. The activity levels were the same for both HeLa and HepG2 cell lines (Fig. 7A). The construct (�316/�1533), which gave the maximum reporter activity, was used to construct a transgenic line of mice to give a further insight into the regulation of the gene. The linear cassette, including the promoter construct upstream to the luciferase reporter gene, was used for microinjection of mice embryos. The tissues from five transgenic mice were assayed for luciferase activity and compared with nontransgenic normal mice. The luciferase assay revealed an almost 200-fold increase in the activity in the testes when compared to other tissues like the brain, kidney, liver, heart, and spleen (Fig. 7B). The brain tissue also showed a marginal increase in reporter activity. Transgenic females did not show any difference in activity with respect to the different tissues analyzed.

Fig. 6. (A) shows the overexpression of the protein in a bacterial host. The lanes are represented as nonrecombinant (N), recombinant (R), marker (M), soluble fraction (SO), and insoluble fraction (IN). (B) shows the subcellular localization of the WDR13 protein in CHO cells. Panel A shows the 53 kDa protein localizing only in the nucleus. Panel B shows the localization of the 43 kDa protein in both the cytoplasm and nucleus. Panel C shows the control GFP localization in both the cytoplasm and nucleus.

Analysis of the putative regulatory regions of the gene by cell culture and transgenics revealed a major activity confined to the 5′ untranslated region. The 5′ UTR therefore seems to play a role in the post-transcriptional regulation of one of the alternatively spliced transcripts, and the manifold higher expression of the reporter gene in the testis points to the tissue specificity of the promoter and a testicular function for the gene. Involvement of the 5′ UTR in gene regulation has been well documented [23,24]. The above region, which showed the 40-fold activity, contains a binding site for GATA-1, a transcription factor that is highly expressed in testis [25]. The functional relevance of this transcription factor in the tissue-specific expression of the gene needs to be ascertained.

We have described the isolation and characterization of a novel human gene designated as WDR13. Its characterization was initiated because it showed positive hybridization to DNA samples from various organisms. This observation was further substantiated by the similarity of its sequence to many ESTs from organisms ranging from mammals to amphibians, derived from tissues collected at different developmental stages and physiological states. The identification of a mouse ortholog of this gene that shows 88% similarity over its entire cDNA sequence is notable in this regard.

The WDR13 gene encodes a putative protein of 485 amino acids that shows significant homology with WD repeat-containing proteins owing to the presence of six potential WD repeats in the C-terminal portion; the absence of homology in the N-terminal amino acids marks it as a novel protein. The WD-repeat proteins identified to date from different organisms exhibit a high degree of functional diversity, despite possessing a common sequence motif and, presumably, a common three-dimensional structure. All WD-repeat proteins are known to be regulatory and none acts as an enzyme. They are grouped into two major classes: one in which the proteins are composed almost entirely of the repeat motifs and a second in which the repeats are restricted to the C-terminal domain [26]. Proteins in the first group are known to serve as β-subunits in the heterotrimeric G-protein complexes that transduce receptor-generated signals. Mammalian β-transducins and S. cerevisiae Ste4 fall into this class. Proteins in the second group are involved in activities as diverse as microtubule-dependent processes (Cdc20), catabolite repression (Tup1), regulation of the RAS-cAMP pathway (Msi1), RNA splicing (Prp4), DNA replication (Cdc4), and neurogenesis (E-spl) [26]. There is no evidence that proteins of this group form part of the heterotrimeric G-protein complexes. The repeat is nevertheless expected to be involved in protein-protein interactions, as demonstrated for the transducin β-γ-subunit complex [27]. In most cases where function is known, it is not clear whether functionality should be attributed to the WD-repeat domain itself or to the amino- or carboxy-terminal extension of the WD-repeat protein. It has been speculated that the WD repeat combines a conserved core structure with variable regions that probably translate into the surface [4], and that the repeat might fold with a variable loop preceding the repeat followed by a β-strand/turn/β-strand/turn/β-strand ending with WD [28,11]. This structure would only be stable in the presence of a ligand (such as a metal ion) or through contact with other WD repeats. Because the WD-repeat family of proteins is known to mediate protein-protein interactions, it is probable that functionally similar WD-repeat proteins have similar binding partners. We examined the database, which consists of more than 30 clusters of WD-repeat proteins that are assumed to have common binding partners [4], but failed to identify any such similarity with our putative protein sequence. It is noteworthy that a simple correspondence with WD repeats does not give any clue regarding functionality. Therefore, in assigning proteins to functional families, one must distinguish predictions of the probable binding partners of the WD-repeat domain from predictions of the overall cellular role of the protein. The latter will be influenced by both the WD-repeat domain and the adjoining segments. The WD-repeat family proteins have also been classified functionally on the basis of signatures present in their adjoining N- or C-terminal domains [4]. The extensions in these regions might define the subcellular localization of the protein and thereby define discrete cellular roles for structurally similar WD-repeat proteins. In the present study, the WDR13 protein was found to have a nuclear localization, but we did not identify any known nuclear localization signal in the adjoining amino-terminal domain of the protein. Nevertheless, the nuclear localization does suggest a regulatory function for the protein, as is the case for most WD family proteins. The identification of potential phosphorylation sites for different kinds of kinases suggests only a possible post-translational modification of this protein, which may be a prerequisite for its appropriate function.

Fig. 7. (A) Relative luciferase activity of promoter constructs. The graph shows the activity of the different deletion constructs of the promoter region in two cell lines, HeLa and HepG2. The activity is expressed as fold activity with respect to the activity of the pGL3-Basic vector. The darker bar shows the activity in the HepG2 cell line, while the lighter bar shows that in the HeLa cell line. (B) Relative luciferase activity (RLU/mg) in different tissues of the transgenic animals generated using the 5′ UTR construct. Note the increased luciferase activity in the testis.

A number of WD-repeat proteins have been identified in humans. They have been assigned important functions, and loss-of-function mutations often result in severe phenotypes; for example, a splice variant of the Gβ subunit resulting in the deletion of one WD-repeat unit was shown to be associated with essential hypertension [29]. Loss of function of PEX7 (peroxisome biogenesis factor 7) is the cause of rhizomelic chondrodysplasia punctata [30,31], and the CKN1 gene was shown to be responsible for Cockayne syndrome [32], which is associated with a genetic defect in repairing UV-induced DNA damage in transcriptionally active DNA [33]. The TUPLE1/HIRA gene, which codes for a protein acting on chromatin structure to control gene transcription, is a candidate gene for DiGeorge syndrome [34]. Other human WD-repeat proteins with important functions are the nuclear retinoblastoma-binding proteins RBBP4 [35] and RBBP [36], APAF1 (apoptotic protease activating factor 1) involved in the initiation of apoptosis [37], the p60 subunit of chromatin assembly factor I (CAF1p60) (7), and NS-MAF, which plays a crucial role in TNF signaling [38].

These examples, encompassing a repertoire of diverse biological functions executed by this family of proteins, explain the difficulty in assigning a particular function to a WD-repeat protein. Against this background, the wide distribution of WDR13 transcripts in various human tissues, in combination with its nuclear localization, indicates a probable regulatory function. Furthermore, the identification of a testis-specific promoter suggests a significant role for the gene in the testis. The identification of a mouse ortholog bearing a high level of sequence similarity, together with conservation across phyla, further underscores the evolutionary importance of the biological role(s) of this gene. Additional studies, such as expression of the putative protein, identification of binding partners in different tissues, and, most importantly, large-scale expression profiling in mouse tissues followed by gene knock-out analysis, are needed to assign a probable function to this gene. We are currently working in this direction.

Materials and methods

Screening of cDNA and genomic libraries and identification of clones

A subclone of a Bkm (banded krait minor satellite)-positive human cosmid clone [15] was used to screen a human testis cDNA library (Clontech cat. # HL1010B). Positive clones, identified after three rounds of screening, were reassessed by Southern hybridization. The cDNA inserts, isolated after restriction digestion of the clones, were subcloned into suitable plasmid vectors. A human genomic library constructed in the lambda phage cloning vector EMBL3 SP6/T7, obtained from Clontech (cat. # HL1111j), was screened using a cDNA probe isolated from the above screening. Several clones were obtained, of which eight were established as true positives after iterative screening. Clones were typed by restriction digestion with different enzymes followed by electrophoretic fractionation to assess their similarity and to determine the insert sizes. A clone named T15, which harbored most of the cDNA, was used for further analysis.

Cloning and sequencing of DNA fragments

All subcloning was performed in the plasmid vectors pBluescript (Stratagene) and pGEM3-Zf (Promega). Plasmid DNA was prepared either by the routine alkali lysis method followed by cesium chloride density gradient centrifugation [39] or with the Wizard plasmid DNA purification kit (Promega). DNA inserts were eluted with the GENECLEAN kit (BIO Scientific, USA). The T15 genomic clone was digested with different restriction enzymes and the resulting fragments were cloned. Several of the subclones produced were further manipulated with suitable enzymes to generate subclones with smaller inserts. To generate nested subclones for sequencing, exonuclease III-based deletion subcloning was performed using the Erase-a-Base kit (Promega). Sequencing reactions were performed with universal M13 primers and T3 and T7 sequencing primers, using the ABI PRISM BigDye Terminator Cycle Sequencing kit (Perkin Elmer (PE), Applied Biosystems). Sequencing was carried out on ABI PRISM 377 and 3700 automated DNA sequencers (PE, Applied Biosystems), following the manufacturer’s protocols. Both strands of each clone were sequenced. Sequences were assembled and aligned using AutoAssembler Version 1.4 DNA sequence assembly software (PE, Applied Biosystems). Internal gene-specific primers were used to sequence the remaining gaps and to resolve any ambiguity observed in the sequences generated. PCR-amplified fragments were cleaned up using the PCR purification kit (QIAGEN) and sequenced using different gene-specific primer pairs.

Chromosomal localization of the WDR13 gene

Fluorescence in situ hybridization (FISH)

The genomic clone (T15) containing the WDR13 gene was labeled with biotin-16-dUTP by standard nick translation or by random priming (Roche Molecular Biochemicals, Mannheim, Germany), and 100 ng of the labeled DNA was used as a probe. Human metaphase spreads were prepared from lymphocyte cultures, and FISH was performed by incubating the chromosomal preparations with the labeled probe at 37°C for 16 hours in a moist chamber. Washing and detection were carried out using standard protocols as described in the Roche FISH laboratory manuals. Slides were observed under a Zeiss fluorescence microscope and the images were recorded and analyzed using the CytoVision program.

In silico hybridization

Computational physical mapping of the WDR13 gene was performed using Human Genome BLAST and UniGene database searches. Chromosomal assignment of a new gene relies on the known map position, on a human chromosome, of the UniGene cluster that corresponds to the gene of interest. Similarly, Human Genome BLAST identifies a genomic contig in the database corresponding to the queried sequence.

Database search and sequence analysis

The complete sequence was screened for vector contamination by performing BLAST [40] against the vector sequence database on the National Center for Biotechnology Information (NCBI) server. The cDNA and genomic sequences were submitted to GenBank under Accession Nos. AF158978 and AF149817, respectively. Sequence homology searches were performed with each exonic and intronic sequence against the mammalian genome database, the human and mouse gene indices, ESTs (expressed sequence tags), and the THC (tentative human consensus sequence) database on the TIGR (The Institute for Genomic Research; http://www.tigr.org) server. We searched the UniGene database at NCBI to identify a build corresponding to our cDNA sequence. The cDNA sequence was analyzed for various consensus signatures and other structural features using GCG (Genetics Computer Group, Wisconsin; http://www.accelrys.com/products/gcg_wisconsin_package) and other packages available on the web. ORF analysis was performed using the ORF Finder program at NCBI. The putative protein sequence was searched for homology against the Swiss-Prot database. Protein sorting was predicted with PSORT II (a program to predict the subcellular localization of proteins) on the ExPASy server (http://www.expasy.ch). Secondary structure elements in the putative protein were predicted using the PHD threader accessed through the PredictProtein server at EMBL and on the Protein Structure Analysis (PSA) server of BMERC, Boston University, USA. The PROSITE [41] database was searched to identify potential structural motifs and patterns. WD-repeat motifs were analyzed on the PSA server. Sequences from each putative repeat were aligned with the consensus derived from Smith et al. (1999) [4] by using the BESTFIT and PILEUP programs from the GCG package. The initial alignment was adjusted manually to improve the correspondence of WD-repeat consensus residues.

Expression analysis in human tissues

Multitissue expression array analysis

The Multi Tissue Expression (MTE™) Array (Clontech) was used to profile the expression of WDR13 in different human tissues. The MTE blot carried polyA+ RNA from 76 different human tissues blotted onto a nylon membrane. The blot was probed with the cDNA fragment labeled with [α-32P]dATP by random priming. Hybridization was performed as specified by the manufacturer in ExpressHyb solution at 65°C for 6 hours, followed by washes with 2× saline sodium citrate (SSC) and 1× SSC, initially at 65°C and then at 55°C, twice for 20 minutes each. The blot was autoradiographed at −70°C for the required duration. It was later reprobed with actin in a similar manner to ascertain the integrity and quantity of the RNA on the blot.

Multiple tissues Northern blot analysis

Human Multiple Tissues Northern blot (MTN™)1 (Clontech), containing a range of tissues, was used for Northern analysis. Each lane of the MTN blots contained 2 μg of polyA+ RNA from a different human tissue. Hybridization was performed at 65°C for 6 hours in ExpressHyb hybridization solution (Clontech). A 32P-labeled DNA probe of high specific activity (3 × 10⁸ cpm/μg) was used at a concentration of 5 ng/ml. The washing conditions were: 20 minutes at room temperature in 2× SSC, 0.1% SDS; 10 minutes at 65°C in 2× SSC, 0.1% SDS; twice for 10 minutes each at 65°C in 1× SSC, 0.1% SDS; and 10 minutes at 65°C in 0.1× SSC, 0.1% SDS. Blots were autoradiographed as described earlier. Rehybridization with an actin probe and post-hybridization processing were performed in the same manner as described for the cDNA probe.

RT-PCR analysis of human testes RNA samples

RT-PCR was performed using the RNA-PCR core kit (PE, Applied Biosystems). MMLV reverse transcriptase was used to reverse transcribe human testis RNA (Clontech), following the manufacturer’s protocol. PCR was performed using exon-specific oligonucleotide primers with an initial denaturation at 94°C for 3 minutes, followed by 25 cycles of 92°C for 30 s, 65°C for 1 minute, and 72°C for 2 minutes, and a final extension at 72°C for 5 minutes. Amplicons were purified by exonuclease treatment and sequenced, as described earlier, to confirm the integrity of the cDNA sequence. GAPDH primers were used as a control to check the integrity of the RNA and the efficiency of the RT-PCR reaction.
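The cycling program described above can be written down as data, which makes it easy to check the programmed run time. The following is only an illustrative sketch: the `Step` class and `program_seconds` function are hypothetical names introduced here, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


def program_seconds(initial, cycle, n_cycles, final):
    """Sum of all hold times; ramping between temperatures is ignored."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


# RT-PCR program as stated in the text:
# 94 C for 3 min; 25 cycles of 92 C 30 s, 65 C 1 min, 72 C 2 min; 72 C for 5 min.
RT_PCR = {
    "initial": Step(94, 180),
    "cycle": [Step(92, 30), Step(65, 60), Step(72, 120)],
    "n_cycles": 25,
    "final": Step(72, 300),
}

total = program_seconds(**RT_PCR)
print(f"{total} s = {total / 60:.1f} min of programmed hold time")
```

The same structure could hold the other programs in this section (5′ RACE, deletion cloning, ORF cloning) by changing the step values.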

Transcriptional start point mapping

Transcriptional start point mapping was carried out using the 5′ RACE kit (GIBCO-BRL). Total RNA from human testes was used and the reaction was conducted according to the specifications provided. First-strand synthesis was done using AMV reverse transcriptase and the gene-specific primer GSP1 (5′ TCCTCATAGACTGCACGGTTC 3′). The cDNA was then purified on the column provided and homopolymeric tailing was done with dCTP using terminal transferase. The tailed cDNA was amplified using GSP2 (5′ TTGTACCTCGCGTCCACTGC 3′) and the Abridged Anchor Primer provided in the kit. Nested amplification of the product was done using GSP3 (5′ CAGGTGACCGTTGCCATAGAC 3′) and the Upstream Anchored Primer. The cycling conditions were an initial denaturation at 94°C for 5 minutes, followed by 35 cycles of 94°C for 30 s, 58°C for 1 minute, and 72°C for 1 minute, and a final extension at 72°C for 5 minutes. The product obtained was purified by exonuclease treatment and sequenced as described earlier.
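In 5′ RACE the gene-specific primers anneal to the mRNA or first-strand product, so they are antisense to the transcript. A minimal sketch, using only the three primer sequences quoted above, shows how the sense-strand sites they correspond to can be obtained by reverse complementation. The helper names `reverse_complement` and `gc_fraction` are hypothetical and not part of the original methods.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Gene-specific primers as given in the text (5' to 3').
PRIMERS = {
    "GSP1": "TCCTCATAGACTGCACGGTTC",
    "GSP2": "TTGTACCTCGCGTCCACTGC",
    "GSP3": "CAGGTGACCGTTGCCATAGAC",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), "nt, GC", round(gc_fraction(seq), 2),
          "sense-strand site:", reverse_complement(seq))
```

Searching the cDNA sequence for each reverse-complemented primer would show the order of the three sites along the transcript; that check is not performed here because the cDNA sequence is not reproduced in this section.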

Sequence analysis of the promoter and deletion cloning

The putative promoter sequence was analyzed for the presence of transcription factor-binding sites by using different tools such as TFSEARCH (www.cbrc.jp), MatInspector (www.transfac.gbf.de), SIGNALSCAN (bimas.dert.nih.gov), and NNPP (www-hgc.lbl.gov); consensus signals identified by these programs were used in further analysis.

The luciferase reporter system (Promega) was used in this study to assay the activity of the putative promoter regions of the gene. A genomic clone (T10) identified in the earlier screen included a major part of the upstream region and was used for promoter analysis. A 2.0 kb fragment of the clone encompassing the upstream region and the 5′ UTR was cloned into the multiple cloning site of the promoterless pGL3-Basic vector, upstream of the luciferase reporter gene. Further deletions were constructed in the vector by PCR-based methods. Different forward primers with a KpnI tail and a common reverse primer with a BamHI site were used to create products with varied deletions at the 5′ end of the promoter sequence. PCR was done with an initial denaturation at 94°C for 5 minutes followed by 35 cycles of 94°C for 30 s, 68°C for 30 s, and 72°C for 1 minute, and a final extension of 5 minutes at 72°C. The PCR products were purified, digested with the respective enzymes, and cloned into the pGL3-Basic vector. All clones were sequenced to ascertain their orientation and integrity.

Cloning of human ORF from WDR13 cDNA in the expression vector

To characterize the protein, the entire ORF was cloned into an expression vector by PCR-based methods. Amplification was done using primers with an EcoRI site (PE2FE, 5′ CGGAATTCATGGAGGACTTTGAG 3′; PE28RE, 5′ CCGGAATTCAGCATGACCACCG 3′). The thermal cycling conditions were an initial denaturation at 94°C for 3 minutes followed by 30 cycles of 94°C for 30 s, 65°C for 30 s, and 72°C for 1 minute, and a final extension at 72°C for 5 minutes. The product was cloned into the pCR2.1 vector (Invitrogen) and the orientation of the clone was ascertained by sequencing with M13 forward and reverse primers. The insert was then cloned into pET-21a(+) and overexpressed in the BL21 (DE3) strain. Expression of the target gene was induced by the addition of IPTG following the induction protocol (Novagen).

Cell culture and transfection

HeLa cells (human cervical carcinoma) and HepG2 cells (human hepatic cell line) were cultured in Dulbecco’s modified Eagle medium (Sigma) containing 10% fetal calf serum, 60 μg/ml of penicillin, and 50 μg/ml each of gentamicin and streptomycin. The cells were grown at 37°C in an atmosphere of 95% air and 5% CO2. The cells were subcultured at 80–90% confluency with 0.1% EDTA and trypsin. Cells were cultured to 90% confluency in T-75 flasks and RNA was isolated by the standard TRIzol (Gibco-BRL) method, according to the manufacturer’s instructions. The expression of the WDR13 gene was studied using primers encompassing the first three exons.

Transfection experiments were carried out in 6/24-well plates using liposome-mediated methods. A total of 10⁴ cells were seeded per well and transfection was done 24 hours later. The transfection mix contained 0.5 μg of the construct (100 to 150 ng for subcellular localization), complexed with 1 μl of LipofectAMINE reagent (GIBCO-BRL) and 4 μl of Plus reagent. The cells were washed twice with 1× PBS and overlaid with 200 μl of serum-free medium. The transfection mix was added and, after 5–6 hours, complete medium was added to the wells. The cells were grown for 60 hours and harvested. Whole cell extracts were prepared using the reporter lysis buffer (Promega Inc., Madison, WI) and assayed for luciferase activity with the substrate luciferin in a luminometer. Extracts from untransfected cells and from cells transfected with the pGL3-Basic vector served as negative controls, while cells transfected with the pGL3-Control vector (SV40 promoter) served as the positive control. Luciferase activity is represented as fold activity over the promoterless basic vector. All cells were cotransfected with the β-gal vector (pCMV β-gal) to normalize for transfection efficiency. The extracts were assayed for β-gal activity by the ONPG (ortho-nitrophenyl β-D-galactoside) assay. Each construct was transfected in triplicate and an average was calculated with standard error.
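The normalization described above (luciferase signal divided by β-gal signal, expressed as fold over the promoterless basic vector, averaged over triplicates with standard error) can be sketched as follows. This is one plausible reading of the procedure, not the authors' actual calculation: the function name `fold_activity` is hypothetical, and the well readings are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def fold_activity(construct_wells, basic_wells):
    """Return (mean fold activity, standard error) for one construct.

    Each argument is a list of (luciferase, beta_gal) readings, one per well.
    Luciferase is first normalized to beta-gal to correct for transfection
    efficiency, then divided by the mean normalized basic-vector signal.
    """
    basal = mean(luc / gal for luc, gal in basic_wells)
    folds = [(luc / gal) / basal for luc, gal in construct_wells]
    sem = stdev(folds) / sqrt(len(folds))  # standard error of the mean
    return mean(folds), sem


# Invented triplicate readings (arbitrary units), for illustration only.
basic = [(200.0, 2.0), (100.0, 1.0), (50.0, 0.5)]
construct = [(9600.0, 2.0), (5000.0, 1.0), (2600.0, 0.5)]

fold, sem = fold_activity(construct, basic)
print(f"fold activity = {fold:.1f} +/- {sem:.2f} (SEM, n = {len(construct)})")
```

Dividing by β-gal before taking the ratio to the basic vector means that wells with poorer transfection do not pull the fold value down.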

Subcellular localization of the WDR13 protein

The transcripts of WDR13 encoding the full-length (53 kDa) and truncated (43 kDa) proteins were cloned in frame in pEGFPC1 and pEGFPC2, respectively. The WDR13 transcript encoding the 53 kDa protein was amplified from a clone, Accession No. AA074569, using primers with EcoRI sites: forward, 5′ GGAATTCCGGAATGGCCGCGGT 3′, and reverse, 5′ CCGGAATTCCCTTCTGCTCCCG 3′. The thermal cycling conditions were an initial denaturation at 94°C for 5 minutes followed by 35 cycles of 94°C for 30 s, 60°C for 30 s, and 72°C for 1 minute, and a final extension at 72°C for 5 minutes. Chinese hamster ovary (CHO) cells were cultured and seeded onto cover slips. Transfection of the CHO cells was carried out as described earlier. The cells were fixed in 3.4% paraformaldehyde/PBS for 10 minutes at room temperature and washed in PBS. The cell nuclei were stained with DAPI. Transfected cells were visualized by confocal microscopy.

Production and analysis of transgenic mice

A 3.1 kb DNA fragment (XhoI-BamHI) containing the human WDR13 1.1 kb UTR, the luciferase reporter gene, and the SV40 late poly(A) signal was excised and gel purified. A total of 1000–1500 copies of the transgene fragment were microinjected into the male pronucleus of fertilized mouse eggs (FVB/N). Microinjected embryos were surgically transferred to pseudopregnant females (CD1). Transgenic mice were identified by PCR of DNA isolated from tail clips. Further confirmation was obtained by Southern blot analysis of tail DNA with a luciferase fragment (1.7 kb HindIII-XbaI) labeled with [α-32P]dATP. A transgenic founder was identified and bred to establish a line; the transgene was found to be integrated on the X chromosome because male transgenic mice transmitted it only to their female progeny. Mice were mated to maintain a heterozygous line and analyzed at six weeks. Different tissues from male transgenic mice were analyzed for luciferase activity. The activity was normalized to the quantity of protein as determined by the Bradford assay.
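The inference that the transgene sits on the X chromosome follows from simple Mendelian bookkeeping, sketched below under the assumption of a single insertion site and a non-transgenic mother. The `cross` function and the chromosome labels ("Xtg", "Atg", and so on) are hypothetical notation introduced for this illustration.

```python
from itertools import product


def cross(father_sex_chroms, father_autosomes,
          mother_sex_chroms=("X", "X"), mother_autosomes=("A", "A")):
    """Enumerate possible offspring as (sex, carries_transgene) pairs.

    Chromosomes are labels; a trailing 'tg' marks the copy carrying the
    transgene. One autosome pair stands in for all autosomes.
    """
    outcomes = set()
    for ps, pa, ms, ma in product(father_sex_chroms, father_autosomes,
                                  mother_sex_chroms, mother_autosomes):
        sex = "male" if ps.startswith("Y") else "female"
        transgenic = any(c.endswith("tg") for c in (ps, pa, ms, ma))
        outcomes.add((sex, transgenic))
    return outcomes


# Transgenic father with the insertion on his X chromosome.
x_linked = cross(("Xtg", "Y"), ("A", "A"))
# Transgenic father with a hemizygous insertion on an autosome.
autosomal = cross(("X", "Y"), ("Atg", "A"))

print("X-linked: ", sorted(x_linked))
print("autosomal:", sorted(autosomal))
```

Only the X-linked case yields exclusively transgenic daughters and non-transgenic sons, which matches the transmission pattern reported for the founder line.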

Acknowledgments

The authors acknowledge Lakshmi Rao, Nandini R, Parthasarathy BVV, Jomini LA, Ramesh Aggarwal, Thangaraj K, and Rachel AJ for their help at various stages of the experiments. Fellowships to AS and SS from CSIR are duly acknowledged. Financial support from the Department of Biotechnology, Government of India, is gratefully acknowledged.

References

[1] G.L. Blatch, M. Lassle, The tetratricopeptide repeat: a structural motif mediating protein-protein interactions, BioEssays 21 (1999) 932–939.

[2] P.P. DiFiore, P.G. Pelicci, A. Sorkin, EH: a novel protein-protein interaction domain potentially involved in intracellular sorting, Trends Biochem. Sci. 22 (1997) 411–413.

[3] P. Bork, M. Sudol, The WW domain: a signaling site in dystrophin? Trends Biochem. Sci. 19 (1994) 531–533.

[4] T.F. Smith, C. Gaitatzes, K. Saxena, E.J. Neer, The WD repeat: a common architecture for diverse functions, Trends Biochem. Sci. 24 (1999) 181–185.

[5] E.J. Neer, C.J. Schmidt, R. Nambudripad, T.F. Smith, The ancient regulatory-protein family of WD-repeat proteins, Nature 371 (22) (1994) 297–300.

[6] H.K.W. Fong, et al., Repetitive segmental structure of the transducin β subunit: homology with the CDC4 gene and identification of related mRNAs, Proc. Natl. Acad. Sci. USA 83 (1986) 2162–2166.

[7] R.J. Duronio, J.I. Gordon, M.S. Boguski, Comparative analysis of the β transducin family with identification of several new members including PWP1, a nonessential gene of S. cerevisiae that is divergently transcribed from NMT1, Proteins: Structure, Function and Genetics 13 (1992) 41–56.

[8] G.B. Downes, N. Gautam, The G protein subunit gene families, Genomics 62 (1999) 544–552.

[9] J. Sondek, A. Bohm, D.G. Lambright, H.E. Hamm, P.B. Sigler, Crystal structure of a G-protein βγ dimer at 2.1 Å resolution, Nature 379 (1996) 369–374 [Published erratum appears in Nature 1996 Feb 29; 379 (6568): 847].

[10] M.A. Wall, D.E. Coleman, E. Lee, J.A. Iniguez-Lluhi, B.A. Posner, A.G. Gilman, S.R. Sprang, The structure of the G protein heterotrimer Giα1β1γ2, Cell 83 (1995) 1047–1058.

[11] I. Garcia-Higuera, J. Fenoglio, Y. Li, C. Lewis, M.P. Panchenko, O. Reiner, T.F. Smith, E.J. Neer, Folding of proteins with WD repeats: comparison of six members of the WD-repeat superfamily to the G protein β subunit, Biochemistry 35 (1996) 13985–13994.

[12] L. van der Voorn, H.L. Ploegh, The WD-40 repeat, FEBS Lett. 307 (1992) 131–134.

[13] C. Kraemer, T. Enklaar, B. Zabel, E.R. Schmidt, Mapping and structure of DMXL1, a human homologue of the Dmx gene from Drosophila melanogaster coding for a WD-repeat protein, Genomics 64 (2000) 97–101.

[14] C. Kraemer, B. Weil, M. Christmann, E.R. Schmidt, The new gene DmX from Drosophila melanogaster encodes a novel WD-repeat protein, Gene 216 (1998) 267–276.


[15] K.R. Rajyashri, L. Singh, A Bkm-associated human Y-chromosomal DNA is conserved and transcribed in the testis of the mouse, Chromosoma 104 (1995) 274–281.

[16] W.J. Lang, V.A. Andrews, Temperature-dependent sex determination in Crocodilians, J. Exp. Zool. 270 (1994) 28–44.

[17] L. Croft, S. Schandroff, F. Clark, K. Burrage, P. Arctander, J.S. Mattick, ISIS, the intron information system, reveals the high frequency of alternative splicing in the human genome, Nat. Genet. 24 (2000) 340–341.

[18] R.E. Breitbart, A. Andreadis, B. Nadal-Ginard, Alternative splicing: a ubiquitous mechanism for the generation of multiple protein isoforms from single genes, Annu. Rev. Biochem. 56 (1987) 467–495.

[19] D.L. Black, Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology, Cell 103 (2000) 367–370.

[20] D. Gautheret, O. Poirot, F. Lopez, S. Audic, J.-M. Claverie, Alternate polyadenylation in human mRNAs: a large-scale analysis by EST clustering, Genome Res. 8 (1998) 524–530.

[21] M. Kozak, An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs, Nucleic Acids Res. 15 (1987) 8125–8148.

[22] F. Castets, et al., Zinedin, SG2NA, and Striatin are calmodulin-binding, WD-repeat proteins principally expressed in the brain, J. Biol. Chem. 275 (2000) 19970–19977.

[23] K. Reynolds, A.M. Zimmer, A. Zimmer, Regulation of RAR beta2 mRNA expression: evidence for an inhibitory peptide encoded in the 5′-untranslated region, J. Cell Biol. 134 (1996) 827–835.

[24] M.W. Wood, H.M.A. VanDongen, A.M.J. VanDongen, The 5′-untranslated region of the N-methyl-D-aspartate receptor NR2A subunit controls efficiency of translation, J. Biol. Chem. 271 (1996) 8115–8120.

[25] E. Ito, et al., Erythroid transcription factor GATA-1 is abundantly transcribed in mouse testis, Nature 362 (1993) 466–468.

[26] W. Spevak, B.D. Keiper, C. Stratowa, M.J. Castanon, Saccharomyces cerevisiae cdc15 mutants arrested at a late stage in anaphase are rescued by Xenopus cDNAs encoding N-ras or a protein with beta transducin repeats, Mol. Cell. Biol. 13 (1993) 4953–4966.

[27] J. Sondek, A. Bohm, D.G. Lambright, H.E. Hamm, P.B. Sigler, Crystal structure of a G protein βγ dimer at 2.1 Å resolution, Nature 379 (1996) 369–374.

[28] K. Saxena, et al., Analysis of the physical properties and molecular modeling of Sec13: a WD-repeat protein involved in vesicular traffic, Biochemistry 35 (1996) 15215–15221.

[29] W. Siffert, et al., Association of a human G-protein beta 3 subunit variant with hypertension, Nat. Genet. 18 (1998) 45–48.

[30] N. Braverman, et al., Human PEX7 encodes the peroxisomal PTS2 receptor and is responsible for rhizomelic chondrodysplasia punctata, Nat. Genet. 15 (1997) 369–376.

[31] P.E. Purdue, J.W. Zhang, M. Skoneczny, P.B. Lazarow, Rhizomelic chondrodysplasia punctata is caused by a deficiency of the human PEX7, a homologue of the yeast PTS2 receptor, Nat. Genet. 15 (1997) 381–384.

[32] K.A. Henning, et al., The Cockayne syndrome group A gene encodes a WD-repeat protein that interacts with CSB protein and a subunit of RNA polymerase II TFIIH, Cell 82 (1995) 555–564.

[33] A.J. Van Gool, G.T.J. Van der Horst, E. Citterio, J.H.J. Hoeijmakers, Cockayne syndrome: defective repair of transcription? EMBO J. 16 (1997) 4155–4162.

[34] V. Lamour, et al., A human homologue of the S. cerevisiae HIR1 and HIR2 transcriptional repressors cloned from the DiGeorge syndrome critical region, Hum. Mol. Genet. 4 (1995) 791–799.

[35] Y.W. Qian, E.Y. Lee, Dual retinoblastoma-binding proteins with properties related to a negative regulator of Ras in yeast, J. Biol. Chem. 270 (1995) 25507–25513.

[36] Y.W. Qian, et al., A retinoblastoma-binding protein related to a negative regulator of Ras in yeast, Nature 364 (1993) 648–652.

[37] F. Cecconi, G. Alvarez-Bolado, B.I. Meyer, K.A. Roth, P. Gruss, Apaf1 (CED-4 homolog) regulates programmed cell death in mammalian development, Cell 94 (1998) 727–737.

[38] S. Adam-Klages, et al., FAN, a novel WD-repeat protein, couples the p55 TNF-receptor to neutral sphingomyelinase, Cell 86 (1996) 937–947.

[39] J. Sambrook, E.F. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989.

[40] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D.J. Lipman, Basic local alignment search tool, J. Mol. Biol. 215 (1990) 403–410.

[41] A. Bairoch, P. Bucher, K. Hofmann, The PROSITE database, its status, Nucleic Acids Res. 25 (1997) 217–221.
