Natural genetic engineering of hepatitis C virus NS5A for immune system counterattack


NATURAL GENETIC ENGINEERING AND NATURAL GENOME EDITING

Natural Genetic Engineering of Hepatitis C Virus NS5A for Immune System Counterattack

Mahmoud M. El Hefnawi,a Wessam H. El Behaidy,b Aliaa A. Youssif,b Atek Z. Ghalwash,b Lamya A. El Housseiny,c and Suher Zada d

a Informatics and Systems Department, Division of Engineering Research Sciences, the National Research Centre, Egypt
b Faculty of Computers and Information, Helwan University, Egypt
c Experimental Allergy Lab, Dermatology Department, Medical University of Vienna, Austria
d Biology Department, The American University in Cairo, Egypt

The Hepatitis C virus nonstructural 5A (NS5A) protein is a hydrophilic phosphoprotein with diverse functions. The domain assignment of NS5A was refined using a systematic in silico bioinformatics approach. Using DOMAC, the protein is divided into three domains, and domain III is subdivided into two subdomains using the ProDom and SSEP servers. The fold structures for domains II and III were predicted using the meta-server 3D-Jury. Scanning motif databases (SMART, BLOCKS, and PROSITE) gave new motifs. Two important motifs, the interleukin 1 and interleukin 8 interaction motifs, relating to NS5A function in inducing the interleukin 8 promoter, were discovered from the BLOCKS scan. Protein–protein interaction motifs were predicted as hot loops and disordered regions, corresponding to binding regions with the double-stranded RNA-dependent protein kinase R, the viral polymerase, and the Src homology 3 signaling protein binding motif. Other hot loops were predicted in the V3 region and in the single-stranded DNA-binding protein motif. The different mechanisms by which the NS5A protein leads to immune system signaling dysfunction point to the natural genetic engineering of this protein.

Key words: natural genetic engineering; Hepatitis C virus; nonstructural 5A protein (NS5A); in silico motif prediction; protein–protein motif interaction prediction; integrative bioinformatics from structure to function; immune system counteraction; virus host interactions; domain separation; protein structural features prediction

Introduction

Hepatitis C Virus (HCV) is recognized as a major threat to global public health. An estimated 3% of the world population (approximately 180 million people) are infected (World Health Organization website); 80% of the infected population are not capable of eliminating the infection, which leads to chronicity, resulting in hepatic fibrosis, cirrhosis, and, increasingly, hepatocellular carcinoma (HCC).1 HCV is a member of the Flaviviridae family and is the sole member of the Hepacivirus genus. Viral isolates are further classified into six genotypes (1–6) and numerous subtypes (a, b, c, etc.), of which the most intensively studied are genotypes 1a and 1b.1 Current combination therapy of pegylated interferon-α (IFN-α) plus ribavirin gives a response rate between 48% (genotypes 1, 4, 5, and 6) and 88% (genotypes 2 and 3). This stimulated a large body of research into new specific HCV

Address for correspondence: Mahmoud M. El-Hefnawi, Institutional address: 33, el tahrir street, Dokki, Giza, Egypt. Voice: +20106113787; fax: +20227352785. [email protected], [email protected].

Natural Genetic Engineering and Natural Genome Editing: Ann. N.Y. Acad. Sci. 1178: 173–185 (2009). doi: 10.1111/j.1749-6632.2009.05003.x © 2009 New York Academy of Sciences.


174 Annals of the New York Academy of Sciences

Figure 1. Known domains and regions of the NS5A protein. The cyan rectangles mark the three known domains, 1–213, 250–339, and 352–445, which are separated by low complexity sequences (LCS). The other rectangles specify regions that have been studied before: the zinc-binding domain (36–198), ISDR (237–273), PKR (237–299), V3 (383–407), IRRDR (359–407), SH3 binding motif (26–32, 347–353), hyperphosphorylation region (228–274, 375–444), and NS5B binding regions (105–162, 273–331).

inhibitors, currently in clinical trials.2 HCV is a positive-sense, single-stranded, enveloped RNA virus with a genome of ∼9.5 kb, which contains a large open reading frame (ORF) encoding a polyprotein of ∼3000 amino acid residues and untranslated regions (UTRs) at the 5′ and 3′ ends of the genome. The 5′ UTR contains an internal ribosome entry site (IRES), allowing cap-independent initiation of translation. Also, RNA structures at the 3′ UTR are important for viral replication.3 The polyprotein is cleaved into 10 polypeptides by cellular and viral proteinases. The 10 HCV proteins are organized in the polyprotein in the following order: NH2-C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B-COOH.4 An additional protein, F, is produced by ribosomal frameshift. Little is known about the function of this protein. The N terminus consists of three structural proteins: core and the glycoproteins E1 and E2. Following E2 is p7, a protein that oligomerizes to form ion channels and has been speculated to be incorporated into virus particles. The C-terminal two-thirds of the polyprotein comprises the six nonstructural proteins: NS2, NS3, NS4A, NS4B, NS5A, and NS5B.3 Two of these are functionally well characterized: NS3, the viral protease, which needs NS4A as a cofactor, and NS5B, the viral RNA-dependent RNA polymerase. An increasing body of literature points to a role for these nonstructural proteins in perturbing cell signaling and mediating immune evasion.3,4

The nonstructural 5A (NS5A) protein of HCV has been the subject of intensive research over the last decade because of its critical importance for HCV.5 NS5A is a 56–58 kDa hydrophilic phosphoprotein. It has been reported to have multifunctional activities; it plays key roles in viral RNA replication, modulation of the cellular environment, HCV antiviral resistance, and regulation of cell growth and apoptosis.5 NS5A has two different phosphorylation states, p56 and p58, named for the corresponding molecular weights. Their function is not entirely clear, but there is evidence that the phosphorylation status of NS5A may play a regulatory role in replication.5 NS5A has been divided into three domains (Fig. 1), domain I (1–213 aa), domain II (250–342 aa), and domain III (356–447 aa), by identifying low complexity sequences (LCS) in the protein.6 Domain I includes the membrane anchor (1–31 aa) and the zinc-binding domain (36–198 aa). The three-dimensional NMR structure of the membrane anchor revealed that it forms a long amphipathic α-helix (5–25 aa) embedded in-plane in the cytosolic leaflet of the endoplasmic reticulum (ER) membrane, with a hydrophobic side buried in the membrane and a polar/charged side accessible from the cytosol.7 The X-ray crystal structure of the zinc-binding domain reveals two identical monomers packed per asymmetric unit. Each monomer comprises two subdomains: the first (36–100 aa) consists of a three-stranded antiparallel β-sheet (B1, B2, and B3) with an α-helix, while the second (101–198 aa) consists of a four-stranded antiparallel β-sheet (B4, B5, B6, and B7) and a small two-stranded antiparallel β-sheet near the C terminus (B8 and B9), surrounded by extensive random coil structures. The structure also revealed an unconventional, highly conserved zinc ion coordination site encaged by four cysteine residues and a disulfide bond that connects the side chains of β-strands B6 and B9.8

Domain II contains a region of NS5A that has been referred to as the interferon sensitivity determining region (ISDR, 237–276 aa), so named for a possible correlation between hypermutation in this sequence and the responsiveness to IFN therapy in patients chronically infected with HCV subtype 1b.9 Later, the V3 region was found to be a more important determinant of IFN therapy success.10–12 A study by Puig-Basagoiti and colleagues suggested that the composition and dynamics of HCV NS5A quasispecies, particularly in the V3 domain, might play a role in the response to combined IFN/ribavirin therapy.13 A recent study of the mean number of mutations in the V3 region (aa), or in the V3 region plus its N-terminal flanking region, which is referred to as the interferon/ribavirin resistance-determining region (IRRDR, 359–407 aa), showed that a high degree of sequence variation in V3 (≥5) or the IRRDR (≥6) and the presence of detectable levels of anti-NS5A antibodies in the pretreatment sera would be useful factors for predicting early viral responders who cleared the virus within 16 weeks.11,12

A wealth of literature describes NS5A protein interactions.5 NS5A binds to the Src homology 3 (SH3) domains of a number of cellular signaling proteins, including Grb2, amphiphysin II, and Src-family tyrosine kinases, at 347–353 aa, and to the nonstructural 5B protein at 105–162 aa and 273–331 aa. A nuclear localization signal (NLS) at 351–359 aa does not direct NS5A to the nucleus by itself but is nonetheless functional in directing nuclear translocation when placed at the amino terminus of a reporter gene.5

NS5A counteracts the antiviral effect of IFN against HCV. IFNs are natural cellular proteins with various actions, including induction of an antiviral state in their target cells, cytokine secretion, recruitment of immune cells, and induction of cell differentiation. At the surface of target cells, IFN-α binds specifically to high-affinity receptors that trigger a cascade of intracellular reactions. These reactions lead to activation of numerous IFN-inducible genes that are the mediators of the various cellular actions of IFN-α. They are responsible for the antiviral effects of IFN-α through two distinct but complementary mechanisms: induction of a non-virus-specific antiviral state in infected cells, which results in a direct inhibition of HCV replication, and induction of immunomodulatory effects that enhance the specific anti-HCV immune responses of the host.10

The first mechanism includes three IFN-induced proteins and enzymatic pathways that have been studied extensively: the 2′-5′ oligoadenylate synthetase (2′-5′ OAS) protein, the Mx proteins, and the double-stranded RNA-dependent protein kinase (PKR). However, the Mx proteins do not appear to play an important role in HCV infection. The 2′-5′ OAS is a cellular enzyme synthesized in response to IFN stimulation. In infected cells, 2′-5′ OAS enzymatic activity is induced by double-stranded RNAs (dsRNA). The 2′-5′ OAS catalyzes polymerization of adenosine triphosphate into 2′-5′ oligoadenylate, which in turn activates a cellular endoribonuclease, RNase L, at subnanomolar concentrations. RNase L degrades cellular and viral single-stranded RNAs, and thus viral replication is inhibited in a non-virus-specific way.14 PKR is a protein that plays a role in protection against viral infections, and its synthesis is induced by IFN-α. Its activation by dsRNA results in its autophosphorylation. PKR activation is responsible for the phosphorylation of the α-subunit of eukaryotic translation initiation factor 2 (eIF-2), which in turn inhibits protein synthesis, thus blocking viral replication in a non-virus-specific way.15

In the second mechanism, IFN-α induces expression of class I major histocompatibility complex antigens. IFN-α also activates effector cells, such as macrophages, natural killer cells, and cytotoxic T lymphocytes. Finally, IFN-α interacts with the cytokine cascade in a complex way: it stimulates the type 1 immune response (Th1), whose cells synthesize mainly IFN-γ and IL-2, and reduces the type 2 immune response (Th2), whose cells synthesize mainly IL-4 and IL-5. IFN-α also has anti-inflammatory properties through the inhibition of peripheral production of IL-1, IL-8, and tumor necrosis factor α and stimulation of IL-10 production.10

The NS5A protein has many mechanisms for counterattacking IFN therapy. An important mechanism is the ability of NS5A to inhibit an IFN-induced protein, PKR, through a direct interaction with the protein kinase catalytic domain. The interaction of NS5A with PKR requires the ISDR and the 26 residues C-terminal to it, which together are called the PKR binding region (237–302 aa).15 Also, the N-terminal region of NS5A (1–148 aa) interacts physically with two separate portions of 2′-5′ OAS and counteracts the antiviral activity of IFN in an ISDR-independent manner. A single point mutation (Phe to Asn) at residue 37 of NS5A (1–148 aa) significantly reduced 2′-5′ OAS binding activity and negated the significant IFN-inhibitory activity of NS5A (1–148 aa). Conversely, the Phe-to-Leu mutation (F37L) augmented the interaction and, in turn, IFN resistance.14 In addition, NS5A expression in human cells induced IL-8 mRNA. The chemokine IL-8 was able to diminish the ability of IFN to inhibit an early stage of viral replication because IL-8 attenuated the inhibition of the formation of viral proteins. The inhibitory action of IL-8 on IFN-α antiviral activity was associated with reduced 2′-5′ OAS activity, a pathway well correlated with the anti-encephalomyocarditis virus action of IFN-α.16

Our goal was to elucidate the NS5A protein interaction motifs and structural features and to integrate the body of knowledge about the protein. This master of regulation17 has been shown to interact with tens of cellular proteins that function in immune system networks, signal transduction, the cell cycle, growth, apoptosis, and calcium channels.5

Thus, we performed an in silico structural and functional analysis of the NS5A protein of Hepatitis C virus using bioinformatics tools and techniques. We refined the domain assignment, predicted the 3-D structure of subdomains, and predicted new motifs that affect the function of the NS5A protein in immune system counterattack. Together, these results give insight into the natural genetic engineering of this protein, which is sometimes called “promiscuous.”6

Methods

Domain Separation

The known domains of NS5A were refined using DOMAC,18 one of the top domain prediction servers in CASP7. This server uses a hybrid approach that combines template-based and ab initio methods for domain prediction. The domains were further subdivided into subdomains using the ProDom19,20 and SSEP21 servers. All of the servers used ranked among the top performers in CASP7 and CAFASP4.

Sequence Homology Search

Sequence analysis was performed using theNS5A protein domain sequences as the queryto the two major heuristic algorithms forperforming database searches: BLAST22 andFASTA.23 Nevertheless, the exhaustive methodof dynamic programming allows the maxi-mum sensitivity for finding homology at the

El Hafnawi et al.: Natural Genetic Engineering of Hepatitis C Virus 177

sequence level than the heuristic methods. So,we used both the web servers ParAlign24 andScanPS25 that implement a modified versionof the Smith–Waterman algorithm optimizedfor parallel processing. Our selection in allthese servers is approximately fixed; Blosum62is selected as the substitution matrix, the gappenalty varies between 9 and 11 upon the avail-able options, and the gap extension is always 1.Also, the low complexity masking is selected inboth ParAlign and BLAST and the SEG filterin FASTA.
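The exhaustive dynamic programming search mentioned above can be illustrated with a minimal Smith–Waterman sketch using affine gap penalties. This is not the ParAlign or ScanPS implementation: for brevity it assumes a simple match/mismatch score in place of the Blosum62 matrix, and the default gap penalties (10 to open, 1 to extend) are only chosen to fall within the range quoted in the text. The function name and parameters are illustrative.

```python
def smith_waterman(a, b, match=5, mismatch=-4, gap_open=10, gap_extend=1):
    """Return the best local alignment score of sequences a and b.

    Affine gaps: a gap of length k costs gap_open + (k - 1) * gap_extend.
    Scoring is a simplified identity score, NOT Blosum62 (assumption).
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap in a; F: ending with a gap in b
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] - gap_extend, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_extend, H[i - 1][j] - gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 is what makes the alignment local
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Because every cell of the matrix is filled, the cost grows with the product of the two sequence lengths, which is why heuristic tools such as BLAST and FASTA are usually run first.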

Multiple Sequence Alignment

The iterative approach available in Mafft (L-INS-i)26 and ProbCons27 were consistently the most accurate sequence alignment programs, with Mafft being the faster of the two.28 Therefore, the Mafft server was used for multiple sequence alignment. For pairwise alignment, the SSEARCH29 and LALIGN30 web servers were used to apply the local alignment strategy.

Secondary Structure Prediction

Conserved secondary structural motifs, such as α-helices and β-sheets, form a structural basis for predicting the tertiary structure and identifying structurally important regions of the protein. Because different secondary structure prediction programs tend to give varied results, it is recommended, to maximize accuracy, to use several of the most robust prediction methods and draw a consensus based on the majority rule. Five servers, Porter,31 PROF,32 PSIPRED,33 SSPRO,34 and SAM-t02,35 were used to obtain the conserved secondary structure motifs in our sequence. Distill solvent accessibility36 was used to find exposed and buried regions in the NS5A protein.
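The majority-rule consensus described above can be sketched as a per-residue vote. The sketch assumes each server's output has already been converted to an equal-length string over the three states H (helix), E (strand), and C (coil); the fallback to coil when no state has a strict majority is an assumption of this example, not something stated in the text, and the function names are illustrative.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length H/E/C strings."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        # Require a strict majority; otherwise fall back to coil (assumption)
        result.append(state if votes > len(column) / 2 else "C")
    return "".join(result)


def segments(states, target):
    """Return 1-based inclusive (start, end) runs of `target` in a state string."""
    runs, start = [], None
    for i, s in enumerate(states, start=1):
        if s == target and start is None:
            start = i
        elif s != target and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(states)))
    return runs
```

Applying `segments` to the consensus string yields residue ranges in the same form as those reported for helices and β-sheets in the Results.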

3D Structure Prediction

Homology modeling attempts to predict an unknown structure based on a known structure (template) whose similarity to the query sequence is greater than 30%. For domains that have no homologue, fold recognition, which uses a library of templates, or ab initio methods are tried. HOMER,37 Swiss-Model,38 and Modeller39 were used for homology modeling. For fold recognition, the meta-servers 3D-Jury40 and Pcons41 were used because they outperform the individual, autonomous servers, as recent LB and CAFASP experiments have demonstrated.42 For the ab initio method, the ITASSER43 server, one of the top servers in the free modeling section of CASP7, was used.

Model Evaluation

The ProQ44 server is a neural network-based predictor. Based on a number of structural features, it predicts the quality of a protein model.

Functional Motifs

Two approaches are used to derive information about motifs. The first approach matches regular expressions against a query sequence and is represented by the PROSITE45 web server. The second approach, which includes the BLOCKS,46,47 PRINTS,48 PRODOM,19,20 and SMART49 web servers, uses a statistical model, such as position-specific scoring matrices (PSSMs), profiles, or hidden Markov models (HMMs), to preserve the sequence information from a multiple sequence alignment and express it with probabilistic models.
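The first, regular-expression approach can be sketched by translating a PROSITE-style pattern into a Python regular expression and scanning a sequence for all hits, including overlapping ones. The translator below handles only residues, `x`, `[...]` classes, `{...}` exclusions, and `(n)` or `(n,m)` repeats; anchors such as `<` and `>` are ignored. The pattern strings used in the usage note are quoted from memory of the PROSITE definitions and should be checked against the current release, and the toy sequences are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern such as 'N-{P}-[ST]-{P}' to a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # excluded residues
        else:
            regex = core                       # single residue or [..] class
        if lo:
            regex += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(regex)
    return "".join(parts)


def scan(sequence, pattern):
    """Return 1-based inclusive (start, end) spans of all hits, overlaps included."""
    # A lookahead lets finditer report matches that overlap each other
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]
```

For example, `scan("MSAKSGRA", "[ST]-x-[RK]")` reports the two three-residue spans of a protein kinase C type pattern in a toy sequence; short patterns like this match often by chance, which is one reason such hits are treated as predictions only.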

The BLOCKS, PRINTS, SMART, and PROSITE databases were searched for motifs across the whole NS5A protein. Finding relationships between these motifs would help us to understand the protein's functionality. Lastly, Scansite50 was used to predict the NS5A protein interaction motifs.
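The statistical approach can be illustrated with a small position-specific scoring matrix. The sketch below builds a log-odds PSSM from a handful of aligned, ungapped motif instances and slides it along a query. It assumes a uniform amino acid background and a fixed pseudocount, which is far simpler than the weighting schemes used by servers such as BLOCKS; the example instances and all names are hypothetical.

```python
import math

AMINO = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(instances, pseudocount=1.0):
    """Log-odds PSSM (uniform background) from equal-length aligned motifs."""
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("motif instances must have equal length")
    background = 1.0 / len(AMINO)
    pssm = []
    for pos in range(length):
        column = [s[pos] for s in instances]
        total = len(column) + pseudocount * len(AMINO)
        pssm.append({
            aa: math.log2((column.count(aa) + pseudocount) / total / background)
            for aa in AMINO
        })
    return pssm


def best_window(sequence, pssm):
    """Return (score, 1-based start) of the highest-scoring window."""
    width = len(pssm)
    scored = [
        (sum(pssm[k][sequence[i + k]] for k in range(width)), i + 1)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)
```

Unlike a regular expression, the matrix gives graded scores, so a residue seen rarely in the alignment lowers the score without excluding the match outright. Sequences containing characters outside the 20 standard residues would raise a `KeyError` in this sketch.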


Figure 2. Refined domain assignment of the NS5A protein. The cyan rectangles show the new boundaries of the NS5A domains after prediction. Domain 1 is divided into two subdomains, 1a (1–101) and 1b (102–203). Domain 2 spans 204–295. Domain 3 is also divided into two subdomains, 3a (296–359) and 3b (360–445).

Results

Domain Separation

The known domains of NS5A were refined using DOMAC,18 one of the top domain prediction servers in CASP7. This server uses a hybrid approach that combines template-based and ab initio methods for domain prediction. Using DOMAC, NS5A was divided into three domains and four subdomains (Fig. 2). The third domain was subdivided into two subdomains as a result of a BLAST search of the ProDom database,19,20 and this separation was confirmed with the SSEP server,21 which showed high sensitivity and specificity for two-domain predictions in CASP6 and CAFASP4. This gives a new separation of the NS5A protein (Fig. 2). The domains are domain I (1–203), domain II (204–295), and domain III (296–445). The first two subdomains are 1a (1–101) and 1b (102–203), which match the experimentally verified separation, and the last two subdomains are 3a (296–359) and 3b (360–445).

Sequence Homology Search

The results obtained from the two heuristic methods, BLAST22 and FASTA,23 and the two exhaustive methods, ParAlign24 and ScanPS,25 have only one structure (1QHB) in common. The 1QHB structure matches our sequence (274–352) with 30.7% identity, an E-value of 0.68, a bit score of 33.3, and a z-score of 139.4. There is also a match between residues 362–393 and the C-terminal part of a Src homology 2 (SH2) domain, with 40% identity and an E-value of 0.37.
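Percent identity figures such as those above are derived from the pairwise alignment itself. A minimal sketch follows, assuming the two aligned strings use `-` for gaps and that identity is taken over aligned (non-gap) columns only; search tools differ in the denominator they use, so values from this sketch need not match a given server's report. The function name is illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the non-gap columns of two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

Identity alone does not establish homology; the E-value indicates how many matches of that score would be expected by chance in a database of that size.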

Secondary Structure

Using a consensus of the output of five robust secondary structure prediction servers (Porter,31 Prof,32 PSIPred,33 SSPro,34 and SAM-t0235), helices were assigned at 208–219, 231–233, 253–259, 297–300, 362, and 364–376, and β-sheets at 241–242, 265–269, 273–277, 357–360, and 442–443.

Using the Distill web suite interface,36 the known and predicted protein interaction motifs of the NS5A protein were found to be mostly solvent exposed. The IL-8 interaction motif was 87.5% exposed, the V3 region was 100% exposed, the SSDP motif was 63.5% exposed, and the SH2 motif was 75% exposed.

3-D Structure Prediction

The 3-D structure of the second half of the NS5A protein has not yet been solved. To predict it, the sequences of each domain and subdomain were searched with BLAST against the PDB database to find suitable templates.

Fold recognition was then performed using the meta-server 3D-Jury40 for the last two domains, 204–295 and 296–445 (Fig. 3). The domain 204–295 has a fold from the structure 1fla_A, which is classified in the electron transport α/β class, with a Jscore of 9.38. The subdomains 3a (296–359) and 3b (360–445) have folds from the structures 2cl3_A (nuclear protein), with a Jscore of 10.83, and 1nvi_D (transferase, α + β class), with a Jscore of 8.14, respectively. All the folds were classified by ProQ44 as extremely good models, whereas all the folds from the Pcons/Pmodeller server41 for subdomain 3b were classified as bad models. Using the ab initio server ITASSER,43 a fairly good model was obtained for the domain 204–295.

Figure 3. A graphical view of the structural folds predicted for domains II, IIIa, and IIIb. The figure shows the folds for the last two domains of NS5A. (A) The fold of domain 2 (204–295) based on structure 1fla_A, which is classified as an α/β class. (B) The fold of subdomain 3a (296–359) based on structure 2cl3_A. (C) The fold of subdomain 3b (360–445) based on structure 1nvi_D, which is classified as an α + β class.

Functional Motifs

Blocks Results

Table 1 shows the Blocks46,47 results for each domain and subdomain. There are five blocks in subdomain 1a (1–101): two blocks in the dishevelled protein family, with a combined E-value of 0.0069, and one block in each of three other families, glycoside hydrolase-family 3-C-terminal, sulphate transporter, and prokineticin, with combined E-values of 0.25, 0.32, and 0.35, respectively. One block exists in subdomain 1b (102–203), in the IL-1 receptor type I precursor signature family, with a combined E-value of 0.076. In domain II (204–295), there are two blocks, one in the IL-8B receptor signature family and the other in the doublecortin family, with combined E-values of 0.14 and 0.2, respectively. In subdomain 3a (296–359), there is one block in the muscarinic M1 receptor signature family, with a combined E-value of 0.51, and in the last subdomain, 3b (360–445), there are two blocks, one in the single-stranded DNA-binding protein family and the other in glycosyl transferase-family 2, with combined E-values of 0.088 and 0.47, respectively.

Prosite Results

Table 2 shows the Prosite server45 results obtained for our sequence. Searching the PROSITE database retrieved different signatures across the NS5A protein. These signatures are a leucine zipper at 16–37, a cell attachment sequence (RGD) at 48–50, camp_phospho_site at 354–357, asn_glycosylation at 69–72, myristyl at 51–56, 60–65, 221–226, 262–267, and 411–416, pkc_phospho_site at 71–73, 213–215, 235–237, 303–305, 325–327, 351–353, and 384–386, ck2_phospho_site at 114–117, 151–154, 266–269, 384–387, 392–395, and 437–440, and tyr_phospho_site at 122–129.


TABLE 1. Blocks Results for Each Domain and Subdomain of NS5A Protein

Domain   Match Position    # Block   Family Name                                E-value
1a       (8–26) (81–97)    2         Dishevelled protein                        0.0069
1a       4–19              1         Glycoside hydrolase-family 3-C-terminal    0.25
1a       34–43             1         Sulphate transporter                       0.32
1a       51–65             1         Prokineticin                               0.35
1b       165–182           1         Interleukin 1 receptor type I precursor    0.076
II       218–233           1         Interleukin 8B receptor signature          0.14
II       219–247           1         Doublecortin                               0.2
3a       332–349           1         Muscarinic M1 receptor signature           0.51
3b       399–444           1         Single-stranded DNA-binding protein        0.088
3b       400–412           1         Glycosyl transferase-family 2              0.47

SMART Results

The SMART server49 runs several tools on the query protein sequence and reports the confident results from them. For our sequence, the tools that returned results to SMART are DisEMBL, SEG, and SCOP. DisEMBL identifies the disordered regions in proteins, which often contain short linear peptide motifs (e.g., SH3 ligands and targeting signals) that are important for protein function. The predicted disordered regions are located at 131–141, 217–234, 281–289, 299–313, 342–355, 374–410, and 420–445. The SEG program identifies low complexity regions at 213–241 and 385–398. Finally, SCOP has two hits with our sequence.

The first hit (d1bn7a_) matches our sequence at 141–169 with an E-value of 3.0 and classifies this match in the α/β class and the α/β hydrolases superfamily. The second hit (d1hq1a_) matches our sequence at 197–222 with an E-value of 1.3 and classifies this match in the all-α class and the signal peptide-binding domain superfamily.

Scansite Results

Scansite50 has correctly predicted some of the experimentally known phosphorylation sites of NS5A and the SH3 binding motif. It also predicted an SH2 motif, as well as other phosphorylation sites.

Discussion

It is known that NS5A is divided into three domains: 1–213, 250–339, and 353–445.6 The top-ranking servers have shifted the boundaries of these domains to 1–203, 204–295, and 296–445 and identified four subdomains. The first two subdomains are 1a (1–101) and 1b (102–203), which match the experimentally verified separation,6 and the last two subdomains are 3a (296–359) and 3b (360–445).

TABLE 2. Prosite Database Results

Category                          Signature           Matching positions
RNA Associated Protein            Leucine Zipper      16–37
Domain                            RGD                 48–50
Posttranslational Modifications   Camp_Phospho_Site   354–357
                                  Asn_Glycosylation   69–72
                                  Tyr_Phospho_Site    122–129
                                  Myristyl            51–56, 60–65, 221–226, 262–267, and 411–416
                                  Pkc_Phospho_Site    71–73, 213–215, 235–237, 303–305, 325–327, 351–353, and 384–386
                                  Ck2_Phospho_Site    114–117, 151–154, 266–269, 384–387, 392–395, and 437–440

TABLE 3. SCOP Database Results

Super Family                    Class          Sequence   Match     E-value
alpha/beta-Hydrolases           Alpha & Beta   d1bn7a_    141–169   3.00
Signal peptide-binding domain   All Alpha      d1hq1a_    197–222   1.3

Correlation of Structural and Functional Prediction

From the consensus secondary structure and SMART49 results, the ISDR (237–272) includes an α-helix at 253–259 and β-sheets at 241–242 and 265–269, while the rest of the PKR domain (237–299) has an α-helix at 296–300 and a β-sheet at 273–277. Also, the NLS (351–359) includes a β-sheet at 357–360. The hot loops identified by the SMART results overlap known protein interaction regions or protein interaction motifs predicted in this study. The hot loops at 281–289, 299–313, 342–355, 374–410, 420–445, and 131–141 overlap the PKR binding region (237–299), the NS5B binding region (273–331), the SH3 binding region (347–353), the V3 region (383–407), the SSDP motif (399–444), and the NS5B binding region (105–162), respectively.
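The correspondence between hot loops and interaction regions stated above is an interval-overlap check, which can be sketched as follows. The coordinates are taken from the text; the region labels (for example, the N-terminal and C-terminal tags on the two NS5B binding regions) and the function names are illustrative choices for this example.

```python
# Disordered regions (hot loops) reported by DisEMBL via SMART
HOT_LOOPS = [(131, 141), (217, 234), (281, 289), (299, 313),
             (342, 355), (374, 410), (420, 445)]

# Known or predicted interaction regions; labels are illustrative
REGIONS = {
    "NS5B binding (N-terminal)": (105, 162),
    "IL-8R-B motif": (218, 233),
    "PKR binding": (237, 299),
    "NS5B binding (C-terminal)": (273, 331),
    "SH3 binding": (347, 353),
    "V3": (383, 407),
    "SSDP motif": (399, 444),
}


def overlap(a, b):
    """Number of residues shared by two 1-based inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def annotate(loops, regions):
    """Map each loop to the regions it overlaps, with shared residue counts."""
    return {
        loop: {name: overlap(loop, span)
               for name, span in regions.items() if overlap(loop, span) > 0}
        for loop in loops
    }
```

Reporting the number of shared residues, rather than a yes/no answer, makes it easy to see that some loops fall entirely inside a region while others only touch its boundary.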

Exploring the interaction motifs of the NS5A protein is an intriguing approach. Finding binding motifs is more informative than finding individual sites. Here we discuss some of the more significant motifs that we predicted.

Immune Counteraction Mechanisms

A scan of BLOCKS46,47 found a match between the IL-8B receptor (IL-8R-B) signature and our sequence at 218–233, with 50% identity and an E-value of 0.14. This correlates with the SMART classification of that region as disordered (hot loops). Also, an experimental study showed that NS5A lacking 222 amino-terminal amino acids (ΔN222) stimulated the IL-8 promoter to higher levels than did the full-length NS5A protein.51 Furthermore, multiple sequence alignment of HCV genotypes shows that this motif is highly conserved (∼92%). This hot loop is surrounded by two α-helical regions at 208–219 and 231–233. We therefore deduce that this is an interaction motif important for IL-8 induction and, therefore, for inhibiting IFN-α.

As an alternative mechanism for NS5A induction of IL-8, there is a match between the IL-1 receptor type I (IL-1R-1) precursor signature and our sequence at 165–182, with 50% identity and an E-value of 0.076. The NH2-terminal (extracellular) portion of the IL-1 receptor is responsible for IL-1 binding. This extracellular portion is organized into three domains, similar to those of members of the immunoglobulin (Ig) gene superfamily, which share a common three-dimensional structure consisting of two β-sheets held together by a disulfide bond.52 Our motif (165–182) matches the second of the Ig-like domains in the IL-1 receptor, and it consists of two β-sheets. There are also seven other β-sheets in domain 1. Furthermore, multiple sequence alignment of HCV genotypes shows that these motifs are highly conserved. We propose a mechanism by which the NS5A protein can induce the IL-8 promoter directly or through interaction with IL-1,53 which regulates IL-8 production (Fig. 4).54,55,56 These two mechanisms are also supported by the signal peptide-binding domain (Table 3), which indicates, along with other experimental evidence (the presence of anti-NS5A antibodies), that NS5A is found outside the nucleus, in the cytoplasm, and extracellularly.


Figure 4. Hypothesized mechanism for IL-8 induction. The interleukin-1 receptor type I (IL-1R-1) precursor signature located at 165–182 intercepts IL-1. As a result of this binding, two consequences follow: (2) the IL-8 promoter is activated, and (3) nuclear factor kappa-B (NF-κB) is also activated. In the other mechanism, indicated by step (4), the IL-8 promoter is engaged by the IL-8B receptor signature located at 218–233.

HCV Replication

The motif at 399–444 matches the single-stranded DNA-binding protein (SSDP) family with 34.78% identity and a significant E-value of 0.088. This is a family of eukaryotic SSDPs with specificity for a pyrimidine-rich element found in promoter regions. SSDP binds specifically to single-stranded pyrimidine-rich sequence. The high affinity of this protein for this conserved pyrimidine-rich region suggests that it might be involved in transcriptional regulation.57 This may suggest a role for NS5A in binding to single-stranded DNA/RNA during viral replication.58
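Percent identity values like those quoted for the motif matches are the share of aligned positions at which the two sequences carry the same residue. The sketch below shows that calculation under stated assumptions: the function names, the choice to skip gapped positions, and the toy sequences are illustrative and are not the scoring used by the BLOCKS server. As a purely arithmetic aside, the span 399–444 covers 46 residues, and 16 identical positions out of 46 would give 34.78%; whether the reported figure arose this way is an assumption, not something stated in the text.

```python
def span_length(start, end):
    """Number of residues in a 1-based, inclusive range such as 399-444."""
    return end - start + 1


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of ungapped aligned positions where both sequences agree.

    Hypothetical helper: positions with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


print(span_length(399, 444))                      # prints 46
print(round(100.0 * 16 / 46, 2))                  # prints 34.78
print(percent_identity("ACDEFGHK", "ACDQFGYK"))   # prints 75.0 (toy sequences)
```

Identity alone says little about significance for short motifs, which is why it is reported together with an E-value.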

It was shown previously that HCV inhibits IFN signaling59 by upregulation of protein phosphatase 2A (PP2A).60,61 It was also shown previously that HCV proteins inhibit signal transduction through the Jak-STAT pathway.62

Figure 5. Different mechanisms by which the NS5A protein perturbs the immune system. The first rectangle specifies one resistance mechanism: the N terminus of NS5A (1–148) physically interacts with 2′-5′ oligoadenylate synthetase (2′-5′OAS) and counteracts the antiviral activity of interferon (IFN). The last rectangle specifies the double-stranded RNA-dependent protein kinase (PKR) binding domain, which inhibits the activation of PKR. PKR is activated by viral double-stranded RNA (dsRNA), which results in its autophosphorylation. Activated PKR, in turn, phosphorylates the α-subunit of eukaryotic translation initiation factor eIF-2, which shuts off viral translation. The two middle rectangles specify the two predicted motifs, IL-1R-1 and IL-8R-B, which are responsible for IL-8 regulation. The chemokine IL-8 can diminish the ability of IFN to inhibit an early stage of viral replication.


Sarasin-Filipowicz et al.63 recently showed that HCV downregulates the Jak-STAT pathway, which leads to attenuation of the IFN-induced response. The NS5A protein binds to an SH3 domain, thereby intercepting tyrosine signaling.5 We also discovered an SH2-homologous region that could possibly be involved in binding to and interception of kinase signaling.

Several studies have pointed to sequence variation in the PKR-binding region as possibly playing a major role in HCV resistance.64 However, these studies conflict, and no convincing evidence has yet been presented.65 In this study, we found two important motifs for IL-8 induction that lead to partial inhibition of the IFN signaling pathway. We also found new motifs for NS5A protein–protein interactions (PPIs).

The NS5A immune system interception mechanisms that were not elaborated on here for lack of space can be summarized as follows: cell anti-apoptotic activity and cell growth stimulation;66 blocking the activation of interferon regulatory factor 3 (IRF-3);67 subversion of cell signaling pathways via interaction with Grb2 and P85 phosphatidylinositol 3-kinase;68 regulation of both mRNA and protein expression of the IL-8 cytokine;69,66 and interaction with different tyrosine and serine kinases.70,5

The different mechanisms by which the NS5A protein intercepts the immune system (Fig. 5), together with the other signaling pathways it intercepts, all point to the clever natural genetic engineering of this protein. Through interactions with many host factors, interception of IFN signaling pathways, and interaction with many oncoproteins,71 NS5A has an important role in destabilizing the cell environment and facilitating cancer.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Sugiyama, K. 2004. Genomic structure and function of untranslated region, structural region and non-structural region of hepatitis C virus RNA. Nippon Rinsho. 62(Suppl 7): 32–37.

2. Zeuzem, S. 2008. Telaprevir, peginterferon alpha-2a, and ribavirin for 28 days in chronic hepatitis C patients. J. Hepatol. 49: 157–159.

3. Appel, N. et al. 2006. From structure to function: New insights into hepatitis C virus RNA replication. J. Biol Chem. 281: 9833–9836.

4. Krekulova, L., V. Rehak & L.W. Riley. 2006. Structure and functions of hepatitis C virus proteins: 15 years after. Folia Microbiol (Praha.) 51: 665–680.

5. Macdonald, A. & M. Harris. 2004. Hepatitis C virus NS5A: Tales of a promiscuous protein. J. Gen. Virol. 85: 2485–2502.

6. Tellinghuisen, T.L. et al. 2004. The NS5A protein of hepatitis C virus is a zinc metalloprotein. JBC 279: 48576–48587.

7. Penin, F. et al. 2004. Structure and function of the membrane anchor domain of hepatitis C virus nonstructural protein 5A. JBC 279: 40835–40843.

8. Tellinghuisen, T.L., J. Marcotrigiano & C.M. Rice. 2005. Structure of the zinc-binding domain of an essential replicase component of hepatitis C virus reveals a novel fold. Nature 435: 374–379.

9. Enomoto, N. et al. 1996. Mutations in the nonstructural protein 5a gene and response to interferon in patients with chronic hepatitis C virus 1b infection. N. Engl. J. Med. 334: 77–81.

10. Pawlotsky, J.M. 2000. Hepatitis C virus resistance to antiviral therapy. Hepatol. 32: 889–896.

11. El-Shamy, A. et al. 2007. Prediction of efficient virological response to pegylated interferon/ribavirin combination therapy by NS5A sequences of hepatitis C virus and anti-NS5A antibodies in pre-treatment sera. Microbiol. Immunol. 51: 471–482.

12. El-Shamy, A. et al. 2008. Sequence variation in hepatitis C virus nonstructural protein 5A predicts clinical outcome of pegylated interferon/ribavirin combination therapy. Hepatol. 48: 38–47.

13. Puig-Basagoiti, F. et al. 2005. Dynamics of hepatitis C virus NS5A quasispecies during interferon and ribavirin therapy in responder and non-responder patients with genotype 1b chronic hepatitis C. J. Gen. Virol. 86: 1067–1075.

14. Taguchi, T., M. Nagano-Fujii, M. Akutsu, et al. 2004. Hepatitis C virus NS5A protein interacts with 2′, 5′-oligoadenylate synthetase and inhibits antiviral activity of IFN in an IFN sensitivity determining region-independent manner. J. Gen. Virol. 85: 959–969.

15. Gale, M.J., M.J. Korth, N.M. Tang, et al. 1997. Evidence that hepatitis C virus resistance to interferon is mediated through repression of the PKR protein kinase by the nonstructural 5A protein. J. Virol. 230: 217–227.

16. Khabar, K.S.A., F. Al-Zoghaibi, M.N. Al-Ahdal, et al. 1996. The alpha chemokine, interleukin 8, inhibits the antiviral action of interferon alpha. J. Exp. Med. 186: 1077–1085.

17. Szabo, G. 2006. Hepatitis C virus NS5A protein: A master regulator. Gastroenterology. 130: 995–999.

18. Cheng, J. 2007. DOMAC: An accurate, hybrid protein domain prediction server. Nucl. Acids Res. 35: 354–356.

19. Servant, F. et al. 2002. ProDom: Automated clustering of homologous domains. Brief. Bioinform. 3: 246–251.

20. Bru, C. et al. 2005. The ProDom database of protein domain families: More emphasis on 3D. Nucl. Acids Res. 33: 212–215.

21. Gewehr, J.E. & R. Zimmer. 2006. SSEP-domain: Protein domain prediction by alignment of secondary structure elements and profiles. Bioinformatics 22: 181–187.

22. Altschul, S.F. et al. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucl. Acids Res. 25: 3389–3402.

23. Pearson, W.R. 1990. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183: 63–98.

24. Sæbø, P.E. et al. 2005. PARALIGN: Rapid and sensitive sequence similarity searches powered by parallel computing technology. Nucl. Acids Res. 33: 535–539.

25. Walsh, T.P. et al. 2008. SCANPS: A web server for iterative protein sequence database searching by dynamic programming, with display in a hierarchical SCOP browser. Nucl. Acids Res. 36: W25–W29.

26. Katoh, K. et al. 2005. MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucl. Acids Res. 33: 511–518.

27. Do, C.B. et al. 2005. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 15: 330–340.

28. Nuin, P.A.S., Z. Wang & E.R.M. Tillier. 2006. The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinform. 7: 471.

29. Ropelewski, A.J., H.B. Nicholas & D.W. Deerfield 2nd. 2004. Mathematically complete nucleotide and protein sequence searching using Ssearch. Curr. Protoc. Bioinform. Chapter 3: Unit 3.10.

30. Sarachu, M. & M. Colet. 2005. EMBOSS: A web interface for EMBOSS. Bioinformatics 21: 540–541.

31. Pollastri, G. & A. McLysaght. 2005. Porter: A new, accurate server for protein secondary structure prediction. Bioinformatics 21: 19–20.

32. Rost, B., G. Yachdav & J. Liu. 2004. The PredictProtein server. Nucl. Acids Res. 32: 321–326.

33. McGuffin, L.J., K. Bryson & D.T. Jones. 2000. The PSIPRED protein structure prediction server. Bioinformatics 16: 404–405.

34. Cheng, J. et al. 2005. SCRATCH: A protein structure and structural feature prediction server. Nucl. Acids Res. 33: 72–76.

35. Karplus, K. et al. 2005. SAM-T04: What is new in protein-structure prediction for CASP6. Proteins 61(Suppl 7): 135–142.

36. Pollastri, G. et al. 2007. Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information. BMC Bioinform. 8: 201.

37. Tosatto, S.C.E. 2005. The VICTOR Package for 3D protein structure modelling. J. Comput. Biol. 12(10): 1316–1327. doi:10.1089/cmb.2005.12.1316

38. Kiefer, F. et al. 2008. The SWISS-MODEL repository and associated resources. Nucl. Acids Res. 22: 195–201.

39. Eswar, N. et al. 2008. Protein structure modeling with MODELLER. Methods Mol. Biol. 426: 145–159.

40. Kajan, L. & L. Rychlewski. 2007. Evaluation of 3D-Jury on CASP7 models. BMC Bioinform. 8: 304.

41. Wallner, B., P. Larsson & A. Elofsson. 2007. Pcons.net: Protein structure prediction meta server. Nucl. Acids Res. 35: 369–374.

42. Rychlewski, L. & D. Fischer. 2005. LiveBench-8: The large-scale, continuous assessment of automated protein structure prediction. Protein Sci. 14: 240–245.

43. Wu, S., J. Skolnick & Y. Zhang. 2007. Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. 5: 17.

44. Wallner, B. & A. Elofsson. 2003. Can correct protein models be identified? Protein Sci. 12: 1073–1086.

45. Hulo, N. et al. 2007. The 20 years of PROSITE. Nucl. Acids Res. 36: D245–D249.

46. Henikoff, J.G. et al. 2000. Increased coverage of protein families with the blocks database servers. Nucl. Acids Res. 28: 228–230.

47. Henikoff, S., J.G. Henikoff & S. Pietrokovski. 1999. Blocks+: A non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15: 471–479.

48. Attwood, T.K. et al. 1997. The PRINTS database of protein fingerprints: A novel information resource for computational molecular biology. J. Chem. Inf. Comput. Sci. 37: 417–424.

49. Ponting, C.P. et al. 1999. SMART: Identification and annotation of domains from signalling and extracellular protein sequences. Nucl. Acids Res. 27: 229–232.

50. Wan, J. et al. 2008. Meta-prediction of phosphorylation sites with weighted voting and restricted grid search parameter selection. Nucl. Acids Res. 36: e22.

51. Polyak, S.J. et al. 2001. Hepatitis C virus nonstructural 5A protein induces interleukin-8, leading to partial inhibition of the interferon-induced antiviral response. J. Virol. 75: 6095–6106.


52. Sims, J.E. et al. 1988. cDNA expression cloning of the IL-1 receptor, a member of the immunoglobulin superfamily. Science 241.

53. O’Neill, L.A. 1995. Towards an understanding of the signal transduction pathways for interleukin 1. Biochim. Biophys. Acta 1266: 31–44.

54. Girard, S. et al. 2002. An altered cellular response to interferon and up-regulation of interleukin-8 induced by the hepatitis C viral protein NS5A uncovered by microarray analysis. J. Virol. 295: 272–283.

55. Khabar, K.S. & S.J. Polyak. 2002. Hepatitis C virus-host interactions: The NS5A protein and the interferon/chemokine systems. J. Interferon Cytokine Res. 22: 1005–1012.

56. Stein, B. & A.S. Baldwin Jr. 1993. Distinct mechanisms for regulation of the interleukin-8 gene involve synergism and cooperatively between C/EBP and NF-kappa B. Mol. Cell Biol. 13: 7191–7198.

57. Bayarsaihan, D., R.J. Soto & L.N. Lukens. 1998. Cloning and characterization of a novel sequence-specific single-stranded-DNA-binding protein. J. Biochem. 331: 447–452.

58. Tellinghuisen, T.L. et al. 2008. Identification of residues required for RNA replication in domains II and III of the hepatitis C virus NS5A protein. J. Virol. 82: 1073–1083.

59. Blindenbacher, A. et al. 2003. Expression of hepatitis C virus proteins inhibits interferon alpha signaling in the liver of transgenic mice. Gastroenterology 124: 1465–1475.

60. Duong, F.H. et al. 2004. Hepatitis C virus inhibits interferon signaling through up-regulation of protein phosphatase 2A. Gastroenterology 126: 263–277.

61. Christen, V. et al. 2007. Activation of endoplasmic reticulum stress response by hepatitis viruses up-regulates protein phosphatase 2A. Hepatol. 46: 558–565.

62. Heim, M.H., D. Moradpour & H.E. Blum. 1999. Expression of hepatitis C virus proteins inhibits signal transduction through the Jak-STAT pathway. J. Virol. 73: 8469–8475.

63. Sarasin-Filipowicz, M. et al. 2008. Interferon signaling and treatment outcome in chronic hepatitis C. Proc. Natl. Acad. Sci. USA 105: 7034–7039.

64. Torres-Puente, M. et al. 2008. Hepatitis C virus and the controversial role of the interferon sensitivity determining region in the response to interferon treatment. J. Med. Virol. 80: 247–253.

65. Germanidis, G. et al. 2008. NS5A sequences of hepatitis C virus genotype 1 and interferon resistance: Where are we? J. Infect Dis. 198: 154–155.

66. Mankouri, J., S. Griffin & M. Harris. 2008. The hepatitis C virus non-structural protein NS5A alters the trafficking profile of the epidermal growth factor receptor. Traffic 9: 1497–1509.

67. Heim, M.H. 2003. A new survival trick of hepatitis C virus: Blocking the activation of interferon regulatory factor-3. Hepatol. 38: 1582–1584.

68. He, Y. et al. 2002. Subversion of cell signaling pathways by hepatitis C virus nonstructural 5A protein via interaction with Grb2 and P85 phosphatidylinositol 3-kinase. J. Virol. 76: 9207–9217.

69. Wagoner, J. et al. 2007. Regulation of CXCL-8 (interleukin-8) induction by double-stranded RNA signaling pathways during hepatitis C virus infection. J. Virol. 81: 309–318.

70. Inubushi, S. et al. 2008. Hepatitis C virus NS5A protein interacts with and negatively regulates the non-receptor protein tyrosine kinase Syk. J. Gen. Virol. 89: 1231–1242.

71. Schmitz, U. & S.L. Tan. 2008. NS5A–from obscurity to new target for HCV therapy. Recent Patents Anti-Infect Drug Disc. 3: 77–92.