ORIGINAL ARTICLE
Identification of Deleterious SNPs and Their Effects on Structural Level in CHRNA3 Gene
Vivek Chandramohan1 • Navya Nagaraju1 •
Shrikant Rathod2 • Anubhav Kaphle1 •
Uday Muddapur2
Received: 18 February 2015 / Accepted: 12 May 2015
© Springer Science+Business Media New York 2015
Abstract The aim of our study is to identify probable deleterious genetic variations
that can alter the expression and the function of the CHRNA3 gene using in silico
methods. Of the 2305 SNPs identified in the CHRNA3 gene, 115 were found to be
non-synonymous, and 12 and 15 SNPs were found to be in the 5′ and 3′ UTRs, respectively.
Further, out of the 115 nsSNPs investigated, eight were predicted to be deleterious by
both the SIFT and PredictSNP servers. The major mutations predicted to affect the structure
of the protein are phenylalanine to valine (F43V) and lysine to asparagine (K216N), as
shown by the molecular dynamics trajectories. The random transitions of the
protein structures over the simulation period caused by these mutations hint at how the
native state is distorted, which could lead to the loss of structural stability and
functionality of the nicotinic acetylcholine receptor subunit α-3 protein. Based on this
work, we propose that the nsSNPs with SNP IDs rs75495285 and rs76821682 will have
comparatively more deleterious effects than the other predicted mutations in
destabilizing the protein structure.
Keywords CHRNA3 · SIFT · PredictSNP · Molecular modeling · Molecular dynamics
Corresponding author: Vivek Chandramohan
1 Department of Biotechnology, Siddaganga Institute of Technology,
Tumkur 572013, Karnataka, India
2 Department of Biotechnology, KLE Dr. MSS CET, Belgaum, India
Biochem Genet
DOI 10.1007/s10528-015-9676-y
Introduction
Lung cancer is one of the most common forms of malignant tumor in humans and is
the most common cause of cancer-related mortality. Chronic smoking, occupational
exposure, air pollution, and other factors are considered to be the causes of lung
cancer. Smoking causes over 80% of lung cancer cases (Salim et al. 2011).
However, less than 20% of smokers develop lung cancer. The reasons for varied
cancer susceptibility among smokers are still unknown. Cigarette smoke contains
numerous carcinogens, including tar and benzopyrene. These carcinogens activate
signaling pathways that affect cell growth, differentiation, and apoptosis. Out of all
the compounds present, nicotine is the primary component responsible for the
addiction toward tobacco consumption, and this addiction greatly exacerbates the
cumulative health dangers of tobacco (Shen et al. 2012). Nicotine binds to specific
nicotinic acetylcholine receptors (nAChRs), which are encoded by a set of CHRNA
genes. CHRNA expression in the core region of the brain is also found to be closely
correlated with nicotine addiction (Shen et al. 2013). nAChRs, however, are not
confined to cells at synapses but are also found on non-neuronal cells such as
lung epithelial cells, suggesting that they have functions beyond
neurotransmission (Egleton et al. 2008), and their over-expression has been shown
to be involved in increased signal transduction that promotes cell proliferation and
cancer metastasis (Singh et al. 2011). There are more than a dozen different
nAChR subunits encoded by the CHRNA genes, subdivided into α and β subunits
that form pentameric ion channel complexes (Roman and Koval 2009). The
CHRNA3 gene encodes an alpha-type subunit of the channel molecule, as it
contains characteristic adjacent cysteine residues. Primary lung cancer morbidity
and mortality have increased rapidly in the last 3 decades, which is attributed to
complex interactions between environmental and genetic risk factors (McMenamin
et al. 2010). Single nucleotide polymorphisms (SNPs) are the most common and
stable markers of human genetic variation and may be associated with the risk of
a variety of cancers, including that of the lung (Xu et al. 2013). Studies have shown
that single nucleotide variants of the CHRNA genes have a strong association with
lung cancer risk (Hung et al. 2008; Thorgeirsson et al. 2008). Further, some
studies have examined the association between polymorphisms at the CHRNA3 locus
and increased lung cancer susceptibility (Amos et al. 2010). Thus, the
present study aimed to perform a computational analysis of all the SNPs in
the CHRNA3 gene that are recorded in the SNP databases and to identify possible
deleterious mutations that might have functional effects. Further, to assess these
effects, we modeled protein structures for the wild type and the mutants
and examined the dynamics and stability of these structures over a given
simulation time period.
Materials and Methods
Datasets
The SNPs of the CHRNA3 gene and the related protein sequence were retrieved
from dbSNP (http://www.ncbi.nlm.nih.gov/SNP/) for computational study. The
functional coding nsSNPs were then analyzed with sequence homology-based methods
(SIFT and PredictSNP).
Establishing Deleterious nsSNPs by SIFT and PredictSNP
SIFT is a homology-based tool that presumes that important amino acids will be
conserved in the protein family. Hence, changes at well-conserved positions tend to
be predicted as deleterious. The query was submitted in the form of SNP IDs or as
protein sequences. The underlying principle of this program is that SIFT takes a
query sequence and uses multiple alignment information to predict tolerated and
deleterious substitutions for every position of the query sequence (Jia et al. 2014).
The cutoff value in the SIFT program is a tolerance index of 0.05: substitutions
scoring ≥0.05 are predicted to be tolerated and those below it deleterious. The higher the
tolerance index, the less functional impact a particular amino acid substitution is
likely to have. PredictSNP is a consensus classifier that combines six of the top
performing tools for the prediction of the effects of mutation on protein function.
The obtained results are provided together with annotations extracted from the
Protein Mutant Database and the UniProt Database (Bendl et al. 2014). The
PredictSNP output score lies in the continuous interval ⟨−1, +1⟩. Mutations
are considered to be neutral for values in the interval ⟨−1, 0⟩ and
deleterious for values in the interval ⟨0, +1⟩. The absolute distance of the
PredictSNP score from zero expresses the confidence of the consensus classifier
about its prediction. For easy comparison with the confidence scores of individual
integrated tools, we transformed the confidence of the PredictSNP consensus
classifier to the observed accuracy in the same way as described for confidence
scores of the integrated tools.
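The two decision rules described above can be written compactly. The following is a minimal Python sketch, not the code of either server; the function names are hypothetical, and treating a PredictSNP score of exactly 0 as neutral is an assumption, since the text places 0 at the boundary of both intervals.

```python
from typing import Tuple


def classify_sift(tolerance_index: float, cutoff: float = 0.05) -> str:
    """Label a substitution from its SIFT tolerance index (range 0 to 1)."""
    if not 0.0 <= tolerance_index <= 1.0:
        raise ValueError("tolerance index must lie in [0, 1]")
    # Scores at or above the cutoff are tolerated; lower scores are deleterious.
    return "tolerated" if tolerance_index >= cutoff else "deleterious"


def classify_predictsnp(score: float) -> Tuple[str, float]:
    """Return (label, distance from zero) for a PredictSNP score in [-1, +1]."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("PredictSNP score must lie in [-1, +1]")
    # Assumption: a score of exactly 0 is reported as neutral.
    label = "deleterious" if score > 0 else "neutral"
    # The absolute distance from zero expresses the classifier's confidence.
    return label, abs(score)


if __name__ == "__main__":
    print(classify_sift(0.00))
    print(classify_sift(0.50))
    print(classify_predictsnp(0.74))
```

The confidence returned here is the raw distance from zero; the transformation to observed accuracy mentioned in the text is not reproduced.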
Homology Modeling and Mutational Analysis
Homology modeling and structural validation of human CHRNA3 were carried out
on the basis of the template structure 4AQ5_A using the Discovery Studio 3.5 homology
modeling protocol. The overall stereochemical quality of the final models of the
CHRNA3 protein was assessed with the program PROCHECK (Shahzad et al. 2013).
Mutation analysis was performed based on the results obtained from the in
silico tools. Mutants were prepared by replacing the native residues with the
corresponding mutant residues (R239G, A212D, F43V, R37H, I209M, H217Y,
Y215H, and K216N). The resulting structures were optimized using the
CHARMm force field implemented in DS 3.5 with the conjugate gradient energy
minimization algorithm and a convergence energy of 0.001 kcal/mol (Matsumoto
et al. 2013). These structures were then used for molecular simulations.
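At the sequence level, preparing a mutant amounts to a single-residue substitution. The sketch below illustrates that step in Python under stated assumptions: the function name is hypothetical, the five-residue sequence is a toy example and not the CHRNA3 sequence, and the structural replacement itself was done in Discovery Studio, which this code does not reproduce.

```python
import re


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a substitution label such as 'F43V' (1-based position) to a sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {mutation}")
    wild, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1
    # Guard against applying a label to the wrong sequence or numbering scheme.
    if sequence[index] != wild:
        raise ValueError(
            f"expected {wild} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


if __name__ == "__main__":
    toy_sequence = "MKRFA"  # toy example only, not CHRNA3
    print(apply_point_mutation(toy_sequence, "F4V"))
```

Checking the wild-type residue before substituting catches numbering mismatches between the SNP annotation and the modeled chain.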
Molecular Dynamics
Molecular dynamics (MD) simulations were performed for the wild-type and mutant protein
structures obtained from the homology modeling studies in explicit solvent for 6 ns at a
constant temperature of 310 K. All
minimizations and MD simulations were performed using Discovery Studio
Molecular Dynamics Protocol 3.5. In the trajectory analysis,
6000 snapshots were superimposed with respect to all backbone and side chain
atoms to remove overall translation and rotation and then clustered at various
RMSD cutoff values based on atomic coordinates of all backbone and side chain
atoms of the protein (Novikov et al. 2013).
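The clustering step can be illustrated with a minimal Python sketch. It assumes the snapshots have already been superimposed, and it uses a simple leader-style grouping chosen here for illustration; the text does not state which clustering algorithm Discovery Studio applies. The names `rmsd` and `leader_cluster` and the toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """RMSD between two equally sized, already superimposed coordinate sets."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def leader_cluster(snapshots: List[Coords], cutoff: float) -> List[List[int]]:
    """Group snapshot indices; the first member of each cluster is its leader."""
    clusters: List[List[int]] = []
    for i, snapshot in enumerate(snapshots):
        for cluster in clusters:
            # Join the first cluster whose leader lies within the RMSD cutoff.
            if rmsd(snapshots[cluster[0]], snapshot) <= cutoff:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no leader is close enough: start a new cluster
    return clusters


if __name__ == "__main__":
    # Three toy two-atom snapshots (arbitrary units).
    s0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    s1 = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
    s2 = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(leader_cluster([s0, s1, s2], cutoff=0.5))
```

Repeating the call with different `cutoff` values corresponds to clustering at various RMSD cutoffs; a smaller cutoff yields more, tighter clusters.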
Results and Discussion
SNP Dataset from dbSNP
The CHRNA3 gene was investigated and the details of the gene variants were
obtained from the dbSNP database. It contained a total of 115 non-synonymous
coding SNPs (nsSNPs), 26 intron region SNPs, and 12 and 15 SNPs in the 5′ and 3′
untranslated regions (UTRs), respectively (Fig. 1). We selected the non-synonymous
coding SNPs for our investigation to analyze the structural and functional changes
in the CHRNA3 protein. Furthermore, the number of nsSNPs in the
coding region is higher than the number of SNPs in the 5′ and 3′
untranslated regions, suggesting that the sequence variations
might affect the composition and structure of the gene products more than the
regulation of gene expression.
Fig. 1 Distribution of nsSNPs, 5′ UTR, 3′ UTR SNPs, and intron SNPs
Establishment of Deleterious nsSNPs by SIFT and PredictSNP
The conservation level of a particular position in a protein was determined by using
the sequence homology-based tool SIFT. The protein sequences of the 115 nsSNPs
were submitted independently to the SIFT program to check their tolerance index.
The higher the tolerance index, the less functional impact a particular amino
acid substitution is likely to have, and vice versa. Among the 115 nsSNPs, eight were
predicted to be deleterious, with a tolerance index of less than 0.05. The
results are presented in Table 1. The structural levels of alteration were
determined by applying the PredictSNP program. The protein sequence, together with the
mutational positions and amino acid variants associated with the 115 nsSNPs
investigated in this work, was submitted as input to the PredictSNP server, and the
results are shown in Table 2.
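Selecting the mutations reported by both tools amounts to intersecting the two result lists. The following minimal Python sketch uses the entries of Tables 1 and 2, transcribed as printed; the variable and function names are illustrative and not part of either server, and the sketch compares mutation labels only, without re-applying any score cutoff.

```python
from typing import Dict, List

# Table 1 (SIFT): amino acid change -> tolerance index, as printed.
sift_table: Dict[str, float] = {
    "R37H": 0.00, "I209M": 0.01, "K216N": 0.00, "V147A": 0.00, "R239G": 0.00,
    "H217Y": 0.00, "Y215H": 0.50, "A212D": 0.00, "F43V": 0.00,
}

# Table 2 (PredictSNP): mutations in row order.
predictsnp_table: List[str] = [
    "R239G", "A212D", "F43V", "R37H", "I209M", "H217Y", "Y215H", "K216N",
]


def shared_mutations(sift: Dict[str, float], predictsnp: List[str]) -> List[str]:
    """Return the mutations listed by both tools, in Table 2 order."""
    return [mutation for mutation in predictsnp if mutation in sift]


common = shared_mutations(sift_table, predictsnp_table)
only_sift = sorted(set(sift_table) - set(predictsnp_table))

if __name__ == "__main__":
    print(len(common), common)
    print(only_sift)
```

With the tables as printed, eight mutations are shared, and V147A appears in Table 1 only.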
Homology Modeling
The CHRNA3 protein does not have an experimentally solved 3D structure. Hence, we
constructed a 3D model of the protein using homology modeling (Fig. 2) to analyze
any structural change resulting from the mutations in the sequence. A BLAST search
identified a suitable template, PDB ID 4AQ5_A, for modeling of the CHRNA3
protein. The Ramachandran plot for the model shows 98.6% of the residues in either the
core region or the allowed region and the remaining 1.4% of the residues in the
generously allowed region, with no residue in the disallowed region.
Based on the results obtained from the above analysis, given in Tables 1 and 2, eight
experimentally validated nsSNPs [R239G, A212D, F43V, R37H, I209M, H217Y,
Y215H, and K216N] were chosen for the structural analysis (Fig. 3). Based on the
position of the amino acids in the corresponding chains of the modeled structure, the
mutation analysis was performed using the Discovery Studio 3.5 viewer, and energy
minimization was carried out using the MD simulation protocol.
Table 1 nsSNPs predicted to be functionally significant by SIFT tool
SNP ID       Amino acid change   Tolerance index
rs8192475    R37H                0.00
rs55958820   I209M               0.01
rs76821682   K216N               0.00
rs71651684   V147A               0.00
rs61737495   R239G               0.00
rs72650603   H217Y               0.00
rs71581738   Y215H               0.50
rs75253420   A212D               0.00
rs75495285   F43V                0.00
Table 2 nsSNPs predicted to be functionally significant by PredictSNP tool in percentage
Mutation   PredictSNP   MAPP   PhD-SNP   PolyPhen-1   PolyPhen-2   SIFT   SNAP
R239G      87           84     88        74           81           79     89
A212D      87           48     77        74           65           53     72
F43V       87           43     61        59           55           53     56
R37H       72           70     86        74           47           53     56
I209M      76           63     59        59           55           43     56
H217Y      61           71     77        59           40           79     58
Y215H      75           79     45        59           63           76     58
K216N      87           89     82        78           89           79     90
Molecular Dynamics Simulation
A comparative MD analysis of the predicted deleterious mutants R239G, A212D,
F43V, R37H, I209M, H217Y, Y215H, and K216N with the native structure was carried out.
Over the 6 ns simulation trajectory, parameters such as root mean square
deviation (RMSD) and potential energy were used to analyze the level of
structural change. The backbone RMSD was calculated from the trajectories of the
native and mutant models. The RMSD and the potential energy are
plotted in Figs. 4 and 5, respectively. The graph of RMSD versus
trajectory time shows the overall deviations in the protein structures during the
course of the simulation. Our modeled structure was very stable after an initial jump of
8–13 nm at 750 ps. A similar stability trend was observed in the mutant A212D,
which had a very low deviation in the backbone RMSD of just ~0.5 nm over
time. Mutant I209M was also fairly stable during the first half of the run, but the
structure distorted more rapidly after 4 ns, with an RMSD deviation of more than
1.5 nm. The other mutants were very unstable from the beginning of the simulation. The
mutants H217Y, K216N, F43V, Y215H, R239G, and R37H showed the maximum
deviation from their original structures. Among the simulated mutant structures,
F43V and K216N had the maximum random distortion in their structures over the
period of 6 ns. The instability might be explained in terms of the structural
differences in the amino acid backbone in both mutations. It can be inferred from the
MD studies that for these two mutations, F43V and K216N, the protein
undergoes significant structural transitions when compared to the native structure.
Fig. 2 Model structure of CHRNA3 protein with solid ribbon display style (red, blue, and green color: alpha helix, beta sheets, and coil, respectively) (Color figure online)
Fig. 3 Eight experimentally validated nsSNPs [R239G, A212D, F43V, R37H, I209M, H217Y, Y215H, and K216N]; violet color: wild residues, brown color: mutated residues (Color figure online). Panels: A212D, alanine changes to aspartic acid; F43V, phenylalanine changes to valine; H217Y, histidine changes to tyrosine; K216N, lysine changes to asparagine; R239G, arginine changes to glycine; R37H, arginine changes to histidine; I209M, isoleucine changes to methionine; Y215H, tyrosine changes to histidine
Fig. 4 Backbone RMSDs are shown as a function of time for WT and mutant CHRNA3 protein structures over the time period of 6 ns (x-axis: Time (ps), 0–6000; y-axis: RMSD (nm), 0–14; series: A212D, H217Y, I209M, R37H, R239G, WILD, Y215H, K216N, F43V)
Fig. 5 Potential energy is shown as a function of time for WT and mutant CHRNA3 protein structures over the time period of 6 ns (x-axis: Time (ps), 0–6000; y-axis: Potential energy, −4600 to −2600; series: A212D, F43V, H217Y, I209M, R37H, R239G, WILD, K216N, Y215H)
Tallying Our Predicted Mutations with TCGA Cancer Data
After obtaining the complete MD results for our predicted mutant structures, we
used The Cancer Genome Atlas (TCGA) data to check whether any real cancer patient data
are available that match our predictions. We used cBioPortal
(http://www.cbioportal.org/index.do) to navigate through the datasets. We found one of the
mutations, Y215H, in the lung adenocarcinoma cell line RERF-LC-Ad1 with an allele
frequency of 0.51. We also found a second predicted mutation, H217Y, in the
database, but in a different cell line, the brain cancer line GMS-10. Unfortunately, we did
not find the rest of our mutations in any of the lung cancer data.
Our MD studies, however, have validated the data in terms of protein stability.
These two mutations also destabilize the protein structure with random shape
transitions, but the two mutations identified earlier, F43V and K216N, have a greater
effect on structural stability than these two, as shown in Fig. 6.
Fig. 6 Backbone RMSDs are shown as a function of time for WT and selected mutants from TCGA database of CHRNA3 protein structures over the period of 6 ns for comparison (x-axis: Time (ps), 0–6000; y-axis: RMSD (nm), 0–14; series: H217Y, WILD, Y215H, K216N, F43V)
Conclusion
The CHRNA3 gene was investigated by assessing the influence of functional SNPs
by means of computational methods. Of the 2305 SNPs in the CHRNA3 gene, 115
were found to be non-synonymous, and 12 and 15 SNPs were found to be in the 5′
and 3′ UTRs, respectively. Eight nsSNPs were predicted to be deleterious by both the
SIFT and PredictSNP servers. The major mutations predicted in the
native protein of the CHRNA3 gene were phenylalanine to valine (F43V) and
lysine to asparagine (K216N). The structural effects of these mutations were
validated using MD studies, which show large transitions in the protein structure over
the period of 6 ns. Our predictions were tallied with real cancer data from the
TCGA database, and one of the eight mutations, Y215H, was found in a lung cancer cell
line. However, the MD studies showed that the mutations F43V and K216N would
have comparatively greater effects in destabilizing the protein structure than the
mutation Y215H, if they occurred in lung cancers.
Acknowledgments The authors wish to thank the Management, Principal, Director and Head of the
Department of the Siddaganga Institute of Technology, Tumkur, and KLE Dr. MSS CET, Belgaum. The
authors also thank KBITS for providing computational resources to carry out this study.
Conflict of interest No conflict of interest.
References
Amos CI et al (2010) Nicotinic acetylcholine receptor region on chromosome 15q25 and lung cancer risk
among African Americans: a case-control study. J Natl Cancer Inst 102:1199–1205
Bendl J et al (2014) PredictSNP: robust and accurate consensus classifier for prediction of disease-related
mutations. PLoS Comput Biol 10:e1003440
Egleton RD, Brown KC, Dasgupta P (2008) Nicotinic acetylcholine receptors in cancer: multiple roles in
proliferation and inhibition of apoptosis. Trends Pharmacol Sci 29:151–158
Hung RJ et al (2008) A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor
subunit genes on 15q25. Nature 452:633–637
Jia M et al (2014) Computational analysis of functional single nucleotide polymorphisms associated with
the CYP11B2 gene. PLoS One 9:e104311
Matsumoto Y et al (2013) Crystal structure of a complex of human chymase with its benzimidazole
derived inhibitor. J Synchrotron Radiat 20:914–918
McMenamin SB et al (2010) Adoption of policies to treat tobacco dependence in U.S. medical groups.
Am J Prev Med 39:449–456
Novikov GV, Sivozhelezov VS, Shaitan KV (2013) Investigation of the conformational dynamics of the
adenosine A2A receptor by means of molecular dynamics simulation. Biofizika 58:618–634
Roman J, Koval M (2009) Control of lung epithelial growth by a nicotinic acetylcholine receptor: the
other side of the coin. Am J Pathol 175:1799–1801
Salim EI, Jazieh AR, Moore MA (2011) Lung cancer incidence in the arab league countries: risk factors
and control. Asian Pac J Cancer Prev 12:17–34
Shahzad K et al (2013) A structured-based model for the decreased activity of Ala222Val and Glu429Ala
methylenetetrahydrofolate reductase (MTHFR) mutants. Bioinformation 9:929–936
Shen B et al (2012) Correlation between polymorphisms of nicotine acetylcholine acceptor subunit
CHRNA3 and lung cancer susceptibility. Mol Med Rep 6:1389–1392
Shen B et al (2013) CHRNA5 polymorphism and susceptibility to lung cancer in a Chinese population.
Braz J Med Biol Res 46:79–84
Singh S, Pillai S, Chellappan S (2011) Nicotinic acetylcholine receptor signaling in tumor growth and
metastasis. J Oncol 2011:456743
Thorgeirsson TE et al (2008) A variant associated with nicotine dependence, lung cancer and peripheral
arterial disease. Nature 452:638–642
Xu J et al (2013) Genetic variation in a microRNA-502 binding site in SET8 gene confers clinical
outcome of non-small cell lung cancer in a Chinese population. PLoS One 8:e77024