Molecular Mechanisms of Disease-Causing Missense Mutations

18
Molecular Mechanisms of Disease-Causing Missense Mutations Shannon Stefl 1 , Hafumi Nishi 2 , Marharyta Petukh 1 , Anna R. Panchenko 2 and Emil Alexov 1 1 - Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, SC 29634, USA 2 - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA Correspondence to Anna R. Panchenko and Emil Alexov: [email protected]; [email protected] http://dx.doi.org/10.1016/j.jmb.2013.07.014 Edited by M. Sternberg Abstract Genetic variations resulting in a change of amino acid sequence can have a dramatic effect on stability, hydrogen bond network, conformational dynamics, activity and many other physiologically important properties of proteins. The substitutions of only one residue in a protein sequence, so-called missense mutations, can be related to many pathological conditions and may influence susceptibility to disease and drug treatment. The plausible effects of missense mutations range from affecting the macromolecular stability to perturbing macromolecular interactions and cellular localization. Here we review the individual cases and genome-wide studies that illustrate the association between missense mutations and diseases. In addition, we emphasize that the molecular mechanisms of effects of mutations should be revealed in order to understand the disease origin. Finally, we report the current state-of-the-art methodologies that predict the effects of mutations on protein stability, the hydrogen bond network, pH dependence, conformational dynamics and protein function. Published by Elsevier Ltd. Introduction The differences in human DNA sequences con- tribute to phenotypic variations and influence an individual's susceptibility to disease and response to the environment and drug treatment [1,2]. At the same time, the DNA differences lead to phenotypic differences between populations, for example, dif- ferences in eye color [3]. The genetic variations may involve several nucleotides or only one, the latter is called single nucleotide polymorphismor SNP. Technically, a polymorphism represents a DNA variation found in more than 1% of the population [4,5]. SNPs are the most common types of genetic variations in humans [57] and occur approximately every 1200 bases in the overall human population [6,8]. Most often, SNPs occur in the non-coding region of the genome [1] whereas SNPs in coding DNA regions may result in changes of amino acid sequences either through amino acid substitutions (non-synonymous SNPs) or through the introduction of nonsense/truncation mutations [5]. At the opposite end of the frequency spectrum are the rare mutations (often referred to as rare variants in contrast to common variants), which are usually defined as mutations with minor allele frequency of less than 0.51% [9]. Historically, effects of muta- tions were discussed in the context of a Common Disease, Common Variant (CDCV)or Common Disease, Rare Variant (CDRV)debate [10]. The CDCV hypothesis argues that common disease mutations with low penetrance (the percent of individuals with disease mutations who exhibit disease phenotype) are the major contributors to genetic susceptibility to diseases. On the other hand, the CDRV hypothesis asserts that rare mutations with relatively high penetrance are the major contributors. The effects of common and rare mutations were extensively studied by the theoret- ical population genetics [11], and it was found that the frequencies of susceptibility alleles for any disease ranged from rare to high and depended on the mutation rates and the strength of purifying selection [11]. With the advancement of DNA IMF YJMBI-64156; No. of pages: 18; 4C: 3, 7 0022-2836/$ - see front matter. Published by Elsevier Ltd. J. Mol. Biol. (2013) xx, xxxxxx Review Please cite this article as: Stefl S, et al, Molecular Mechanisms of Disease-Causing Missense Mutations, J Mol Biol (2013), http:// dx.doi.org/10.1016/j.jmb.2013.07.014

Transcript of Molecular Mechanisms of Disease-Causing Missense Mutations

IMF YJMBI-64156; No. of pages: 18; 4C: 3, 7

Review

Shannon SteflAnna R. Panch

0022-2836/$ - see front m

Please cite this article adx.doi.org/10.1016/j.jmb

Molecular Mechanisms of Disease-CausingMissense Mutations

1†, Hafumi Nishi 2†, Marhar

yta Petukh1,enko2 and Emil Alexov1

1 - Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, SC 29634, USA2 - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda,MD 20894, USA

Correspondence to Anna R. Panchenko and Emil Alexov: [email protected]; [email protected]://dx.doi.org/10.1016/j.jmb.2013.07.014Edited by M. Sternberg

Abstract

Genetic variations resulting in a change of amino acid sequence can have a dramatic effect on stability,hydrogen bond network, conformational dynamics, activity and many other physiologically importantproperties of proteins. The substitutions of only one residue in a protein sequence, so-called missensemutations, can be related to many pathological conditions and may influence susceptibility to disease anddrug treatment. The plausible effects of missense mutations range from affecting the macromolecular stabilityto perturbing macromolecular interactions and cellular localization. Here we review the individual cases andgenome-wide studies that illustrate the association between missense mutations and diseases. In addition,we emphasize that the molecular mechanisms of effects of mutations should be revealed in order tounderstand the disease origin. Finally, we report the current state-of-the-art methodologies that predict theeffects of mutations on protein stability, the hydrogen bond network, pH dependence, conformationaldynamics and protein function.

Published by Elsevier Ltd.

Introduction

The differences in human DNA sequences con-tribute to phenotypic variations and influence anindividual's susceptibility to disease and response tothe environment and drug treatment [1,2]. At thesame time, the DNA differences lead to phenotypicdifferences between populations, for example, dif-ferences in eye color [3]. The genetic variations mayinvolve several nucleotides or only one, the latter iscalled “single nucleotide polymorphism” or SNP.Technically, a polymorphism represents a DNAvariation found in more than 1% of the population[4,5]. SNPs are the most common types of geneticvariations in humans [5–7] and occur approximatelyevery 1200 bases in the overall human population[6,8]. Most often, SNPs occur in the non-codingregion of the genome [1] whereas SNPs in codingDNA regions may result in changes of amino acidsequences either through amino acid substitutions(non-synonymous SNPs) or through the introductionof nonsense/truncation mutations [5].

atter. Published by Elsevier Ltd.

s: Stefl S, et al, Molecular Mechanisms of.2013.07.014

At the opposite end of the frequency spectrum arethe rare mutations (often referred to as rare variantsin contrast to common variants), which are usuallydefined as mutations with minor allele frequency ofless than 0.5–1% [9]. Historically, effects of muta-tions were discussed in the context of a “CommonDisease, Common Variant (CDCV)” or “CommonDisease, Rare Variant (CDRV)” debate [10]. TheCDCV hypothesis argues that common diseasemutations with low penetrance (the percent ofindividuals with disease mutations who exhibitdisease phenotype) are the major contributors togenetic susceptibility to diseases. On the other hand,the CDRV hypothesis asserts that rare mutationswith relatively high penetrance are the majorcontributors. The effects of common and raremutations were extensively studied by the theoret-ical population genetics [11], and it was found thatthe frequencies of susceptibility alleles for anydisease ranged from rare to high and depended onthe mutation rates and the strength of purifyingselection [11]. With the advancement of DNA

J. Mol. Biol. (2013) xx, xxx–xxx

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

2 Review: Molecular Mechanisms of Missense Mutations

sequencing technologies, many examples of raremutations contributing to common diseases havebeen reported and here we list several of them.Cohen et al., for example, showed that multiple raremutations contributed significantly to the plasmalevels of high-density lipoprotein (HDL) [12] andlow-density lipoprotein [13]. In the HDL study, threeHDL-related genes were analyzed and severaldozen rare mutations were identified in total. Almostall mutations were found in one HDL-related gene(ABC transporter A1), and two of these mutations(N1800H and W590S) drastically changed thephysicochemical properties of amino acids andaffected the interaction with a partner protein.The functional effect of rare mutations has been a

topic of many recent studies that tried to understandthe role of rare mutations in complex traits [14,15].Several studies concluded that the excess ofdeleterious rare mutations in the human genomewas due to recent fast population growth and weakpurifying selection. One study, focusing on a coupleof hundreds of drug target genes, reported that raremutations were more enriched for damaging variantsthan common mutations [14]. Deep sequencing ofhuman exomes also confirmed the abundance anddeleterious effects of rare mutations, namely, thatrare mutations accounted for 86% of identified singlenucleotide variants and also accounted for about96% of variants that were predicted as functionallyimportant [15]. The same study mapped raremutations on several known protein structures andfound that they were enriched in the followingcategories: ligand binding, active sites and sitesparticipating in hydrogen bonding [15].In an effort to identify a link between mutations and

chronic disease susceptibility, a lot of data wascollected from the literature for disease-causing rareand common mutations. It was shown that, when thefrequency of variants of adenoma patients andcontrol groups was compared, the frequency ofvariants was found to be higher in patients than incontrols. This suggested that disease susceptibilitycould be due in part to the effects of many rarevariants [16]. In contrast to rare mutations contribut-ing to hereditary diseases, some rare mutations mayoccur in patients who do not necessarily inherit themutation from their parents. These de novo muta-tions are considered as the most extreme cases ofrare mutations [17]. Since de novo mutations havenot experienced strong evolutionary selection, theytend to be more deleterious than inheritable raremutations. Indeed, recent whole-exome sequencingstudies showed that de novo mutations in a singlegene contributed to many rare diseases such asKabuki syndrome and others [18]. It was alsosuggested that the scarcity of diseases caused byde novo mutations was related to the number ofmutational target genes. Diseases caused by denovo mutations in a single target gene occur at very

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

low population frequencies (b1/10,000) whereasdiseases caused by de novo mutations in morethan 100 candidate genes such as intellectualdisability could be relatively common (N1/100) [17].The necessity to understand the effects of

missense mutations becomes clear when oneconsiders genetic diseases. It is known that thesubstitution of only one residue in a proteinsequence can be related to a number of pathologicalconditions such as Alzheimer's, Parkinson's andCreutzfeldt-Jakob diseases [6]. It is also well knownthat accumulation of somatic mutations can lead tocancers and that hereditary diseases are caused byone or more germline mutations [19]. The analysis ofthe impacts of missense mutations advances ourunderstanding of the relationships between proteinstructure and function and allows us to decipher themechanisms of the effects of disease mutations andthereby pathogenesis [20]. With personalized med-icine on the not-so-distant horizon, it is swiftlybecoming essential to understand the process bywhich genetic mutations lead to disease.The plausible effects of missense mutations

[21,22] range from affecting the macromolecularstability to perturbing macromolecular interactionsand cellular localization. Here we will review thecurrent state-of-the-art methods in the area ofpredicting the effects of mutations on proteinstability, the hydrogen bond network and pHdependence, conformational dynamics and activity.

Effects of Mutations on MacromolecularStability

Association between protein stability changesand human diseases

It is well established that protein function is closelyrelated to stability of monomers and complexes [23–25]; therefore, in order to assess computationally thefunctional consequences of mutations, it is essentialto identify the effect of mutations on stability andfolding free energy (energy difference betweenfolded and unfolded states). Essentially, in orderfor the macromolecule to carry out its function, themacromolecule, in most cases, should adopt aparticular three-dimensional (3D) fold and makespecific interactions with its partners. A missensemutation that affects the 3D structure and altersthe stability or binding affinity of a protein complexmay cause significant perturbations or completeabolishment of the function of this particularprotein.Typically, the change in the folding free energy

(ΔΔG) is used to quantify the magnitude of amutation's effect on stability. Methods using physicalpotential energy functions (molecular mechanics

isease-Causing Missense Mutations, J Mol Biol (2013), http://

3Review: Molecular Mechanisms of Missense Mutations

approaches or Monte Carlo simulations) are proba-bly the most insightful methods for predicting thedetails of the effects of mutations on protein stability[26]. They usually are time consuming and arefrequently applied for small-scale investigations [27].On the other side of the spectrum are methodsutilizing machine learning techniques, which are veryfast and can deliver predictions on large datasets(Table 1). A typical scheme of assessing the effect ofa mutation on macromolecular folding free energy isillustrated in Fig. 1. Two different approaches can beused to estimate ΔΔG: (a) in one case, the foldingfree energy is calculated for the wild-type andmutant proteins and then the difference can befound; (b) in the second approach, the change infree energy upon mutation is calculated for unfoldedand folded states separately and then the value ofthe energy of the unfolded state can be subtractedfrom the value of the energy of the folded state.Such detailed schemes to predict ΔΔG are typicallyemployed by methods utilizing 3D structure. How-ever, frequently structural and sequence informationare used together within the same methodologicalframework.

ΔGWT

ΔGMT

ΔGunfolded(WT-MT)

Fig. 1. The change in the folding free energy (ΔΔG) maydifference in the folding free-energy values calculated from the[ΔGWT(folding)] minus the mutant type [ΔGMT(folding)] shownfree energy between the folded states of the wild type and mubetween the unfolded states of the wild type and mutant type

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

Before outlining the existing methods for predictingeffects of missense mutations on macromolecularstability, we first review recent studies showing theconnection between human diseases and proteinstability changes. In general, the effects on proteinstability can be grouped into two distinctive catego-ries: (a) destabilizing and (b) stabilizing effects (seeTable 2 for examples). Most frequently, missensemutations are found to destabilize the correspondingprotein. Such cases include mutations in LMNAgene that are associated with muscular diseases[28], in the VWF A2 domain causing von Willebranddisease type [29], in retinal proteins causing retinaldiseases [30], in the perforin protein resulting inhemophagocytic lymphohistiocytosis [31,32] andmutations in prion proteins associated with priondiseases [33–36]. Many neurodegenerative dis-eases, such as Parkinson's disease, are alsoassociated with destabilization of the correspondingproteins [37–39]. Sometimes, the impact is local,affecting particular secondary structure element. Forexample, mutations of residues E22 and D23 in theamyloid-β protein are associated with familialAlzheimer's disease and are shown to destabilize

(folding)

(folding)

ΔGfolded(WT-MT)

be evaluated using two different methods: (1) using thetransition from an unfolded to a folded state of the wild typewith black arrows or (2) using the difference in the foldingtant type [ΔGfolded(WT-MT)] minus the folding free energy[ΔGunfolded(WT-MT)] shown with red arrows.

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

4 Review: Molecular Mechanisms of Missense Mutations

an important β-turn [40] whereas destabilizingmutations in the core of the protein lead to aninactivation of many tumor suppressors in cancer[41]. Most frequently, if the effect of mutations isrelated to protein stability, it usually significantlydestabilizes the corresponding protein [42].There are examples where missense mutations

enhance stability of the corresponding protein whilestill being deleterious. Recently, a mutation in theCLIC2 protein, H101Q, was associated with amental disorder and was predicted to increase theCLIC2 protein stability, therefore obstructing itstransport to the cell membrane [43]. Later, thesepredictions were confirmed experimentally and itwas demonstrated that, indeed, this mutation makesthe CLIC2 protein thermodynamically more stable,the residence time in the membrane shorter, and themutant interact more strongly with the ryanodinereceptor [44]. Similarly, the stability of a small

Table 1. The list of servers and programs to predict the effect

Name Input Descrip

MuStab Sequence Predicts decrease/incremutation usi

MUPro Sequence Predicts decrease/incremutation usi

SIFT Sequence Estimates deleterious effsequence homology an

MutPred Sequence Estimates deleterious effSIFT and gain/loss of str

properties predictedSNPdbe Sequence Pre-calculated database

mutations calculated bnetwork metho

PolyPhen-2 Sequence/structure Estimates deleterious effsequence homology, sit

structural feI-Mutant2.0 Sequence/structure Estimates ΔΔG upon m

SDM Structure Estimates ΔΔG uponstatistical potential

CUPSAT Structure Estimates ΔΔG upon mforce atom pair and tors

FoldX Structure Estimates ΔΔG uponempirical fo

AUTO-MUTE Structure Estimates ΔΔG uponknowledge-based st

potential and machineERIS Structure Estimates ΔΔG upon mu

force field with atoPoPMuSiC-2.0 Structure Estimates ΔΔG upon

combination of statistneural net

CC/PBSA Structure Poisson–Boltzmann calwith surface accessibili

the energy on aof structures generate

HOPE Structure/sequence Uses experimentallhomology-based structu

with WhatIf to ma

See Refs. [64] and [70] for other methods and comparison between th

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

41-residue hel ical protein, the peripheralsubunit-binding domain, was found to increase upona replacement of a surface charge with a hydrophobicresidue [45]. In silico modeling of the effects ofmutations on stability of spermine synthase (causingSnyder-Robinson syndrome) showed that sites har-boring disease-causing mutations may or may nottolerate other (other than disease-causing) amino acidsubstitutions [46]. This indicates that disease-causingeffects on the stability may be both site dependent andamino acid type dependent.

Algorithms for predicting the effect of mutationson protein stability

The previous section listed several cases wheremissense mutations produce a very prominent effecton macromolecular stability, and many efforts wereinvested to develop approaches and algorithms to

of mutations on protein molecular characteristics

tion URL

ase of stability uponng SVM

http://bioinfo.ggc.org/mustab/

ase of stability uponng SVM

http://mupro.proteomics.ics.uci.edu/

ect of mutation usingd site conservation

http://sift-dna.org

ect of mutation usinguctural or functionalfrom sequences

http://mutpred.mutdb.org/

of effects of knownased on the neurald and SIFT

http://www.rostlab.org/services/snpdbe/

ect of mutation usinge conservation andatures

http://genetics.bwh.harvard.edu/pph2/

utation using SVM http://gpcr2.biocomp.unibo.it/%7Eemidio/I-Mutant/I-Mutant.htm

mutation usingenergy function

http://www-cryst.bioc.cam.ac.uk/~sdm/sdm.php

utation using meanion angle potentials

http://cupsat.tu-bs.de/

mutation usingrce field

http://fold-x.embl-heidelberg.de

mutation usingatistical contactlearning methods

http://proteins.gmu.edu/automute

tation using physicalmic modeling

http://dokhlab.unc.edu/tools/eris/

mutation using aical potential andworks

http://babylone.ulb.ac.be/popmusic

culations combinedty term to calculaten ensembled with Concoord

http://ccpbdsa.bioinformatik.uni-saarland.de/

y determined orres with conjunctionke predictions

http://www.cmbi.ru.nl/hope/

e methods.

isease-Causing Missense Mutations, J Mol Biol (2013), http://

Table 2. Diseases associated with changes in stability as a result of mutations

Change in stability Disease Gene/mutation References

Destabilizing Muscular diseases LMNA; multiple sites [28]von Willebrand disease VWF A2; R1597W, M1528V [29]

Retinal diseases Rhodopsin; multiple sites [30]Hemophagocytic lymphohistiocytosis Perforin; multiple sites [31] and [32]Neurodegenerative/prion diseases Prion; D178N [36]

Autosomal recessive Parkinson's disease PINK1; multiple sites [37] and [39]Familial Alzheimer's Amyloid protein; Glu22, Asp23 [40]

Inactivation of tumor suppressants Multiple genes; multiple sites [41]Snyder-Robinson syndrome Spermine synthase; G56S, V132G,

I150T, Y328C[46] and [50]

Stabilizing Mental disorder CLIC2; H101Q [43]

5Review: Molecular Mechanisms of Missense Mutations

predict the change of the folding free energy uponmissense mutations. While mutations might havelarge effects on protein binding affinity and lead tomany diseases [47,48], herewe only focus on existingapproaches for predicting the effect of mutations onstability of protein monomers. In general, it is hard toclassify the existing methods into several distinctcategories (see, e.g., reviews [21,49]) because manyof them utilize a mixture of different approaches.Table 1 includes several widely used methods andprovides links to corresponding databases andservers and a short methods' description. Below, wesummarize several methods and resources.The first group of methods represents machine

learning approaches that are trained on differenttypes of data relevant to protein stability and, insome cases, take into account experimental condi-tions such as temperature, salt concentration and pHvalues. Including such parameters is important forassessing the free-energy changes upon mutationsat near physiological conditions [27]. Majority ofthese methods (MuStab [51], I-Mutant [52] andothers) incorporate different physicochemical prop-erties of amino acids and structural preferences ofdifferent sites and are trained on the experimentaldifferences of folding free energy caused by muta-tions. Some methods, like I-Mutant2.0, use supportvector machines (SVMs), make predictions basedon either structure or sequence alone [19,52] andpredict actual values of ΔΔG. Similarly, the MuProuses SVMs leveraging both sequence and struc-tural information [53]. Other methods, such asMuStab, predict only the deleterious effects ofmutations and the sign of ΔΔG values. Anotherrecently introduced method, PoPMuSiC-2.0, uses acombination of statistical potential and neuralnetworks to estimate the changes in stability; itexploits statistical potentials that take into accountthe coupling between four protein sequence andstructure descriptors and the amino acid volumevariation upon mutation [54].The second group of methods mostly exploits the

evolutionary conservation data under an assumption

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

that changes at conserved positions in the multiplesequence alignments tend to be deleterious. Al-though such approaches do not predict the effect ofmutations on protein stability directly, they aretypically used in conjunction with the abovemen-tioned methods to achieve consensus predictions.For example, a prediction of large ΔΔG caused by amutation will have a higher accuracy if supported bysequence-based analysis. There are many sequen-ce-based methods. For example, SIFT (SortingIntolerant From Tolerant [55]) method scores thenormalized probabilities for all possible substitutionsfor a site and calculates the conditional probabilitythat an amino acid is tolerated compared to the mostfrequent tolerated amino acid. PolyPhen, on theother hand, predicts damaging amino acidsubst i tu t ions using sequence-based andstructure-based features such as sequence conser-vation, structure and a position-specific independentcount matrix derived from the multiple sequencealignment [4,56]. Additionally, PolyPhen-2 useseight sequence-based predictive features, threestructure-based predictive features and a NaïveBayes classifier to predict the functional significanceof the mutation [57].The third group of methods explicitly relies on

structural information, assuming that the ability of aprotein to function properly depends on the funda-mental physicochemical properties that can bederived only from structures [23]. One researchgroup, for example, investigated the relevance ofcombining coarse-grained structure-based stabilitypredictions with a simple comparative modelingprocedure [58]. CUPSAT (Cologne University Pro-tein Stability Analysis Tool) uses structural envir-onment-specific atom potentials and torsion anglepotentials to predict ΔΔG [59] whereas SDM(Site-Directed Mutator) uses a statistical potentialenergy function [26]. Some methods use empiricallyderived energy functions that are quite accurate inpart because they are trained on the experimentaldata. The most prominent example is the empiricalforce field of FoldX that has been optimized for point

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

6 Review: Molecular Mechanisms of Missense Mutations

mutations [19] and implemented into a computeralgorithm and Web server, FOLDEF, to predictfolding free-energy change [56,60]. Another methodthat utilizes structural information is ERIS applyingthe Medusa force field while including backboneflexibility to make predictions [61]. Recently, aninteresting approach (MutPred) [62] that uses abroad range of different attributes based on proteinsequence, structure and dynamics was introduced.This method models the changes of sequence andstructural features between wild-type and mutantswhere changes are expressed as probabilities ofgain or loss of an attribute [62]. Another approach forpredicting changes of the folding free energy uponmutations is implemented in a method called CC/PBSA (Poisson–Boltzmann Surface Area) [63]. Itutilizes Concoord [64] to generate a structuralensemble of the target protein and calculates thefolding free energy with the PBSA (DelPhi [65])method. Structural information is utilized in HOPE aswell, using either experimentally available structuresor structures built by homology in conjunction withenergy calculations done with WhatIf [66].Several papers reported comparisons of the

performance between different methods [67–70]. Inorder to assess the performance, a systematicanalysis is necessary. Although there is no singlemeasure that accurately gauges the performance,some helpful measures include ROC (receiveroperating characteristic), AUC (area under thecurve), sensitivity, specificity [71]. Different factorsmay influence the prediction accuracy including thetype of substituted amino acid, protein structuralclass and flexibility, structural environment of anaffected site. All these factors should be consideredwhen judging the effectiveness of a given approach[72]. It is outside of the scope of this review to rankthe performance of the existing tools (includingDmutant [73], MultiMutate [74], SCide [75], Scpred[76], SRide [77], nsSNPAnalyzer [78], Panther [79],PhD-SNP [6], SNAP [80] and SNPs&GO [81],among other well-known tools) because differentbenchmarking papers (see reviews [67,72,82,83])report conflicting results. The main conclusion is thatvarious groups of methods complement each otherand are suited for different types of tasks.

Effect of Mutations on Hydrogen BondNetwork and Ionization States

Protein stability is determined by many differentfactors, and the formation of hydrogen bonds isamong the most important ones. Hydrogen atomsare an essential component of the atomic structureof biological macromolecules. Among all hydrogens,those carrying significant positive partial charge, theso-called polar hydrogens, are particularly importantbecause of their ability to form hydrogen bonds. In

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

structures of biological macromolecules, the hydro-gen bond is formed between a polar hydrogen and anegatively charged hydrogen acceptor, typically anoxygen atom. In water phase, the water hydrogensand oxygens form a complex network of interactionsresulting in water clusters. The arrangement ofhydrogen bonds at the macromolecular surface iseven more complicated involving interactions be-tweenmacromolecular and water atoms. The groupsof hydrogen bonds often form a cluster or a web ofinteractions resulting in so-called hydrogen bondnetwork. Hydrogen bonds and hydrogen bondnetworks participate in biological functions of mac-romolecules and provide the pH dependence asso-ciated with many biological reactions. Theycontribute to protein structural integrity, provide“proton wires” for proton translocation (the sourceof proton uptake/release) and, finally, participate inmany catalytic reactions. Below, we briefly outlinethe mechanisms of how missense mutations mayaffect the hydrogen bond networks.

Hydrogen bond networks and macromolecularstructure

As was mentioned above, the hydrogen bonds arekey constituents of biomolecular structures [84],participating in the formation of secondary structureelements [85] and tertiary and quaternary structures[86]. A mutation resulting in the removal or additionof a hydrogen donor or acceptor is expected to havea significant impact on the structural integrity. Even aconservative mutation that results in different donor–acceptor positions may still be quite deleterious forthe structure and may disrupt the entire hydrogenbond network (Table 3).Figure 2 shows one example of mutation N550K in

isoform 3 lactosylceramide α-2,3-sialyltransferase,which is associated with autosomal recessiveneurocutaneous condition. As can be seen fromthis figure, the wild-type residue, N550, is involved inseveral hydrogen bonds, which are deleted in themutant. In another example of the aldosteronesynthase deficiency, it was demonstrated thatmolecular mechanism of this disease involvedmutation R374W in CYP11B2 protein, which chan-ged the properties of the hydrogen bond network ofthe wild-type Arg residue [88]. Similarly, in patientssuffering from mitochondrial fatty acid oxidationdisorder and bearing a R595W mutation in theCPT1 protein, the disease was attributed to adisruption of the wild-type hydrogen bond networkformed by the wild-type R374 residue [89]. Inanother case, the rearrangement of the hydrogenbond network due to a missense mutation (I150T) inspermine synthase was predicted to be the maincause of the Snyder-Robinson syndrome [107].Further mutational analyses of the same proteinindicated that almost any mutation affecting this

isease-Causing Missense Mutations, J Mol Biol (2013), http://

Table 3. Diseases resulting from changes in the hydrogen bond network as a consequence of mutations

Effect on Disease Gene/mutation Reference

Structure Prolidase deficiency PEPD; 304insA [87]Aldosterone synthase deficiency CYP11B; R374W [88]

Mitochondrial fatty acid oxidation disorder CPT1; R595W [89]Snyder-Robinson syndrome Spermine synthase; I150T [46]

Cardiovascular disease predisposition Human lipoxygenase; T560M [90]Congenital hereditary cataract disease βB1-crystallin; S129R [91]

Aggregation/folding Alzheimer's disease Amyloid-β peptide aggregation; multiple sites [92]Familial Alzheimer's disease Fragment of amyloid-β protein; multiple sites [93]

Amyloidosis (with severe cardiomyopathy) Aggregation/cytotoxicity of transthyretin; S112I [94]Flexibility Aneuploidy and solid tumors CEP63; L61P [95]

Gaucher disease Acid-β-glucosidase; N370S [96]Lethal catecholaminergic polymorphic

ventricular tachycardiaCalsequestrin; R33Q [97]

Promotion of aneuploidy and tumorigenesis Centromere-associated protein-E; Y63H [98]Cellular localization Brain tumors ING1B; p33 [99]

May-Hegglin anomaly, Sebastiansyndrome and Fechtner syndrome

MYH9; multiple sites [100]

pH Human amyloidosis TTR; several residues [101]Mental retardation CLIC2; H101Q [44]Alzheimer's disease Apolipoprotein E; apoE4 isoform [102]

Activity Pyruvate dehydrogenasecomplex deficiency

Pyruvate dehydrogenase E1α subunitprotein; multiple sites

[103]

Classical homocystinuria Cystathionine β-synthase; R266K [104]Amyotrophic lateral sclerosis disease Angiogenin protein; K17I [105]Neonatal epileptic encephalopathy Pyridoxine 5′-phosphate oxidase; R229W [106]

Snyder-Robinson syndrome Spermine synthase; I150T [107]

7Review: Molecular Mechanisms of Missense Mutations

residue, involved in a wild-type hydrogen bondnetwork, will have drastic effects on the wild-typeproperties of the hydrogen bond network andmacromolecular stability [46].The effect of mutations on hydrogen bond network

is especially pronounced if the network involvesactive-site residues. For example, a recent studyshowed that the low catalytic efficiency of a mutant,T560M, found in human lipoxygenase is caused bythe alterations of a hydrogen bond network inter-

Fig. 2. The region at the mutation site E332 of thα-2,3-sialyltransferase and mutant model that was built usingleft panel and (b) the mutant E332K in the right panel, both ceE332 and its neighbors in the wild type are shown with black

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

connecting this residue with active sites; this in turnmay lead to a predisposal to cardiovascular dis-eases [90]. The wild-type hydrogen bond networkand stability of tertiary structure was also reported tobe altered by missense mutations causing congen-ital hereditary cataract disease [91]. These limitedexamples confirm the importance of the hydrogenbond network for structural integrity of macromole-cules and point to the necessity of predicting sucheffects in the context of missense mutations.

e wild-type structure of isoform 3 lactosylceramide2wmlA as a template: (a) the wild type with E332 in thentered at the mutation site. The hydrogen bonds betweenbroken lines.

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

8 Review: Molecular Mechanisms of Missense Mutations

Hydrogen bond network in protein aggregationand flexibility

The aggregation or formation of amyloid fibrils isone of the primary sources of many diseases [108–111]. Missense mutations disrupting the nativehydrogen bonds can destabilize the local structureand thus expose the hydrophobic core of the proteinto the water phase. Such an event frequently triggersaggregation and fibril formation. Several examplesare listed in Table 3.Biological reactions are frequently accompanied

by small or large protein conformational changes[112]. Consequently, the ability of macromoleculesto retain their conformational flexibility is crucial totheir function. Altering the hydrogen bond network bymissense mutation can affect the conformationalflexibility needed for allosteric regulation and con-formational gating. Several examples of an alteredprotein flexibility causing diseases are listed inTable 3. At the same time, altering the wild-typeflexibility through mutations may not necessarilyresult in the pathogenic effect [113]. A detailedanalysis is usually required in order to attribute thechange of flexibility to a particular molecular function.

Hydrogen bond networks and pH dependence

Practically, all biological processes are pH depen-dent and pH is an important regulator of cellularfunction. Different compartments of the cell havedifferent characteristic pH (see reviews [24,114–116]). Macromolecules that shuttle between differentcellular compartments, for example, prolactin recep-tor [117,118], have their pH-dependent characteris-tics precisely tuned to the local pH. Thus, in order tofunction properly and interact with its biologicalpartners, the protein needs to maintain specificpH-dependent characteristics [22] that can be verysensitive to single point mutations and hydrogenbond networks [119]. In addition, single pointmutations may affect a protein's cellular location[120]. More examples are provided in Table 3.

Hydrogen bond networks and catalytic reaction

Hydrogen bonds participate in catalytic reactionsin various ways: they coordinate the substrate andcan be formed or disrupted during the reactionprocess (donating or accepting a hydrogen viageneral acid/base residues) (Table 3). Typically,the catalytic reaction is mostly affected by asubstitution of an amino acid directly involved inthe catalysis. However, it should be mentioned thatactive-site residues are rarely found mutated andmutations often occur in the sites connected to theactive-site region via hydrogen bond networks. Themost drastic change of the hydrogen bond network iscaused by protonation/deprotonation of titratable

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

groups. Thus, the pKa shifts of catalytic residuesinduced by a mutation will disrupt the general acid/basic reaction and will dramatically affect thecatalysis.

Approaches to model effects of mutations onhydrogen bond networks

Assessing the effect of mutations on hydrogenbond networks is not a straightforward task since thepositions of protons (hydrogens) are typically notresolved experimentally but, rather, should begenerated in silico. Even more, the protonationstates of titratable groups, which are responsiblefor pH dependence of biological processes includinggeneral acid/basic catalysis, are typically unknownand have to be predicted as well.In the simplest case scenario when mutation does

not involve the protonation change, the analysis ofthe hydrogen bond network begins with placing themissing hydrogens onto the wild-type and mutant 3Dstructures. By doing so, one typically assumesdefault protonation states of titratable groups at pHequal to 7. However, many biological reactions occurat pH different from neutral and careful analysiswould require obtaining the biochemical data on theoptimal/characteristic pH [24,115,121]. At the nextstep, protons' positions are generated with thestandard molecular dynamics (MD) packages suchas NAMD [122], CHARMM [123], Amber [124],GROMOS [125], GROMACS [126] or otherstand-alone programs such as REDUCE [127] andPDB2PQR [128]. Then, one would compare hydro-gen bonds in the minimized structures of wild-typeand mutant proteins [107]. A more sophisticatedanalysis would involve the comparison of thehydrogen bonds in the snapshot structures obtainedin MD simulations. All MD packages have tools forthe analysis of hydrogen bond networks. A recentlydeveloped stand-alone program (HBonanza [129],hydrogen bond analyzer‡) allows the analysis andvisualization of hydrogen bond networks. HBonanzacan be used to analyze single structures or manystructures of an MD trajectory.Cases where mutation induces changes in the

protonation state of titratable groups are much morecomplex. It should be mentioned that such ionizationchanges may occur even if mutation does notinvolve titratable groups [46]. Predictions of proton-ation states can be done by calculating the pKavalues of titratable groups and then by assigning theappropriate charge states depending on the charac-teristic pH for a given protein. There are manyapproaches for computing pKa values that arereviewed in Ref. [130]. Some of them arestand-alone programs such as MCCE [131] andProPKA [132], while others are implemented intoWeb servers such as H++ [133].

isease-Causing Missense Mutations, J Mol Biol (2013), http://

9Review: Molecular Mechanisms of Missense Mutations

Effect of Mutations on ConformationalDynamics

Biological macromolecules may adopt differentconformations along the pathway of the correspond-ing biochemical reaction [134,135] and their intrinsicflexibility, the ability to sample alternative conforma-tions, is crucial for protein function [136,137].Furthermore, a significant fraction of macromole-cules is either disordered or has disordered seg-ments at a particular stage of the biologicalreaction [138,139]. Missense mutations can affectthe flexibility of the entire molecule or just a smallregion, can shift the equilibrium between differentconformations or can affect the entire conformationaldynamics of the molecule. For example, in a mostrecent study of the NFAT5 transcription factor,different mutations from the same DNA-bindingloop were analyzed [140]. It was shown that eventhough thesemutations are located very close to eachother in sequence and space, their effect on proteindynamics and DNA binding is drastically different.Typically, the changes in conformational dynamicsare assessed computationally via monitoring theRMSD of the wild-type and mutant structures andrecorded via the snapshots obtained in MD.Below, we outline different aspects of effects of

mutations on protein dynamics and provide exam-ples of recent finding in this regard. The mostcommonly used approach to study protein dynamicsis MD simulations, although Monte Carlo methodsand normal mode analysis can be utilized as well.Most widely used packages are NAMD [122],CHARMM [123], Amber [124], GROMOS [125],GROMACS [126], TINKER [141] and others.

Table 4. Effects of disease mutations on conformational dyna

Effect Disease associated

Conformationalflexibility

Diabetes and cancerMultiple diseases

Conformational diseasesVariegate porphyria disorder

Colorectal cancer

Classic galactosemiaBecker muscular dystrophy

Disordered regions Parkinson's diseaseOvarian cancer

CancerHuntington's disease

Order–disordertransitions

Stargardt disease (type 1)Adrenoleukodystrophy X-linkedAndrogen insensitivity syndrome

Citrullinemia (type 1)Aggregation α(1)-Antitrypsin deficiency and

cystic fibrosisCreutzfeldt-Jakob disease and

Gerstmann-Straussler-Scheinker DiseaseLocal motion Cancers

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

Effect on protein flexibility

Molecular flexibility is reflected in the ability of amacromolecule to sample alternative conformations,for example, to open/close the gate of a channel, tomediate the recognition of the receptor or to facilitatethe allosteric reactions. Typical examples are cen-trosomes, central regulators of mitosis, which areoften amplified in cancer cells. Specifically, thecentrosomal protein CEP63 is associated with ananeuploidy and solid tumors in humans. Whengenetic alterations such as mutation L61P occur,they increase flexibility of the protein, as seen in MDsimulations. This was suggested to be the cause ofthe disease [95]. Therefore, disease can be associ-ated with a change in the conformational anddynamical properties of the corresponding protein(Table 4).Another case is phosphatase and tensin homo-

log (PTEN) protein that plays essential roles incellular processes including survival, proliferation,energy metabolism and cellular architecture. Mu-tations in PTEN are implicated in diabetes andcancer. A particular mutation (H61D) was studiedusing MD simulations, and it was shown that itincreases the flexibility, radius of gyration andsolvent accessibility of PTEN [142]. The list ofexamples would be incomplete without mentioningthe “conformational diseases”, where native proteinconformers convert to pathological intermediatesthat can polymerize. Recently, a forme frustedeficiency variant of α(1)-antitrypsin (K154N) wasinvestigated by the nuclear magnetic resonancespectroscopy and it was found that this mutationalters the wild-type interaction of the side chain of

mics

Gene/mutation Reference

PTEN; H61D [142]cNTnC; L48Q [143]

α(1)-Antitrypsin variant; K154N [144]hPPO; R59Q, R59G [145]

Mitotic centromere-associatedkinesin protein; E403K

[146]

GALT; multiple sites [147]Dystrophin protein; L427P [148]α-Synuclein; multiple sites [149]HPV proteins; multiple sites [149]p53 protein; multiple sites [149]Huntingtin; multiple sites [149]

Multiple mutations [150]Multiple mutations [150]Multiple mutations [150]Multiple mutations [150]

Human serine protease inhibitor (serpin)α(1)-antitrypsin; E342K

[151] and [152]

Prion protein; multiple sites [153]

KLK3; multiple sites [154–156]

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

10 Review: Molecular Mechanisms of Missense Mutations

K154 with the backbone carbonyl oxygen of K174,affecting protein flexibility and resulting in poly-merization [144].Disease mutations might not only lead to in-

creased flexibility; on the contrary, they can restrictthe conformational transitions. One example in-cludes mutations in the human protoporphyrinogenoxidase (hPPO) gene that are responsible for thedominantly inherited disorder variegate porphyria.Two missense mutations (R59Q and R59G) wereinvestigated in a recent work [145] where MDmodeling revealed that these mutations affect thecatalytic activity of hPPO by changing its ability tosample different conformations. Analysis of mutationH101Q in the CLIC2 protein demonstrated that thismutation restricted the mobility of the N-terminaldomain and prevented the conformational changepresumed to be required for entering of CLIC2 intothe membrane [43,44]. In this regard, channels andpores are very interesting objects to study since theirselectivity is regulated by structural fragments. Forinstance, aquaporins play physiological roles inseveral organs and tissues, and their alteration isassociated with disorders of water regulation. Amutant, D184E, was shown to affect the mobility ofthe aquaporin D-loop, which acquires a higherpropensity to equilibrate in a “closed conformation”,thus affecting the rate of water flux [157].

Protein intrinsic disorder and disease mutations

Disordered proteins, existing in dynamic equilibri-um between various conformers, may provide a wayto tolerate many mutations due to the loosely packedcores. However, the relationship between disorder,stability and function is not well understood. Accord-ing to some studies, about 10% of inherited diseasemutations from HGMD database affect disorderedregions [62], and cancer-associated proteins areespecially enriched in intrinsically disordered re-gions [158]. In particular, disease-related genesencoding disordered proteins or proteins with theextended disordered regions include α-synuclein(Parkinson's disease), BRCA1 (breast and/or ovar-ian cancer), p53 (cancer) and huntingtin (Hunting-ton's disease) [149,159].There are many ways in which mutations might

impact disordered regions and lead to dysfunctionalproteins. If a residue that promotes disorder ismutated into a residue that favors structural regions,such a mutation can have a drastic effect ondisorder–order transition, post-translational modifi-cations, binding and other functions relevant tointrinsic disorder. Examples of such substitutionsa r e abundan t among subs t i t u t i o n s o fdisorder-promoting arginine into other residuestypes. Overall, about a quarter of disease mutationsin disordered regions were predicted to disruptdisorder-promoting properties. Therefore, the

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

prediction and prioritization of disease mutationsshould account for the disorder-promoting tendencyof the region where mutations occur [160,161].Several such examples are provided in Table 4.

Protein aggregation, misfolding and dynamics

Protein aggregation is driven by the exposure ofthe hydrophobic core of the macromolecule to thewater phase due to the changes in native structureand dynamics. It was shown that mutations inperipheral myelin protein 22 resulted in the commonperipheral neuropathy Charcot-Marie-Tooth dis-ease. One of them, L16P, caused misfolding thatled to a loss of function and toxic accumulation ofaggregates [162]. Another recent work studied theeffects of several mutations (V180I, F198S, V203Iand V210I) known to cause prion disease andshowed that these mutations induced the misfoldingand aggregation of the prion protein [163]. Anothercase involved mutations or deletions in the FMRPprotein, which participates in the regulation of mRNAmetabolism in brain, leading to the fragile Xsyndrome. A severe manifestation of the diseasehas been associated with the I304N mutation,located on the KH2 domain of the protein. Thismutation was found to destabilize the hydrophobiccore causing a partial unfolding and a displacementof α-helices [164].Similarly, mutationE342K in human serine protease

inhibitor (serpin) α(1)-antitrypsin causes polymeriza-tion in the endoplasmic reticulum of hepatocytesand is associated with a lack of secretion into thecirculation. It was shown that this mutation increaseslocal flexibility, favors polymerization and promotesaggregation [151]. Another case involves mutations inthe human prion protein resulting in Creutzfeldt-Jakobor Gerstmann-Straussler-Scheinker diseases. TheMD modeling suggested that these mutations pro-mote amyloid formation [153]. Other examples ofdisease-causing mutations in relation to aggregationand misfolding are shown in Table 4.

Effect of Mutations on Protein Activity

Challenges of modeling of mutation effects onprotein function

Protein structure–function relationships are com-plex and crucial for understanding and predicting theeffects of benign or disease-causing mutations onthe fitness. Proteins largely evolve through theacquisition of new mutations, the majority of themare destabilizing but neutral. At the same time, somemutations can be deleterious or damaging or mayresult in advantageous novel functions. Interestingly,mutations that modify and produce novel binding

isease-Causing Missense Mutations, J Mol Biol (2013), http://

11Review: Molecular Mechanisms of Missense Mutations

specificities were shown to have larger destabilizingeffects compared to mutations occurring on proteinsurfaces [165]. On the other hand, since proteinfunctional regions can be energetically unfavorable,mutations of functionally important residues, espe-cially of polar and charged residues, may often resultin more stable structures [166–168]. Protein stabilityis necessary but not sufficient for protein functioning,and proteins are not necessarily optimized tomaximize their stability [169,170]. Therefore, inorder to assess the damaging effect of mutations, itwould be crucial to understand how they will impactfunctionally important sites. Functional site predictionmethods can be subdivided into several categories:those that use evolutionary conservation of bindingsite motifs, those that use information about astructure of a complex and docking methods. Someof them include PHUNCTIONER [171], Firestar[172], IBIS [173], ConCavity [174] and others.The effects of missense mutations on proteins and

their function can be understood and modeled withinthe framework of the energy landscape theory thatdescribes the potential energy of a protein as afunction of conformational coordinates [95]. Differentconformational states might be characterized bydifferent functional specificities and structural differ-ences. The equilibrium between conformationalstates can be shifted by binding of different ligands,by post-translational modifications, by changing theenvironmental conditions and, finally, by mutations.Recently, human spermine synthase activity wasenhanced by engineered novel mutations [175] andit was demonstrated that the activity depended onelectrostatic field distribution, the intrinsic flexibility ofprotein domains and the overall protein stability. Thisindicates the complexity of biochemical functions

Table 5. Examples of disease-related mutations that affect pr

Gene/protein Disease Mutation

ATP-binding cassettetransporter (ABCA1)

Tangier disease(reduction of HDL

cholesterol in plasma)

N1800H

W590S

Niemann-Pick C1-likeprotein 1 (NPC1L1)

Hypocholesterolemia(reduction of low-densitylipoprotein cholesterol

in the plasma)

T61M, S881L

G402S, R1268H

Thiazide-sensitivesodium chloridecotransporter(SLC12A3)

Gitelman syndrome(salt wasting and low

blood pressure)

G439S, G741R

Chloride intracellularchannel protein2 (CLIC2)

X-linked intellectualdisability

H101Q

Histone-lysineN-methyltransferase(MLL2)

Kabuki syndrome W5065X, R5179H

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

involving various factors that, in turn, may beaffecting each other. Table 5 lists exemplarymutations and their effect on protein activity.

Case study: receptor tyrosine kinases

Here we analyze the mechanisms of cancermutations on protein stability, dynamics and activityusing an example of the well-studied humanreceptor tyrosine kinase (RTK) family that is fre-quently mutated in cancer. The connection betweencancer and kinase activation was found fairlyrecently [180], and an increased RTK activity intumor tissues was attributed to gene amplifications,enhanced transcription, translation and mutations.Several cancer mutation hotspots were identified inseveral RTKs, most of them were located in theactivation loop, P-loop and DFG loops. Moreover,driver mutations were found to be more oftenassociated with the functionally important regionsin kinases than passenger mutations [181]. Theactivation of certain RTKs is tightly linked with theirdimerization, and high RTK activity is sometimesachieved by promoting dimerization. According tolong-timescale MD simulations, it was suggestedthat cancer mutations may suppress the intrinsicdisorder on the dimer interface and stimulate theEGFR dimerization [182].Structural studies of kinases revealed different

structural perturbations in response to cancermutations. In particular, the mechanisms of kinaseactivation in cancer are probably linked to transitionsbetween the active and inactive states [183,184].Indeed, the thorough analysis of the effect of cancermutations on kinase structure and activity showedthat some mutations disturbed autoinhibitory

otein activity and function

Effect Reference

Changes the localization of protein, resultingin intracellular accumulation of ABCA1 and loss

of interaction with apoA-I

[176]

Impairs dissociation of apoA-I from ABCA1,which may affect ATP-dependent lipid translocation

[177]

Severely dysfunctional mutations; affect cholesteroluptake function, cholesterol-regulated recycling,

glycosylation and stability of the protein

[178]

Partially dysfunctional mutations; affectcholesterol-regulated recycling

[178]

Abolish sodium uptake function due to loss oflocalization at plasma membrane, which may

be caused by incomplete glycosylation

[179]

Stabilizes the protein, which in turn mayincrease ryanodine receptor activity

[44]

May affect the epigenetic control of activechromatin states via methylation of histones

[18]

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

12 Review: Molecular Mechanisms of Missense Mutations

interactions and considerably accelerated the catal-ysis [183]. At the same time, the crystal structure ofthe EGFR L858R mutant revealed that this mutationprevented the activation loop from adopting theinactive conformation [184], thereby activating thekinase. On the other hand, the secondary EGFRT790M mutation facilitated the transition betweeninactive and active conformations and increased thestability of the active conformation [185]. Enhancedmobility was observed near cancer mutations sites,and modeling of autoinhibited conformationsrevealed that these mutations disrupted the localstabilizing interactions and destabilized the inactiveform [186]. Recently, the effect of cancer mutationsin RTKs was analyzed using the dataset of allavailable structural pairs of active and inactivestates. It was shown that cancer mutations destabi-lized active states, but to a lesser degree than therandom mutations, moreover, to a lesser degreethan mutations destabilized inactive states [187].This led to kinase activation. The same study found arelationship between the statistics-based estimate ofoncogenic potential of mutation and its activationeffect calculated based on thermodynamics princi-ples. Namely, more frequent mutations had a higheractivating effect.

Multiple Mutations

A more comprehensive understanding of molecu-lar mechanisms of disease mutations may comefrom the analysis of the effects of multiple mutationsoccurring in the same gene. It was demonstratedpreviously that the number of cases where multiplemutations may occur simultaneously in one gene ishigher than the number predicted from the randommutation distribution [188]. Such multiple mutationsmay be the result of an accumulation of singlemutations in sequential cell replications or in thesame cell cycle. The latter synchronous mutationsare usually characterized by non-random proximalspacing in higher eukaryotes [189]. At the sametime, a recent study showed that the extent of theclustering of cancer mutations might differ betweenoncogenes and tumor suppressors and the formerare found to be clustered while the latter are not [41].A significant fraction of cancer-associated mutationscomes in doublets or triplets where overall about6–8% of all cancer mutations represent doublecancer mutations [187,189,190]. Similar to singlemutations, the majority of cancer multiple mutationsoccur in non-synonymous codon sites implying thatdoublets are under positive selection [187,191]. Theobserved and simulated spectra of double mutationsgenerated from the spectra of single mutations werefound to be significantly different for RTK genes,which pointed to different mechanisms underlyingsingle and double mutation spectra [187].

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

Moreover, multiple disease mutations might havea synergistic phenotypic effect manifested in en-hanced or decreased protein stability or activity. Forexample, a double mutation was found in β-amyloidprecursor protein that resulted in an increasedproduction and secretion of amyloid-β peptidecausing early-onset Alzheimer's disease [192]. Onthe other hand, double mutations in the α-galacto-sidase encoding gene, which cause Fabry disease,resulted in reduced or non-detectable activity com-pared to single mutants [193]. A positive epistasiswas found for many double cancer mutants in RTKgenes [187], namely, the effect of multiple mutationson kinase activity was higher than a total of individualmutations. This trend was especially pronounced fordouble mutations observed in more than one tumorsample.

Concluding Remarks

The effects of human missense mutations onvarious biophysical characteristics, stability, hydro-gen bond network, dynamics and activity werereviewed in this work. It was demonstrated thatmutations can frequently affect several biophysicalcharacteristics simultaneously and may or may notcause diseases (see also Ref. [194]). There is noclear threshold of how large the change of thewild-type characteristics should be in order to alterprotein function and result in disease. Predictions ofdamaging effects of mutations are further complicat-ed by the observations that enhanced protein activity[175], higher stability or binding affinity [43,44] arenot necessarily advantageous for the cell and can bedisease causing.

Acknowledgements

S.S., M.P, and E.A. acknowledge the support fromthe National Institutes of Health grant numberR01GM093937. H.N. and A.R.P. were supportedby the Intramural Research Program of the NationalLibrary of Medicine at the US National Institutes ofHealth.

Received 6 May 2013;Received in revised form 4 July 2013;

Accepted 10 July 2013

Keywords:genetic variation;

single nucleotide polymorphism;SNP;

rare mutations;diseases

isease-Causing Missense Mutations, J Mol Biol (2013), http://

13Review: Molecular Mechanisms of Missense Mutations

Present address: H. Nishi, Yokohama City University,Tsurumi-ku, Yokohama 230-0045, Japan.

† S.S. and H.N. contributed equally to this work.‡ http://www.nbcr.net/hbonanza

Abbreviations used:SNP, single nucleotide polymorphism; HDL, high-densitylipoprotein; 3D, three-dimensional; SVM, support vector

machine; MD, molecular dynamics; RTK, receptortyrosine kinase.

References

[1] Taillon-Miller P, Gu Z, Li Q, Hillier L, Kwok PY. Overlappinggenomic sequences: a treasure trove of single-nucleotidepolymorphisms. Genome Res 1998;8:748–54.

[2] De Baets G, Van Durme J, Reumers J, Maurer-Stroh S,Vanhee P, Dopazo J, et al. SNPeffect 4.0: on-line predictionof molecular and structural effects of protein-codingvariants. Nucleic Acids Res 2012;40:D935–9.

[3] Sturm RA, Frudakis TN. Eye colour: portals into pigmenta-tion genes and ancestry. Trends Genet 2004;20:327–32.

[4] Fernald GH, Capriotti E, Daneshjou R, Karczewski KJ,Altman RB. Bioinformatics challenges for personalizedmedicine. Bioinformatics 2011;27:1741–8.

[5] Mooney S. Bioinformatics approaches and resources forsingle nucleotide polymorphism functional analysis. BriefBioinform 2005;6:44–56.

[6] Capriotti E, Calabrese R, Casadio R. Predicting theinsurgence of human genetic diseases associated to singlepoint protein mutations with support vector machines andevolutionary information. Bioinformatics 2006;22:2729–34.

[7] Collins FS, Guyer MS, Chakravarti A. Variations on atheme: cataloging human DNA sequence variation. Science1997;278:1580–1.

[8] Sherry ST, Ward MH, Kholodov M, Baker J, Phan L,Smigielski EM, et al. dbSNP: the NCBI database of geneticvariation. Nucleic Acids Res 2001;29:308–11.

[9] Raychaudhuri S. Mapping rare and common causal allelesfor complex human diseases. Cell 2011;147:57–69.

[10] Schork NJ, Murray SS, Frazer KA, Topol EJ. Common vs.rare allele hypotheses for complex diseases. Curr OpinGenet Dev 2009;19:212–9.

[11] Pritchard JK. Are rare variants responsible for susceptibilityto complex diseases? Am J Hum Genet 2001;69:124–37.

[12] Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPhersonR, Hobbs HH. Multiple rare alleles contribute to low plasmalevels of HDL cholesterol. Science 2004;305:869–72.

[13] Cohen JC, Pertsemlidis A, Fahmi S, Esmail S, Vega GL,Grundy SM, et al. Multiple rare variants in NPC1L1associated with reduced sterol absorption and plasmalow-density lipoprotein levels. Proc Natl Acad Sci USA2006;103:1810–5.

[14] Nelson MR, Wegmann D, Ehm MG, Kessner D, St Jean P,Verzilli C, et al. An abundance of rare functional variants in202 drug target genes sequenced in 14,002 people.Science 2012;337:100–4.

[15] Tennessen JA, Bigham AW, O'Connor TD, Fu W, KennyEE, Gravel S, et al. Evolution and functional impact of rarecoding variation from deep sequencing of human exomes.Science 2012;337:64–9.

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

[16] Bodmer W, Bonilla C. Common and rare variants inmultifactorial susceptibility to common diseases. Nat Genet2008;40:695–701.

[17] Veltman JA, Brunner HG. De novo mutations in humangenetic disease. Nat Rev Genet 2012;13:565–75.

[18] Ng SB, Bigham AW, Buckingham KJ, Hannibal MC,McMillin MJ, Gildersleeve HI, et al. Exome sequencingidentifies MLL2 mutations as a cause of Kabuki syndrome.Nat Genet 2010;42:790–3.

[19] Shen B, Bai J, Vihinen M. Physicochemical feature-basedclassification of amino acid mutations. Protein Eng Des Sel2008;21:37–44.

[20] Kato S, Han S-Y, Liu W, Otsuka K, Shibata H, Kanamaru R,et al. Understanding the function–structure and function–mutation relationships of p53 tumor suppressor protein byhigh-resolution missense mutation analysis. Proc Natl AcadSci 2003;100:8424–9.

[21] Teng S, Michonova-Alexova E, Alexov E. Approaches andresources for prediction of the effects of non-synonymoussingle nucleotide polymorphism on protein function andinteractions. Curr Pharm Biotechnol 2008;9:123–33.

[22] Zhang Z, Miteva MA,Wang L, Alexov E. Analyzing effects ofnaturally occurring missense mutations. Comput MathMethods Med 2012;2012:15.

[23] Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, FieldsS. A fundamental protein property, thermodynamic stability,revealed solely from large-scale measurements of proteinfunction. Proc Natl Acad Sci 2012;109:16858–63.

[24] Talley K, Alexov E. On the pH-optimum of activity andstability of proteins. Proteins 2010;78:2699–706.

[25] Zhang Z, Wang L, Gao Y, Zhang J, Zhenirovskyy M, AlexovE. Predicting folding free energy changes upon single pointmutations. Bioinformatics 2012;28:664–71.

[26] Worth CL, Preissner R, Blundell TL. SDM—a server forpredicting effects of mutations on protein stability andmalfunction. Nucleic Acids Res 2011;39:W215–22.

[27] Kang S, Chen G, Xiao G. Robust prediction ofmutation-induced protein stability change by property encod-ing of amino acids. Protein Eng Des Sel 2009;22:75–83.

[28] Zwerger M, Jaalouk DE, Lombardi ML, Isermann P,Mauermann M, Dialynas G, et al. Myopathic laminmutations impair nuclear stability in cells and tissue anddisrupt nucleo-cytoskeletal coupling. Hum Mol Genet 2013;22:2335–49.

[29] Xu AJ, Springer TA. Mechanisms by which von Willebranddisease mutations destabilize the A2 domain. J Biol Chem2013;288:6317–24.

[30] Rakoczy EP, Kiel C, McKeone R, Stricher F, Serrano L.Analysis of disease-linked rhodopsin mutations based onstructure, function, and protein stability calculations. J MolBiol 2011;405:584–606.

[31] An O, Gursoy A, Gurgey A, Keskin O. Structural andfunctional analysis of perforin mutations in association withclinical data of familial hemophagocytic lymphohistiocytosistype 2 (FHL2) patients. Protein Sci 2013;22:823–39.

[32] Goransdotter Ericson K, Fadeel B, Nilsson-Ardnor S,Soderhall C, Samuelsson A, Janka G, et al. Spectrum ofperforin gene mutations in familial hemophagocytic lym-phohistiocytosis. Am J Hum Genet 2001;68:590–7.

[33] Prusiner S. Molecular biology of prion diseases. Science1991;252:1515–22.

[34] Collinge J. Prion diseases of humans and animals: theircauses and molecular basis. Annu Rev Neurosci 2001;24:519–50.

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

14 Review: Molecular Mechanisms of Missense Mutations

[35] Kretzschmar H, Tatzelt J. Prion disease: a tale of folds andstrains. Brain Pathol 2013;23:321–32.

[36] Jetha NN, Semenchenko V, Wishart DS, Cashman NR,Marziali A. Nanopore analysis of wild-type and mutant prionprotein (PrP(C)): single molecule discrimination and PrP(C)kinetics. PLoS One 2013;8:e54982.

[37] Lin W, Kang UJ. Characterization of PINK1 processing,stability, and subcellular localization. J Neurochem2008;106:464–74.

[38] Nuytemans K, Theuns J, Cruts M, Van Broeckhoven C.Genetic etiology of Parkinson disease associated withmutations in the SNCA, PARK2, PINK1, PARK7, andLRRK2 genes: a mutation update. Hum Mutat 2010;31:763–80.

[39] Morais VA, Verstreken P, Roethig A, Smet J, Snellinx A,Vanbrabant M, et al. Parkinson's disease mutations inPINK1 result in decreased complex I activity and deficientsynaptic function. EMBO Mol Med 2009;1:99–111.

[40] Grant MA, Lazo ND, Lomakin A, Condron MM, Arai H,Yamin G, et al. Familial Alzheimer's disease mutationsalter the stability of the amyloid beta-protein monomerfolding nucleus. Proc Natl Acad Sci USA 2007;104:16522–7.

[41] Stehr H, Jang SH, Duarte JM, Wierling C, Lehrach H, LappeM, et al. The structural impact of cancer-associatedmissense mutations in oncogenes and tumor suppressors.Mol Cancer 2011;10:54.

[42] Shi Z, Moult J. Structural and functional impact ofcancer-related missense somatic mutations. J Mol Biol2011;413:495–512.

[43] Witham S, Takano K, Schwartz C, Alexov E. A missensemutation in CLIC2 associated with intellectual disability ispredicted by in silico modeling to affect protein stability anddynamics. Proteins 2011;79:2444–54.

[44] Takano K, Liu D, Tarpey P, Gallant E, Lam A, Witham S,et al. An X-linked channelopathy with cardiomegaly due to aCLIC2 mutation enhancing ryanodine receptor channelactivity. Hum Mol Genet 2012;21:4497–507.

[45] Spector S, Wang M, Carp SA, Robblee J, Hendsch ZS,Fairman R, et al. Rational modification of protein stability bythe mutation of charged surface residues†. Biochemistry2000;39:872–9.

[46] Zhang Z, Norris J, Schwartz C, Alexov E. In silico and invitro investigations of the mutability of disease-causingmissense mutation sites in spermine synthase. PLoS One2011;6:e20373.

[47] Teng S, Madej T, Panchenko A, Alexov E. Modeling effectsof human single nucleotide polymorphisms on protein–protein interactions. Biophys J 2009;96:2178–88.

[48] Nishi H, Tyagi M, Teng S, Shoemaker BA, Hashimoto K,Alexov E, et al. Cancer missense mutations alter bindingproperties of proteins and their interaction networks. PLoSOne 2013;8:e66273.

[49] Zhang Z, Miteva MA,Wang L, Alexov E. Analyzing effects ofnaturally occurring missense mutations. Comput MathMethods Med 2012;2012:805827.

[50] Zhang Z, Norris J, Kalscheuer V, Wood T, Wang L,Schwartz C, et al. A Y328C missense mutation inspe rm ine syn thase causes a mi l d f o rm o fSnyder-Robinson syndrome. Hum Mol Genet 2013 [Epubahead of print, PMID: 23696453, in press].

[51] Teng S, Srivastava AK, Wang L. Sequence feature-basedprediction of protein stability changes upon amino acidsubstitutions. BMC Genomics 2010;11:S5.

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

[52] Capriotti E, Fariselli P, Casadio R. I-Mutant2.0: predictingstability changes upon mutation from the protein sequenceor structure. Nucleic Acids Res 2005;33:W306–10.

[53] Cheng J, Randall A, Baldi P. Prediction of protein stabilitychanges for single-site mutations using support vectormachines. Proteins Struct Funct Bioinf 2006;62:1125–32.

[54] DehouckY,Grosfils A, FolchB,Gilis D, Bogaerts P,RoomanM. Fast and accurate predictions of protein stability changesupon mutations using statistical potentials and neuralnetworks: PoPMuSiC-2.0. Bioinformatics 2009;25:2537–43.

[55] Ng PC, Henikoff S. SIFT: predicting amino acid changesthat affect protein function. Nucleic Acids Res 2003;31:3812–4.

[56] Ramensky V, Bork P, Sunyaev S. Human non-synonymousSNPs: server and survey. Nucleic Acids Res 2002;30:3894–900.

[57] Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE,Gerasimova A, Bork P, et al. A method and server forpredicting damaging missense mutations. Nat Methods2010;7:248–9.

[58] Gonnelli G, Rooman M, Dehouck Y. Structure-basedmutant stability predictions on proteins of unknown struc-ture. J Biotechnol 2012;161:287–93.

[59] Parthiban V, Gromiha MM, Schomburg D. CUPSAT:prediction of protein stability upon point mutations. NucleicAcids Res 2006;34:W239–42.

[60] Guerois R, Nielsen JE, Serrano L. Predicting changes in thestability of proteins and protein complexes: a study of morethan 1000 mutations. J Mol Biol 2002;320:369–87.

[61] Ding F, Dokholyan NV. Emergence of protein fold familiesthrough rational design. PLoS Comput Biol 2006;2:e85.

[62] Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN,et al. Automated inference of molecular mechanisms ofdisease from amino acid substitutions. Bioinformatics2009;25:2744–50.

[63] Benedix A, Becker CM, de Groot BL, Caflisch A, BockmannRA. Predicting free energy changes using structuralensembles. Nat Methods 2009;6:3–4.

[64] de Groot BL, van Aalten DM, Scheek RM, Amadei A, VriendG, Berendsen HJ. Prediction of protein conformationalfreedom fromdistance constraints. Proteins 1997;29:240–51.

[65] Li L, Li C, Sarkar S, Zhang J, Witham S, Zhang Z, et al.DelPhi: a comprehensive suite for DelPhi software andassociated resources. BMC Biophys 2012;5:9.

[66] Venselaar H, Te Beek TA, Kuipers RK, Hekkelman ML,Vriend G. Protein structure analysis of mutations causinginheritable diseases: an e-Science approach with lifescientist friendly interfaces. BMC Bioinf 2010;11:548.

[67] Khan S, Vihinen M. Performance of protein stabilitypredictors. Hum Mutat 2010;31:675–84.

[68] Masso M, Vaisman II. Accurate prediction of stabilitychanges in protein mutants by combining machine learningwith structure based computational mutagenesis. Bioinfor-matics 2008;24:2002–9.

[69] Potapov V, Cohen M, Schreiber G. Assessing computa-tional methods for predicting protein stability upon mutation:good on average but not in the details. Protein Eng Des Sel2009;22:553–60.

[70] Thiltgen G, Goldstein RA. Assessing predictors of changesin protein stability upon mutation using self-consistency.PLoS One 2012;7:e46084.

[71] Vihinen M. How to evaluate performance of predictionmethods? Measures and their interpretation in variationeffect analysis. BMC Genomics 2012;13:S2.

isease-Causing Missense Mutations, J Mol Biol (2013), http://

15Review: Molecular Mechanisms of Missense Mutations

[72] Thusberg J, Olatubosun A, Vihinen M. Performance ofmutation pathogenicity prediction methods on missensevariants. Hum Mutat 2011;32:358–68.

[73] Zhou H, Zhou Y. Distance-scaled, finite ideal-gas referencestate improves structure-derived potentials of mean forcefor structure selection and stability prediction. Protein Sci2002;11:2714–26.

[74] Deutsch C, Krishnamoorthy B. Four-body scoring functionfor mutagenesis. Bioinformatics 2007;23:3009–15.

[75] Dosztanyi Z, Magyar C, Tusnady G, Simon I. SCide:identification of stabilization centers in proteins. Bioinfor-matics 2003;19:899–900.

[76] Kurgan L, Cios K, Chen K. SCPRED: accurate prediction ofprotein structural class for sequences of twilight-zonesimilarity with predicting sequences. BMC Bioinf 2008;9:226.

[77] Magyar C, Gromiha MM, Pujadas G, Tusnady GE, Simon I.SRide: a server for identifying stabilizing residues inproteins. Nucleic Acids Res 2005;33:W303–5.

[78] Bao L, Zhou M, Cui Y. nsSNPAnalyzer: identifyingdisease-associated nonsynonymous single nucleotidepolymorphisms. Nucleic Acids Res 2005;33:W480–2.

[79] Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B,Daverman R, et al. PANTHER: a library of protein familiesand subfamilies indexed by function. Genome Res 2003;13:2129–41.

[80] Bromberg Y, Rost B. SNAP: predict effect ofnon-synonymous polymorphisms on function. NucleicAcids Res 2007;35:3823–35.

[81] Calabrese R, Capriotti E, Fariselli P, Martelli PL, Casadio R.Functional annotations improve the predictive score ofhuman disease-related mutations in proteins. Hum Mutat2009;30:1237–44.

[82] Gromiha MM, Huang LT. Machine learning algorithms forpredicting protein folding rates and stability of mutantproteins: comparison with statistical methods. Curr ProteinPept Sci 2011;12:490–502.

[83] Kumar S, Suleski MP, Markov GJ, Lawrence S, Marco A,Filipski AJ. Positional conservation and amino acids shapethe correct diagnosis and population frequencies of benignand damaging personal amino acid mutations. GenomeRes 2009;19:1562–9.

[84] Nisius L, Grzesiek S. Key stabilizing elements of proteinstructure identified through pressure and temperatureperturbation of its hydrogen bond network. Nat Chem2012;4:711–7.

[85] Gong S, Worth CL, Bickerton GR, Lee S, Tanramluk D,Blundell TL. Structural and functional restraints in theevolution of protein families and superfamilies. BiochemSoc Trans 2009;37:727–33.

[86] Horowitz S, Trievel RC. Carbon-oxygen hydrogen bondingin biological structure and function. J Biol Chem 2012;287:41576–82.

[87] Pandit RA, Chen CJ, Butt TA, Islam N. Identification andanalysis of a novel mutation in PEPD gene in two Kashmirisiblings with prolidase enzyme deficiency. Gene 2013;516:316–9.

[88] Nguyen HH, Hannemann F, HartmannMF, Malunowicz EM,Wudy SA, Bernhardt R. Five novel mutations in CYP11B2gene detected in patients with aldosterone synthasedeficiency type I: functional characterization and structuralanalyses. Mol Genet Metab 2010;100:357–64.

[89] Fontaine M, Dessein AF, Douillard C, Dobbelaere D, BrivetM, Boutron A, et al. A novel mutation in CPT1A resulting inhepatic CPT deficiency. JIMD Rep 2012;6:7–14.

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

[90] Schurmann K, Anton M, Ivanov I, Richter C, Kuhn H,Walther M. Molecular basis for the reduced catalytic activityof the naturally occurring T560M mutant of human12/15-lipoxygenase that has been implicated in coronaryartery disease. J Biol Chem 2011;286:23920–7.

[91] Wang S, Zhao WJ, Liu H, Gong H, Yan YB. IncreasingbetaB1-crystallin sensitivity to proteolysis caused by thecongenital cataract-microcornea syndrome mutationS129R. Biochim Biophys Acta 2013;1832:302–11.

[92] Attanasio F, Convertino M, Magno A, Caflisch A, CorazzaA, Haridas H, et al. Carnosine inhibits Abeta42 aggrega-tion by perturbing the H-bond network in and around thecentral hydrophobic cluster. ChemBioChem 2013;14:583–92.

[93] Murray MM, Krone MG, Bernstein SL, Baumketner A,Condron MM, Lazo ND, et al. Amyloid beta-protein:experiment and theory on the 21–30 fragment. J PhysChem B 2009;113:6041–6.

[94] Mizuguchi M, Yokoyama T, Nabeshima Y, Kawano K,Tanaka I, Niimura N. Quaternary structure, aggregation andcytotoxicity of transthyretin. Amyloid 2012;19:5–7.

[95] Kumar A, Purohit R. Computational investigation of patho-genic nsSNPs in CEP63 protein. Gene 2012;503:75–82.

[96] Liou B, Kazimierczuk A, Zhang M, Scott CR, Hegde RS,Grabowski GA. Analyses of variant acid β-glucosidases:effects of gaucher disease mutations. J Biol Chem2006;281:4242–53.

[97] Bal NC, Sharon A, Gupta SC, Jena N, Shaikh S, Gyorke S,et al. The catecholaminergic polymorphic ventricular tachy-cardia mutation R33Q disrupts the N-terminal structuralmotif that regulates reversible calsequestrin polymerization.J Biol Chem 2010;285:17188–96.

[98] Kumar A, Purohit R. Computational screening and molec-ular dynamics simulation of disease associated nsSNPs inCENP-E. Mutat Res 2012;738–739:28–37.

[99] Vieyra D, Senger DL, Toyama T, Muzik H, Brasher PM,Johnston RN, et al. Altered subcellular localization and lowfrequency of mutations of ING1 in human brain tumors. ClinCancer Res 2003;9:5952–61.

[100] Kunishima S, Matsushita T, Kojima T, Sako M, Kimura F, JoEK, et al. Immunofluorescence analysis of neutrophilnonmuscle myosin heavy chain-A in MYH9 disorders:association of subcellular localization with MYH9mutations.Lab Invest 2003;83:115–22.

[101] Yokoyama T, Mizuguchi M, Nabeshima Y, Kusaka K,Yamada T, Hosoya T, et al. Hydrogen-bond network andpH sensitivity in transthyretin: neutron crystal structure ofhuman transthyretin. J Struct Biol 2012;177:283–90.

[102] Garai K, Baban B, Frieden C. Self-association and stabilityof the ApoE isoforms at low pH: implications for ApoE-lipidinteractions. Biochemistry 2011;50:6356–64.

[103] Imbard A, Boutron A, Vequaud C, Zater M, de Lonlay P, deBaulny HO, et al. Molecular characterization of 82 patientswith pyruvate dehydrogenase complex deficiency. Struc-tural implications of novel amino acid substitutions in E1protein. Mol Genet Metab 2011;104:507–16.

[104] Smith AT, Su Y, Stevens DJ, Majtan T, Kraus JP, BurstynJN. Effect of the disease-causing R266K mutation on theheme and PLP environments of human cystathioninebeta-synthase. Biochemistry 2012;51:6360–70.

[105] Padhi AK, Kumar H, Vasaikar SV, Jayaram B, Gomes J.Mechanisms of loss of functions of human angiogeninvariants implicated in amyotrophic lateral sclerosis. PLoSOne 2012;7:e32479.

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

16 Review: Molecular Mechanisms of Missense Mutations

[106] Musayev FN, Di Salvo ML, Saavedra MA, Contestabile R,Ghatge MS, Haynes A, et al. Molecular basis of reducedpyridoxine 5′-phosphate oxidase catalytic activity in neonatalepileptic encephalopathy disorder. J Biol Chem 2009;284:30949–56.

[107] Zhang Z, Teng S, Wang L, Schwartz CE, Alexov E.Computational analysis of missense mutations causingSnyder-Robinson syndrome. Hum Mutat 2010;31:1043–9.

[108] Gorbenko G, Trusova V. Protein aggregation in a membraneenvironment. Adv Protein Chem Struct Biol 2011;84:113–42.

[109] Goto Y, Yagi H, Yamaguchi K, Chatani E, Ban T. Structure,formation and propagation of amyloid fibrils. Curr PharmDes 2008;14:3205–18.

[110] Khare SD, Dokholyan NV. Molecular mechanisms ofpolypeptide aggregation in human diseases. Curr ProteinPept Sci 2007;8:573–9.

[111] Naeem A, Fazili NA. Defective protein folding and aggre-gation as the basis of neurodegenerative diseases: thedarker aspect of proteins. Cell Biochem Biophys 2011;61:237–50.

[112] Wilbur JD, Hwang PK, Brodsky FM, Fletterick RJ. Accom-modat ion of st ructura l rearrangements in thehuntingtin-interacting protein 1 coiled-coil domain. ActaCrystallogr Sect D Biol Crystallogr 2010;66:314–8.

[113] Mason AB, Halbrooks PJ, James NG, Byrne SL, Grady JK,Chasteen ND, et al. Structural and functional consequencesof the substitution of glycine 65 with arginine in the N-lobe ofhuman transferrin. Biochemistry 2009;48:1945–53.

[114] Garcia-Moreno B. Adaptations of proteins to cellular andsubcellular pH. J Biol 2009;8:98.

[115] Mitra RC, Zhang Z, Alexov E. In silico modeling ofpH-optimum of protein–protein binding. Proteins 2011;79:925–36.

[116] Chan P, Warwicker J. Evidence for the adaptation of proteinpH-dependence to subcellular pH. BMC Biol 2009;7:69.

[117] Kulkarni MV, Tettamanzi MC, Murphy JW, Keeler C,Myszka DG, Chayen NE, et al. Two independent histidines,one in human prolactin and one in its receptor, are critical forpH-dependent receptor recognition and activation. J BiolChem 2010;285:38524–33.

[118] Wang L, Witham S, Zhang Z, Li L, Hodsdon M, Alexov E. Insilico investigation of pH-dependence of prolactin andhuman growth hormone binding to human prolactinreceptor. 2013.

[119] Reichold M, Zdebik AA, Lieberer E, Rapedius M, Schmidt K,Bandulik S, et al. KCNJ10 gene mutations causing EASTsyndrome (epilepsy, ataxia, sensorineural deafness, andtubulopathy) disrupt channel function. Proc Natl Acad SciUSA 2010;107:14490–5.

[120] Hanemann CO, D'Urso D, Gabreels-Festen AA, Muller HW.Mutation-dependent alteration in cellular distribution ofperipheral myelin protein 22 in nerve biopsies fromCharcot-Marie-Tooth type 1A. Brain 2000;123:1001–6.

[121] PetukhM, Stefl S, Alexov E. The role of protonation states inligand-receptor recognition and binding. Curr Pharm Des2012;19:4182–90.

[122] Nelson MT, Humphrey W, Gursoy A, Dalke A, Kalé LV,Skeel RD, et al. NAMD: a parallel, object-oriented moleculardynamics program. Int J Supercomp Applic High Perf Comp1996;10:251–68.

[123] Brooks BR, Bruccoleri RE, Olafson BD, States DJ,Swaminathan S, Karplus M. CHARMM: a program formacromolecular energy, minimization, and dynamics cal-culations. J Comput Chem 1983;4:187–217.

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

[124] Case DA, Cheatham TE, Darden T, Gohlke H, Luo R, MerzKM, et al. The Amber biomolecular simulation programs. JComput Chem 2005;26:1668–88.

[125] ChristenM, Hunenberger PH, Bakowies D, Baron R, Burgi R,Geerke DP, et al. The GROMOS software for biomolecularsimulation: GROMOS05. J Comput Chem 2005;26:1719–51.

[126] Berendsen HJC, van der Spoel D, van Drunen R.GROMACS: a message-passing parallel molecular dynam-ics implementation. Comput Phys Commun 1995;91:43–56.

[127] Word JM, Lovell SC, Richardson JS, Richardson DC.Asparagine and glutamine: using hydrogen atom contacts inthe choice of side-chain amide orientation. J Mol Biol1999;285:1735–47.

[128] Dolinsky TJ, Czodrowski P, Li H, Nielsen JE, Jensen JH,Klebe G, et al. PDB2PQR: expanding and upgradingautomated preparation of biomolecular structures formolecular simulations. Nucleic Acids Res 2007;35:W522–5.

[129] Durrant JD, McCammon JA. HBonanza: a computeralgorithm for molecular-dynamics-trajectory hydrogen-bondanalysis. J Mol Graphics Modell 2011;31:5–9.

[130] Murray AE. A routine method for the quantification ofphysical change in melanocytic naevi using digital imageprocessing. J Audiov Media Med 1988;11:52–7.

[131] Georgescu RE, Alexov EG, Gunner MR. Combiningconformational flexibility and continuum electrostatics forcalculating pK(a)s in proteins. Biophys J 2002;83:1731–48.

[132] Li H, Robertson AD, Jensen JH. Very fast empiricalprediction and rationalization of protein pKa values. Pro-teins 2005;61:704–21.

[133] Gordon JC, Myers JB, Folta T, Shoja V, Heath LS, OnufrievA. H++: a server for estimating pKas and adding missinghydrogens to macromolecules. Nucleic Acids Res 2005;33:W368–71.

[134] Maier T, Leibundgut M, Boehringer D, Ban N. Structure andfunction of eukaryotic fatty acid synthases. Q Rev Biophys2010;43:373–422.

[135] Svetlov V, Nudler E. Macromolecular micromovements:how RNA polymerase translocates. Curr Opin Struct Biol2009;19:701–7.

[136] Kokkinidis M, Glykos NM, Fadouloglou VE. Protein flexibilityand enzymatic catalysis. Adv Protein Chem Struct Biol2012;87:181–218.

[137] Chow MK, Lomas DA, Bottomley SP. Promiscuousbeta-strand interactions and the conformational diseases.Curr Med Chem 2004;11:491–9.

[138] Oldfield CJ, Xue B, Van YY, Ulrich EL, Markley JL, DunkerAK, et al. Utilization of protein intrinsic disorder knowledgein structural proteomics. Biochim Biophys Acta 2013;1834:487–98.

[139] Xue B, Dunker AK, Uversky VN. Orderly order in proteinintrinsic disorder distribution: disorder in 3500 proteomesfrom viruses and the three domains of life. J Biomol StructDyn 2012;30:137–49.

[140] Li M, Shoemaker BA, Thangudu RR, Ferraris JD, Burg MB,Panchenko AR. Mutations in DNA-binding loop of NFAT5transcription factor produce unique outcomes on protein–DNA binding and dynamics. J Phys Chem B 2013 [Epubahead of print, PMID: 23734591, in press].

[141] Ponder J, Richards F. TINKER molecular modelingpackage. J Comput Chem 1987;8:1016–24.

[142] George Priya Doss C, Rajith B. A new insight into structuraland functional impact of single-nucleotide polymorphisms inPTEN gene. Cell Biochem Biophys 2012;66:249–63.

isease-Causing Missense Mutations, J Mol Biol (2013), http://

17Review: Molecular Mechanisms of Missense Mutations

[143] Wang D, Robertson IM, Li MX, McCully ME, Crane ML, LuoZ, et al. Structural and functional consequences of thecardiac troponin C L48Q Ca(2+)-sensitizing mutation.Biochemistry 2012;51:4473–87.

[144] Nyon MP, Segu L, Cabrita LD, Levy GR, Kirkpatrick J,Roussel BD, et al. Structural dynamics associated withintermediate formation in an archetypal conformationaldisease. Structure 2012;20:504–12.

[145] Wang B, Wen X, Qin X, Wang Z, Tan Y, Shen Y, et al.Quantitative structural insight into human variegate por-phyria disease. J Biol Chem 2013;17:11731–40.

[146] Kumar A, Rajendran V, Sethumadhavan R, Purohit R.Evidence of colorectal cancer-associated mutation in MCAK:a computational report. Cell Biochem Biophys 2013;1:55.

[147] Tang M, Facchiano A, Rachamadugu R, Calderon F, MaoR, Milanesi L, et al. Correlation assessment among clinicalphenotypes, expression analysis andmolecular modeling of14 novel variations in the human galactose-1-phosphateuridylyltransferase gene. Hum Mutat 2012;33:1107–15.

[148] Acsadi G, Moore SA, Cheron A, Delalande O, Bennett L,Kupsky W, et al. Novel mutation in spectrin-like repeat 1 ofdystrophin central domain causes protein misfolding andmild Becker muscular dystrophy. J Biol Chem 2012;287:18153–62.

[149] Uversky VN, Oldfield CJ, Midic U, Xie H, Xue B, Vucetic S,et al. Unfoldomics of humandiseases: linking protein intrinsicdisorder with diseases. BMC Genomics 2009;10:S7.

[150] Uversky VN, Iakoucheva LM, Dunker AK. Protein disorderand human genetic disease. eLS. John Wiley & Sons, Ltd;2001.

[151] Kass I, Knaupp AS, Bottomley SP, Buckle AM. Conforma-tional properties of the disease-causing Z variant ofalpha1-antitrypsin revealed by theory and experiment.Biophys J 2012;102:2856–65.

[152] Doring G. Serine proteinase inhibitor therapy inalpha(1)-antitrypsin inhibitor deficiency and cystic fibrosis.Pediatr Pulmonol 1999;28:363–75.

[153] Jahandideh S, Zhi D. Systematic investigation of predictedeffect of nonsynonymous SNPs in human prion proteingene: a molecular modeling and molecular dynamics study.J Biomol Struct Dyn 2013 [Epub ahead of print, PMID:23527686, in press].

[154] Zhang Z, Liu M, Li B, Wang Y, Yue J, Liang L, et al.Exploring the mechanism of a regulatory SNP of KLK3 bymolecular dynamics simulation. J Biomol Struct Dyn2013;31:426–40.

[155] Kote-Jarai Z, Amin Al Olama A, Leongamornlert D,Tymrakiewicz M, Saunders E, Guy M, et al. Identificationof a novel prostate cancer susceptibility variant in the KLK3gene transcript. Hum Genet 2011;129:687–94.

[156] O'Mara TA, Nagle CM, Batra J, Kedda MA, Clements JA,Spurdle AB. Kallikrein-related peptidase 3 (KLK3/PSA)single nucleotide polymorphisms and ovarian cancersurvival. Twin Res Hum Genet 2011;14:323–7.

[157] Nicchia GP, Ficarella R, Rossi A, Giangreco I, Nicolotti O,Carotti A, et al. D184Emutation in aquaporin-4 gene impairswater permeability and links to deafness. Neuroscience2011;197:80–8.

[158] Iakoucheva LM, Brown CJ, Lawson JD, Obradovic Z,Dunker AK. Intrinsic disorder in cell-signaling andcancer-associated proteins. J Mol Biol 2002;323:573–84.

[159] Bugg CW, Isas JM, Fischer T, Patterson PH, Langen R.Structural features and domain organization of huntingtinfibrils. J Biol Chem 2012;287:31739–46.

Please cite this article as: Stefl S, et al, Molecular Mechanisms ofdx.doi.org/10.1016/j.jmb.2013.07.014

[160] Hu Y, Liu Y, Jung J, Dunker AK, Wang Y. Changes inpredicted protein disorder tendency may contribute todisease risk. BMC Genomics 2011;12:S2.

[161] Vacic V, Iakoucheva LM. Disease mutations in disorderedregions—exception to the rule? Mol Biosyst 2012;8:27–32.

[162] Sakakura M, Hadziselimovic A, Wang Z, Schey KL,Sanders CR. Structural basis for the Trembler-J phenotypeof Charcot-Marie-Tooth disease. Structure 2011;19:1160–9.

[163] van der Kamp MW, Daggett V. Pathogenic mutations in thehydrophobic core of the human prion protein can promotestructural instability and misfolding. J Mol Biol 2010;404:732–48.

[164] Di Marino D, Achsel T, Lacoux C, Falconi M, Bagni C.Molecular dynamics simulations show how the FMRPIle304Asn mutation destabilizes the KH2 domain structureand affects its function. J Biomol Struct Dyn 2013 [Epubahead of print, PMID: 23527791, in press].

[165] Tokuriki N, Stricher F, Serrano L, Tawfik DS. How proteinstability and new functions trade off. PLoS Comput Biol2008;4:e1000002.

[166] Dessailly BH, LensinkMF,Wodak SJ. Relating destabilizingregions to known functional sites in proteins. BMC Bioinf2007;8:141.

[167] Elcock AH. Prediction of functionally important residuesbased solely on the computed energetics of proteinstructure. J Mol Biol 2001;312:885–96.

[168] Studer RA, Dessailly BH, Orengo CA. Residue mutationsand their impact on protein structure and function: detectingbeneficial and pathogenic changes. Biochem J 2013;449:581–94.

[169] Meiering EM, Serrano L, Fersht AR. Effect of active siteresidues in barnase on activity and stability. J Mol Biol1992;225:585–9.

[170] Shoichet BK, Baase WA, Kuroki R, Matthews BW. Arelationship between protein stability and protein function.Proc Natl Acad Sci USA 1995;92:452–6.

[171] Pazos F, Sternberg MJ. Automated prediction of proteinfunction and detection of functional sites from structure.Proc Natl Acad Sci USA 2004;101:14754–9.

[172] Lopez G, Valencia A, Tress ML. firestar—prediction offunctionally important residues using structural templatesand alignment reliability. Nucleic Acids Res 2007;35:W573–7.

[173] Shoemaker BA, Zhang D, Tyagi M, Thangudu RR, Fong JH,Marchler-Bauer A, et al. IBIS (Inferred BiomolecularInteraction Server) reports, predicts and integrates multipletypes of conserved interactions for proteins. Nucleic AcidsRes 2012;40:D834–40.

[174] Capra JA, Laskowski RA, Thornton JM, Singh M,Funkhouser TA. Predicting protein ligand binding sites bycombining evolutionary sequence conservation and 3Dstructure. PLoS Comput Biol 2009;5:e1000585.

[175] Zhang Z, Zheng Y, Petukh M, Pegg A, Ikeguchi Y, Alexov E.Enhancing human spermine synthase activity by engi-neered mutations. PLoS Comput Biol 2013;9:e1002924.

[176] Singaraja RR, Visscher H, James ER, Chroni A, CoutinhoJM, Brunham LR, et al. Specific mutations in ABCA1 havediscrete effects on ABCA1 function and lipid phenotypesboth in vivo and in vitro. Circ Res 2006;99:389–97.

[177] Nagao K, Zhao Y, Takahashi K, Kimura Y, Ueda K. Sodiumtaurocholate-dependent lipid efflux by ABCA1: effects ofW590S mutation on lipid translocation and apolipoproteinA-I dissociation. J Lipid Res 2009;50:1165–72.

Disease-Causing Missense Mutations, J Mol Biol (2013), http://

18 Review: Molecular Mechanisms of Missense Mutations

[178] Wang LJ, Wang J, Li N, Ge L, Li BL, Song BL. Molecularcharacterization of the NPC1L1 variants identified fromcholesterol low absorbers. J Biol Chem 2011;286:7397–408.

[179] De Jong JC, Van Der Vliet WA, Van Den Heuvel LP,Willems PH, Knoers NV, Bindels RJ. Functional expressionof mutations in the human NaCl cotransporter: evidence forimpaired routingmechanisms in Gitelman's syndrome. J AmSoc Nephrol 2002;13:1442–8.

[180] Martin GS. The road to Src. Oncogene 2004;23:7910–7.[181] Izarzugaza JM, Redfern OC, Orengo CA, Valencia A.

Cancer-associated mutations are preferentially distributedin protein kinase functional sites. Proteins 2009;77:892–903.

[182] Shan Y, Eastwood MP, Zhang X, Kim ET, Arkhipov A, DrorRO, et al. Oncogenic mutations counteract intrinsic disorderin the EGFR kinase and promote receptor dimerization. Cell2012;149:860–70.

[183] Yun CH, Boggon TJ, Li Y, Woo MS, Greulich H, MeyersonM, et al. Structures of lung cancer-derived EGFR mutantsand inhibitor complexes: mechanism of activation andinsights into differential inhibitor sensitivity. Cancer Cell2007;11:217–27.

[184] Zhang X, Gureasko J, Shen K, Cole PA, Kuriyan J. Anallosteric mechanism for activation of the kinase domain ofepidermal growth factor receptor. Cell 2006;125:1137–49.

[185] Yun CH, Mengwasser KE, Toms AV, Woo MS, Greulich H,Wong KK, et al. The T790M mutation in EGFR kinasecauses drug resistance by increasing the affinity for ATP.Proc Natl Acad Sci USA 2008;105:2070–5.

[186] Dixit A, Verkhivker GM. Hierarchical modeling of activationmechanisms in the ABL and EGFR kinase domains:

Please cite this article as: Stefl S, et al, Molecular Mechanisms of Ddx.doi.org/10.1016/j.jmb.2013.07.014

thermodynamic and mechanistic catalysts of kinase activa-tion by cancer mutations. PLoS Comput Biol 2009;5:e1000487.

[187] Hashimoto K, Rogozin IB, Panchenko AR. Oncogenicpotential is related to activating effect of cancer single anddouble somatic mutations in receptor tyrosine kinases. HumMutat 2012;33:1566–75.

[188] Colgin LM, Hackmann AF, Emond MJ, Monnat RJ. Theunexpected landscape of in vivo somatic mutation in ahuman epithelial cell lineage. Proc Natl Acad Sci USA2002;99:1437–42.

[189] Chen JM, Ferec C, Cooper DN. Closely spaced multiplemutations as potential signatures of transient hypermut-ability in human genes. Hum Mutat 2009;30:1435–48.

[190] Chen Z, Feng J, Buzin CH, Sommer SS. Epidemiology ofdoublet/multiplet mutations in lung cancers: evidence that asubset arises by chronocoordinate events. PLoS One2008;3:e3714.

[191] Meng L, Lin L, Zhang H, Nassiri M, Morales AR, Nadji M.Multiple mutations of the p53 gene in human mammarycarcinoma. Mutat Res 1999;435:263–9.

[192] Haass C, Lemere CA, Capell A, Citron M, Seubert P,Schenk D, et al. The Swedish mutation causes early-onsetAlzheimer's disease by beta-secretase cleavage within thesecretory pathway. Nat Med 1995;1:1291–6.

[193] Yasuda M, Shabbeer J, Benson SD, Maire I, Burnett RM,Desnick RJ. Fabry disease: characterization of alpha-galac-tosidase A double mutations and the D313Y plasma enzymepseudodeficiency allele. Hum Mutat 2003;22:486–92.

[194] Gong S, Blundell TL. Structural and functional restraints onthe occurrence of single amino acid variations in humanproteins. PLoS One 2010;5:e9186.

isease-Causing Missense Mutations, J Mol Biol (2013), http://