Methionine to isothreonine conversion as a source of false discovery identifications of genetically encoded variants in proteogenomics

Alexey L. Chernobrovkin a,b, Arthur T. Kopylov a, Victor G. Zgoda a, Alexander A. Moysa a, Mikhail A. Pyatnitskiy a, Ksenia G. Kuznetsova a, Irina Y. Ilina a, Maria A. Karpova a, Dmitry S. Karpov a,c, Alexander V. Veselovsky a, Mark V. Ivanov d,e, Mikhail V. Gorshkov d,e, Alexander I. Archakov a, Sergei A. Moshkovskii a,f

a Institute of Biomedical Chemistry, Moscow 119121, Russia
b Karolinska Institutet, Stockholm SE-171 77, Sweden
c Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
d Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
e Moscow Institute of Physics and Technology (State University), Moscow region, Dolgoprudny 141700, Russia
f Pirogov Russian National Research Medical University, Moscow 117997, Russia

Journal of Proteomics 120 (2015) 169–178
Available online at www.sciencedirect.com (ScienceDirect); www.elsevier.com/locate/jprot
http://dx.doi.org/10.1016/j.jprot.2015.03.003
1874-3919/© 2015 Elsevier B.V. All rights reserved.

Corresponding author at: Institute of Biomedical Chemistry, 10 Pogodinskaya Str., Moscow 119121, Russia. Fax: +7 499 245 0857. E-mail address: [email protected] (S.A. Moshkovskii).

ARTICLE INFO

Article history: Received 21 December 2014; Accepted 7 March 2015

Keywords: Proteogenomics; Tandem mass spectrometry; Cancer cell line; Methionine; Isothreonine; Iodoacetamide

Abbreviations: HSC70, heat shock cognate 71 kDa protein; isoT, isothreonine; HCD, high energy collision dissociation; FDR, false discovery rate; IAA, iodoacetamide; PSM, peptide-spectrum match.

ABSTRACT

Searching deep proteome data for 9 NCI-60 cancer cell lines obtained earlier by Moghaddas Gholami et al. (Cell Reports, 2013) against a database from cancer genomes returned a variant tryptic peptide fragment 57-72 of molecular chaperone HSC70, in which methionine residue at 61 position is replaced by threonine, or isothreonine (homoserine), residue. However, no traces of the corresponding genetic alteration were found in the cell line genomes reported by Abaan et al. (Cancer Research, 2013). Studying on the background of this modification led us to conclude that a conversion of methionine into isothreonine resulted from iodoacetamide treatment of the probe during a sample preparation step. We found that up to 10% of methionine containing peptides experienced the above conversion for the datasets under study. The artifact was confirmed by model experiment with bovine albumin, where three of four methionine residues were partly converted to isothreonine by conventional iodoacetamide treatment. This experimental side reaction has to be taken into account when searching for genetically encoded peptide variants in the proteogenomics studies.

Biological significance

A lot of effort is currently put into proteogenomics of cancer. Studies detect non-synonymous cancer mutations at protein level by search of high-throughput LC–MS/MS data against customized genomic databases. In such studies, much attention is paid to


potential false positive identifications. Here we describe one possible cause of such false identifications: an artifact of sample preparation which mimics a methionine to threonine nucleic acid-encoded variant. The methionine to isothreonine conversion should be taken into consideration for correct interpretation of proteogenomic data.

1. Introduction

The recent introduction of mass spectrometers with high mass accuracy and high resolving power, such as the Orbitrap [1], to proteomics has dramatically improved the capabilities of HPLC-MS/MS-based bottom-up proteome analysis. These advances have enabled on the order of ten thousand tryptic peptides and several thousand protein groups to be identified with high confidence in a single experimental run [2–4]. In these experiments, the LC–MS/MS data are analyzed by a database search using an applicable search engine [5] or a variety of different search engines applied to the same datasets [6]. However, database search generates "the streetlight effect", i.e., only those peptides may be found which are predicted from the custom genomic database. Thus, a plethora of mass spectra used to be abandoned and not assigned to any peptide sequence. A number of alternative approaches to peptide identification based on de novo sequencing have also been developed [7–11]. A method for accounting for the missing information in the protein database using coding genome polymorphism data has been explored recently [12]. In general, the use of one or more custom DNA and/or mRNA sequence databases for LC–MS/MS data search is becoming a current trend in identifying the encoded variants of amino acid sequence that originate from single amino acid polymorphism or alternative splicing [13]. These areas of research, as well as the studies on genome re-annotation using proteomics data, are often referred to as proteogenomics [14]. Cancer proteogenomics is an especially important case of using customized sequence databases because of the large number of biologically relevant non-synonymous somatic mutations across tumor genomes [15–17].

One of the challenges associated with the large size of a database that combines several genomes is the growing level of false positive identifications, which may constitute a significant portion of proteogenomic results [18]. Moreover, a number of chemical and post-translational modifications mimic single amino acid variants. One of the most obvious examples is the spontaneous deamidation of asparagine and glutamine, which results in aspartic and glutamic acid residues, respectively [19]. These modifications, occurring both in vivo and in vitro, are indistinguishable from genetically encoded point mutations of Asn to Asp or Gln to Glu in the proteins.

In this work, we describe another chemical modification which may produce false positive identification of peptides in a shotgun proteogenomics study by imitating genetically encoded mutations. This is the in vitro conversion of methionine to isothreonine (homoserine), which mimics a methionine to threonine mutation. Analyzing publicly available data of NCI-60 cancer cell line proteomes [20], we found that the scale of this artifact can be large enough to affect the interpretation of the results of proteogenomics studies. In order to explore the above conversion and distinguish the MS/MS spectra of peptides containing isothreonine or threonine residues at the same locations in the sequences, synthetic peptides were obtained and their mass spectra were analyzed. Bovine serum albumin was used to confirm the methionine to isothreonine conversion during conventional sample preparation for shotgun proteome analysis.

2. Materials and methods

2.1. Modified database for search of shotgun proteome data

NCI-60 cell line data were downloaded from the web at http://wzw.tum.de/proteomics/nci60 [20]. A concatenated database, comprising the UniProt complete human proteome database (release from January 2013, 87,638 records) and the colon cancer database generated from available genomic data [21] (127,486 records), was used for peptide identification. The search of the data against the modified database was performed using the X!Tandem, Mascot and Andromeda search engines with the so-called separate FDR, or one-by-one, approach as described in [16,22].

2.2. Peptide and protein identification and quantification

In order to determine the relative abundances of peptides and proteins, intensity-based label-free quantification was used with a minimum of two unique peptides. All raw files were reprocessed using the MaxQuant package, version 1.5.2.8 [23], which has Andromeda as its database search engine. The MaxQuant analysis included an initial search with a precursor mass tolerance of 20 ppm, the results of which were used for mass recalibration. In the main Andromeda search, the precursor and fragment mass tolerances were set to 6 ppm and 20 ppm, respectively. The search included variable modifications such as methionine oxidation, deamidation of asparagine and glutamine, and N-terminal acetylation, as well as carbamidomethyl cysteine as a fixed modification. The minimal peptide length was set to seven amino acids and a maximum of two missed cleavages was allowed. The false discovery rate (FDR) was set to 1% for both peptide and protein identifications.
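The digestion constraints quoted above (tryptic cleavage, at most two missed cleavages, minimal length of seven residues) can be illustrated with a minimal sketch. It is not the Andromeda implementation: the cleavage rule (after K or R, but not before P) is an assumption, the function name `tryptic_peptides` is hypothetical, and the flanking residues of the toy sequence are invented; only the NQVAMNPTNTVFDAKR stretch comes from the text.

```python
import re

def tryptic_peptides(protein, max_missed=2, min_len=7):
    """Return (start, end, sequence) tuples with 1-based inclusive positions."""
    # Assumed rule: cleave C-terminally to K or R unless the next residue is P.
    cut_sites = [m.end() for m in re.finditer(r"[KR](?!P)", protein)]
    bounds = [0] + [c for c in cut_sites if c < len(protein)] + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        for missed in range(max_missed + 1):
            j = i + 1 + missed
            if j >= len(bounds):
                break
            seq = protein[bounds[i]:bounds[j]]
            if len(seq) >= min_len:
                peptides.append((bounds[i] + 1, bounds[j], seq))
    return peptides

# Hypothetical toy sequence: "GGK" and "LLGR" are invented flanks.
toy = "GGKNQVAMNPTNTVFDAKRLLGR"
result = tryptic_peptides(toy)
for start, end, seq in result:
    print(start, end, seq)
```

With this toy input, the fully cleaved peptide ending in K and the form extended by one arginine both appear in the list, which mirrors the pair of peptides with and without a missed cleavage discussed in the Results.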

Global analysis of M-to-(iso)T conversion in NCI-60 cell line deep proteomes was performed using the X!Tandem search engine [24], version 2012.10.01.1, and MPscore post-search validation tools [25]. The precursor and fragment mass tolerances for the X!Tandem search were 15 ppm and 0.03 Da, respectively. Carbamidomethylation of C (+57.021464) was applied as a fixed modification and N-terminal acetylation (+42.010565) was used as a potential modification. Enzyme specificity was set to "trypsin" with a maximum of 2 missed cleavages. The UniProt complete human proteome database was used for all X!Tandem searches. The total number of identified peptides and the number of methionine-containing peptides at 1% FDR at the PSM level were calculated. After that, the M-to-T conversion (−29.99281) was added to the search parameters as a potential modification. The new output files with identified peptides were filtered to leave only peptides containing the M-to-T conversion. At the final step, these peptides were filtered to 1% FDR for the PSMs.
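The mass shift entered for the methionine-to-threonine conversion can be checked from elemental compositions. The sketch below uses standard monoisotopic atomic masses; the dictionary and function names are illustrative only. Because isothreonine (homoserine) is an isomer of threonine, the same shift applies to both.

```python
# Monoisotopic atomic masses in Da (standard values).
ATOM_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}

# Elemental compositions of the amino acid residues (amino acid minus H2O).
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}
THR_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}  # same formula for isothreonine

def monoisotopic_mass(composition):
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(ATOM_MASS[element] * count for element, count in composition.items())

delta = monoisotopic_mass(THR_RESIDUE) - monoisotopic_mass(MET_RESIDUE)
print(f"Met -> Thr/isoThr mass shift: {delta:.5f} Da")
```

The computed shift agrees with the value of −29.99281 quoted for the search parameters.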

2.3. LC–MS/MS of synthetic peptides with HCD fragmentation

Purified peptides NQVATNP(isoT)NTVFDAK and NQVATNPTNTVFDAK (JPT Peptides, Berlin, Germany) were separately diluted to a concentration of 1 mg/L in a water/acetonitrile/formic acid solution (96:4:0.1, v/v). The peptides were injected into the UltiMate 3000 RSLC nano-system (Dionex, CA, USA).

The peptide solutions (the injection volume and the amount were 0.1 μL and 100 pg, respectively) were then loaded onto a μ-Precolumn (300 μm × 5 mm, 5 μm particle size, 100 Å pore size; Santa Clara, CA, USA) at a flow rate of 5 μL/min in 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 4% (v/v) acetonitrile. Then peptides were eluted from the analytical column, Acclaim C18 PepMap RSLC (150 mm × 75 μm, 2 μm particle size, 100 Å pore size; Santa Clara, CA, USA), at a flow rate of 0.3 μL/min in a gradient of mobile phases: 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in water (mobile phase A), and 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 80% (v/v) acetonitrile (mobile phase B). The gradient used at 0.3 μL/min was as follows: 0–3 min, 4% eluent B; 3–28 min, linear increase to 55% eluent B; 28–31 min, linear increase to 98% eluent B; 31–37 min, washing the column with 98% eluent B; 37–40 min, linear decrease to 4% eluent B; 40–47 min, equilibrating the column with 4% eluent B.
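The gradient program above is piecewise linear, so the mobile phase composition at any time point can be obtained by interpolation between its breakpoints. The following is a minimal sketch under that assumption; the names `GRADIENT` and `percent_b` are illustrative and do not correspond to any instrument software.

```python
# (time in min, % eluent B) breakpoints read from the gradient description.
GRADIENT = [(0, 4), (3, 4), (28, 55), (31, 98), (37, 98), (40, 4), (47, 4)]

def percent_b(t):
    """Linearly interpolate the percentage of eluent B at time t (minutes)."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-47 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)

# Composition at the edges of a 10-13 min acquisition window.
for minute in (10, 13):
    print(minute, round(percent_b(minute), 2))
```

Such a helper makes it easy to see which part of the gradient a scheduled acquisition window falls into.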

An Orbitrap Q-Exactive mass spectrometer (Thermo, CA, USA) was used for the experiments. The instrument was operated in positive ionization mode using a targeted-MS/MS mode with 140 K resolution for higher-energy collision dissociation (HCD) MS/MS scans. The AGC target value was set to 5 × 10^5 ions for a maximum ion trap injection time of 60 ms. The isolation window was set to 2 Th and the normalized collision energy was 23 for both peptides. The target mass of 810.40483 with a charge state of 2+ was acquired from 10 to 13 min using the scheduled targeted-MS/MS mode. The first fixed mass was set to 100 m/z.

2.4. Direct injection MS/MS of synthetic peptides with CID fragmentation

An Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a HESI ion source was operated in positive mode with a spray voltage of 4 kV. Sheath and auxiliary gas pressures were set to 10 and 2 arbitrary units, respectively.

Survey scan MS data (m/z from 100 to 1200) were acquired at a resolving power of 60 K (at m/z 400) at a direct infusion rate of 5 μL/min. For accurate mass measurements, the lock-mass option was activated (m/z 445.120025) for internal recalibration in real time. For all MS or MS/MS data acquired, 100 scans were collected and the injection time varied from 50 to 500 ms. For acquiring MS/MS spectra, the parent ion mass m/z 888 was isolated in a 2 Th window and the normalized collision energy of 31 was applied for both peptides.

2.5. Molecular modeling of peptides of interest to predict preferable conformers

Tripeptides ATN and A(isoT)N were considered to predict preferable conformers. Partial atomic charges were calculated by the semi-empirical quantum-mechanical PM3 method [26]. The RandomSearch software of the SYBYL 8.1 package was used to predict peptide conformers. Search parameters were as follows: rotation permitted for all covalent bonds of threonine and asparagine residues except peptide bonds; Tripos force field; number of iterations generating conformers, 1000; energy threshold, 3 kcal/mol; RMS, 0.2 Å; molecule minimization, 500 iterations.

2.6. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine modifications

Samples of bovine serum albumin (A9418, Sigma-Aldrich, USA) were prepared to test three different reduction and alkylation schemes. Each sample contained 50 μg of albumin dissolved in 50 μL of 50 mM tetraethylammonium chloride (TEABC, Sigma-Aldrich, USA) solution. Then, 5.5 μL of 5% (w/v) sodium deoxycholate in 50 mM Tris–HCl, pH 8.5, was added to each sample. After mixing, 1.2 μL of 0.5 M sodium dithiothreitol (DTT, Acros Organics, USA) in 50 mM TEABC was added to each sample, up to 10 mM DTT. Sample 1 was then incubated for 20 min at 56 °C, whereas samples 2 and 3 were maintained for 10 min at 95 °C.

After that, 5.5 μL of 0.5 M iodoacetamide (IAA, Sigma-Aldrich, USA) solution in 50 mM TEABC was added to each sample to a final concentration of 50 mM IAA. Samples 1 and 2 were then incubated for 30 min at room temperature in the dark. Sample 3 was incubated for 10 min at 95 °C, also in the dark.

The protein was then digested with trypsin (Trypsin Gold, Promega). The enzyme was added in a ratio of 1:50 (w/w) and the resultant mixture was incubated overnight at 37 °C. Enzymatic digestion was terminated by the addition of acetic acid (5%, w/v).

Samples were shaken for 30 min at 25 °C (500 rpm) and centrifuged at 9300 g for 15 min at 20 °C (Centrifuge 5415R, Eppendorf). Supernatants were added to the filter unit (30 kDa cutoff, Millipore, USA) and centrifuged at 13,400 g for 20 min at 20 °C. After that, 200 μL of 50% (v/v) formic acid was added to each filter unit and the samples were centrifuged at 13,400 g for 20 min at 20 °C to dryness. Samples were vacuum dried using an Eppendorf Concentrator Plus at 45 °C.

For LC–MS/MS analysis, dried samples were treated similarly to the synthetic peptides (see Section 2.3).

Mass spectra were analyzed using the MaxQuant [23] package, version 1.5.2.8, as described above with minor changes, i.e., the Andromeda search included methionine oxidation and methionine to threonine conversion as variable modifications, and FDR for peptide modifications was set to 1%. Each sample was studied in three technical repeats.


3. Results and discussion

3.1. Methionine-61 of HSC70 chaperone is changed in 8 of 9 NCI-60 cell lines

Out of all NCI-60 cancer cell lines analyzed in [20], deep proteome data using prefractionation and HCD fragmentation were obtained for nine cell lines: MCF7 breast cancer, SK-OV-3 ovarian cancer, U251 malignant glioblastoma, RXF-393 renal cancer, COLO205 colon cancer, NCI-H460 lung cancer, M14 malignant melanoma, CCRF-CEM acute lymphocytic leukemia and PC-3 prostate cancer cell lines. These deep proteome data were searched against a sequence database containing additional sequences derived from colon cancer exome data [21] (the information about non-synonymous SNPs was extracted from the Supplementary Material provided by the authors). For each SNP, an additional protein sequence was generated by introducing the corresponding amino acid substitution. The resulting colon cancer database contained 127,486 protein sequences. The search was performed with the Mascot and Andromeda search engines using the separate FDR, or one-by-one, approach as generally described in [16,22].
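The construction of variant sequences described above, one additional protein entry per non-synonymous SNP, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names, the entry naming scheme and the toy sequence are hypothetical.

```python
def apply_substitution(sequence, ref, pos, alt):
    """Replace residue `ref` at 1-based position `pos` with `alt`."""
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]

def variant_entries(protein_id, sequence, substitutions):
    """Yield one (entry name, variant sequence) pair per single substitution."""
    for ref, pos, alt in substitutions:
        # Hypothetical naming scheme: accession plus the substitution.
        yield f"{protein_id}_{ref}{pos}{alt}", apply_substitution(sequence, ref, pos, alt)

# Toy example: the sequence and the accession are invented for illustration.
toy_sequence = "GGKNQVAMNPTNTVFDAKR"
entries = dict(variant_entries("TOY1", toy_sequence, [("M", 8, "T")]))
for name, seq in entries.items():
    print(f">{name}\n{seq}")
```

Checking the reference residue before substituting guards against position offsets between the genomic annotation and the protein sequence in use.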

Note that in this part we used a protein sequence database which did not exactly correspond to the cell line genomes, not even to the COLO205 line, which originated from colon cancer. It was shown in [27] that there were not many common point mutations between different patients, despite the fact that these mutations were accumulated in common genes such as known proto-oncogenes and suppressors. Surprisingly, one of the mutant peptides was returned by the search engines with a high score in 8 out of 9 cell lines analyzed, i.e., in all but the renal cancer cell line. It was the peptide 57(NQVATNPTNTVFDAK)71 bearing an amino acid substitution of methionine to threonine in position 61 of the human heat shock cognate 71 kDa protein (HSC70, gene name HSPA8). Moreover, in 6 out of 9 cell lines, the above modified sequence was also identified in the form of the peptide 57(NQVATNPTNTVFDAKR)72 with a missed cleavage.

3.2. Conversion of methionine-61 of HSC70 chaperone is not encoded by nucleic acid

It is also interesting to note that the point mutation that may cause the M61T substitution in HSC70 is rare and was found in only one colon cancer patient of more than two hundred tumor patients studied in [20]. Soon after our finding, the exomes of all NCI-60 cell lines became available [27]. Contrary to our expectations, mutations involving replacement of the M residue by T in position 61 of HSC70 were not found in the exomes. The first possible explanation of M61T peptide identifications in the proteome is the contamination of the cell cultures by a foreign protein, e.g., derived from the media or an external cell line such as HeLa. However, after performing a protein BLAST search, we found no identical peptide in any living organism. Moreover, no human genome database contained any traces of the corresponding coding SNP, except the above-mentioned colon cancer exome in the COSMIC database [28]. RNA editing may be another explanation for the M61T replacement. In this case, the AUG codon should be converted to ACG, and such editing is not described, as opposed to C-to-U deamination [29]. Thus, we are left with the only hypothesis: the conversion of methionine-61 of the HSC70 chaperone is not encoded genetically but occurs either post-translationally or during the sample preparation before LC–MS/MS analysis.

3.3. Methionine to isothreonine conversion during proteome sample processing

The non-genetic origin of the observed methionine conversion suggests that methionine may be converted to isothreonine (homoserine) rather than threonine. Such a reaction is known from oxidative treatment of protein methionine residues by cyanogen bromide. In some cases, instead of peptide bond cleavage and formation of a C-terminal homoserine lactone, the methionine residue may be converted to isothreonine [30]. In the search, this transition is considered as a variable modification after cyanogen bromide treatment (see, e.g., www.matrixscience.com/help/enzyme_help.html for the Mascot search engine).

Side reactions during cysteine alkylation with iodoacetamide, used routinely in sample preparation for proteome analysis, may also be responsible for in vitro methionine modifications. As was shown before, methionine alkylation by iodoacetamide with formation of S-carbamidomethyl methionine and subsequent neutral loss of 2-(methylthio)acetamide during CID may mimic a neutral loss of phosphoric acid in phosphopeptides which contain phosphoserine (pSer) or phosphothreonine (pThr) [31].

Both iodoacetamide and iodoacetate may alkylate methionine at any pH [32]. The resulting sulfonium salts, S-carbamidomethyl methionine or S-carboxymethyl methionine, go to homoserine and homoserine lactone at 100 °C and approximately neutral pH [33]. Similar conditions were used in sample preparation before electrophoresis for the cell line proteome datasets of interest [20]. Thus, some degree of methionine-to-isothreonine conversion should be expected.

3.4. Synthetic peptide analysis supporting methionine-61 to isothreonine amino acid conversion in HSC70

To reveal the origin of methionine conversion in the peptide of interest, two synthetic analogs, 57(T61)71 and 57(isoT61)71 from HSC70, were analyzed by tandem mass spectrometry employing HCD fragmentation. The purpose of this analysis was to distinguish between the two peptides using their MS/MS profiles and to compare them with the corresponding MS/MS profiles observed in the cell lines.

The mass spectra attributed to the peptide of interest from the cell line shotgun data and the mass spectra from both synthetic peptides were remarkably similar, as shown in Fig. 1. The only significant difference was observed in the intensity of the peak at m/z 628.306. In three series of 5 LC–MS/MS runs, the intensity of this peak in the spectra of the threonine-containing peptide was relatively low, representing about 3% of the maximal peak intensity (see Figs. 1, 2 and Supplementary Tables 1, 2). In the spectra of the isothreonine-containing peptide, the same fragment yields a significantly more intense peak, about 30% of the maximal peak. The relative intensity of the corresponding peak in the NCI-60 proteome mass spectra was found to be about 18% (Fig. 2).

Fig. 1 – Exemplary Orbitrap HCD spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein: (A) candidate peptide from shotgun cancer cell line data [20]; (B) synthetic NQVA(isoT)NPTNTVFDAK peptide; (C) synthetic NQVATNPTNTVFDAK peptide. All spectra used in this work are shown in Supplementary Tables 1 and 2.

Mass spectra of the synthetic peptides obtained by direct injection into the Orbitrap MS with CID fragmentation were also compared, and a remarkable similarity to their HCD profiles was observed (Supplementary Fig. 1), with a higher intensity of the peak at m/z 628.306 in the spectra of the isothreonine-containing peptide.

The peak at m/z 628.306 represents a b6-ion with the sequence NQVA(T/isoT)N+. Interestingly, the corresponding y10-ion at m/z 992.505, PTNTVFDAKR+, has similar intensity in all three datasets (Fig. 2). These data show that, under the conditions used, the NQVATN+ ion is significantly less stable than its isothreonine-containing isomer. This difference between the isomers may be explained by a difference in hydrogen bond networks in the threonine and isothreonine peptides. It has been shown elsewhere that hydrogen bonding influences the fragmentation process in ion traps, such as by CID, and can even be used to study such bonding in molecules [34]. We further used molecular modeling of peptides to predict the difference in preferable hydrogen bonding (Section 3.5).
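The b6-ion assignment and the targeted precursor mass can be cross-checked with a short calculation from standard monoisotopic residue masses. The sketch below is illustrative: the function names are hypothetical, and the letter T stands for either threonine or isothreonine, since the two isomers have identical elemental composition.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "F": 147.06841, "K": 128.09496,
    "M": 131.04049, "N": 114.04293, "P": 97.05276, "Q": 128.05858,
    "T": 101.04768, "V": 99.06841,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of a proton

def b_ion_mz(peptide, n, charge=1):
    """m/z of the b-ion formed by the first n residues."""
    return (sum(RESIDUE_MASS[aa] for aa in peptide[:n]) + charge * PROTON) / charge

def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge

variant = "NQVATNPTNTVFDAK"    # T or isoT at position 61 of HSC70
wild_type = "NQVAMNPTNTVFDAK"

print("variant b6:", round(b_ion_mz(variant, 6), 3))
print("variant [M+2H]2+:", round(precursor_mz(variant, 2), 4))
print("wild-type b6:", round(b_ion_mz(wild_type, 6), 3))
```

The calculated b6 value lies within a few thousandths of the m/z 628.306 quoted above, and the doubly charged precursor agrees with the target mass of 810.40483. Because both isomers give the same m/z values, only the relative peak intensities can discriminate between them.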

Thus, the spectra of the NCI-60 proteome data are more similar to those of the isothreonine-containing peptide, thereby supporting the hypothesis of methionine conversion to this amino acid. However, this evidence is not enough for an unambiguous decision in favor of methionine to isothreonine conversion in shotgun data. This observation should be supported by experimentation with model proteins (see Section 3.7 below).

Fig. 2 – Box-and-whiskers diagrams of the relative intensity of the 628.306 Da b6-ion peak (left panel) and the 992.505 Da y10-ion peak (right panel) in Orbitrap spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein from shotgun cancer cell line data [20] (Shotgun) and of the synthetic NQVA(isoT)NPTNTVFDAK (IsoThr) and NQVATNPTNTVFDAK (Thr) peptides. For shotgun experiments, 12 exemplary spectra were taken from 8 cancer cell line deep proteome data [20]. For each synthetic peptide, 3 analytical series were made, each containing 5 LC–MS/MS runs of the same (N = 15). Relative intensity is calculated as a percentage of the maximum intensity in each spectrum (Supplementary Tables 1 and 2).

3.5. Molecular modeling to predict differential hydrogen bond networks likely responsible for differential stability of T/isoT peptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimate differential hydrogen bonding of the hydroxyl moieties of threonine or isothreonine residues. For the ATN tripeptide, 210 stable conformers were generated, which have a range of internal energy of −8.2 to −3.8 kcal/mol. According to the calculation, the more preferable conformers of this peptide have hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the neighboring alanine residue (Fig. 3A). In total, there were 70 conformers with such hydrogen bonding among all 210 structures. In the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such structures were 64% and 55%, respectively.

For the A(isoT)N tripeptide, 179 stable conformers were generated. Overall, the internal energy calculated for these structures was higher than for the ATN conformers, −2.79 to −0.02 kcal/mol. Preferable conformations of this peptide contained hydrogen bonding between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (Fig. 3B). Such conformations reached 36% of the total number of conformers, fewer than the structures where the isothreonine hydroxyl moiety had no hydrogen bonding predicted (51%). At the same time, the formation of such an internal hydrogen bond in the isothreonine residue was energetically beneficial. In the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such conformers increased to 67% and 89%, respectively.

Among the predicted structures of both peptides, only one state has similar hydrogen bonding of the hydroxyl moiety, namely its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note the hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A), and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).

Thus, the calculations have shown that, in the model tripeptides, the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that such a difference causes differential MS/MS spectra after CID or HCD fragmentation. However, to define generic principles of discrimination between threonine- and isothreonine-containing peptides using MS/MS, further experimentation is needed, which is beyond the scope of the present work.
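The conformer statistics reported in this section, i.e. the share of structures with a given hydrogen bond among all conformers and among the lowest-energy subsets, amount to a simple ranking and counting procedure. A minimal sketch is given below; it assumes that each conformer is reduced to a pair of internal energy and a yes/no flag for the hydrogen bond, and the example values are invented for illustration, not taken from the SYBYL output.

```python
import math

def hbond_fraction(conformers, lowest_fraction=1.0):
    """Share of conformers with the hydrogen bond among the lowest-energy subset.

    conformers: list of (energy_kcal_per_mol, has_hbond) tuples.
    lowest_fraction: 0.25 keeps the 25% of conformers with minimal energy.
    """
    ranked = sorted(conformers, key=lambda c: c[0])
    n = max(1, math.ceil(len(ranked) * lowest_fraction))
    subset = ranked[:n]
    return sum(1 for _, has_hbond in subset if has_hbond) / len(subset)

# Invented example data: (energy, hydrogen bond present?)
example = [(-8.0, True), (-7.5, True), (-7.0, False), (-6.0, True),
           (-5.0, False), (-4.5, False), (-4.0, False), (-3.9, False)]

for fraction in (1.0, 0.5, 0.25):
    print(fraction, hbond_fraction(example, fraction))
```

An enrichment of the bonded state in the low-energy subsets, as in this toy data set, is the pattern that indicates an energetically favorable hydrogen bond.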

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In the further analysis, we studied deep proteome data for the 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides (the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage) were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra has shown that the abundance of the modified peptide was considerably lower than the abundance of the wild-type peptide. The average intensity of the modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1 to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72), and their wild-type counterparts, 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and its +R72 peptide (57(M61)72), from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure that was routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35]. Using various schemes, we aimed to estimate the possible role of heating during methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

Generally, the BSA sequence contains five methionine residues. Of them, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue and cleaved as a part of a dipeptide by trypsin. Met-208 belongs to a pentapeptide cleaved by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we could detect the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, for all of them methionine-to-isothreonine alternative peptides were detected (spectra exemplified in Fig. 5). Although without peptide standards we cannot distinguish isothreonine from threonine residues in them, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, when it concerns 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, the level being comparable to the ratio of methionine oxidation. Notably, of the three considered methionine residues of BSA, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protects it from attack by a mechanism yet unknown.

Overall, the experiment with BSA treatment was in good correspondence with the assumptions about methionine-to-isothreonine conversion stated above.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGD(isoT)ADCCEK)117 with parental mass 1447.523 Da.

3.8. Global analysis of M > (iso)T conversion in deep proteome data for NCI-60 cell lines

M > (iso)T conversions were studied in deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of peptides with M > (iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if they had a threonine version of the sequence among wild-type peptides in the database. The fraction of these peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23].

Methionine   Peptide sequence   Total number of PSMs for       Methionine-oxidized peptides,   Methionine-to-isothreonine converted
position                        peptides of interest           PSMs (% total)                  peptides, PSMs (% total)
in BSA                          Sample 1  Sample 2  Sample 3   Sample 1  Sample 2  Sample 3    Sample 1  Sample 2  Sample 3
M111         ETYGDMADCCEK       50        46        53         8 (16)    14 (30)   11 (21)     4 (8)     4 (9)     10 (19)
M469         MPCTEDYLSLILNR     159       183       268        25 (16)   34 (19)   52 (19)     6 (4)     6 (3)     98 (37)
M571         TVMENFVAFVDK       124       113       95         6 (5)     3 (3)     5 (5)       3 (2)     3 (3)     5 (5)

Completely cleaved tryptic peptides were detected predominantly. All data include corresponding peptides with trypsin missed cleavages.
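The percentages in Table 1 follow directly from the PSM counts. The sketch below recomputes them; the dictionary layout and function names are illustrative choices, and the counts are transcribed from Table 1 as (total, oxidized, Met-to-isoThr) for samples 1 to 3.

```python
# PSM counts transcribed from Table 1.
# Each tuple is (total PSMs, Met-oxidized PSMs, Met-to-isoThr PSMs)
# for samples 1, 2 and 3, in that order.
TABLE1 = {
    "M111": [(50, 8, 4), (46, 14, 4), (53, 11, 10)],
    "M469": [(159, 25, 6), (183, 34, 6), (268, 52, 98)],
    "M571": [(124, 6, 3), (113, 3, 3), (95, 5, 5)],
}


def percent(part, total):
    """Share of `part` in `total`, rounded to a whole percent."""
    return round(100 * part / total)


def oxidation_percentages(table):
    """Percent of PSMs carrying oxidized methionine, per site and sample."""
    return {site: [percent(ox, total) for total, ox, _conv in rows]
            for site, rows in table.items()}


def conversion_percentages(table):
    """Percent of PSMs carrying Met-to-isoThr conversion, per site and sample."""
    return {site: [percent(conv, total) for total, _ox, conv in rows]
            for site, rows in table.items()}


if __name__ == "__main__":
    for site, values in conversion_percentages(TABLE1).items():
        print(site, values)
```

With these counts, the largest converted share is found for M469 in sample 3, the sample heated during IAA treatment.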


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using the database from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70 in which methionine at position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the cell lines' own genomes, and strong evidence was reported that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed to the isothreonine residue produced from methionine in the cell lines.

After examination of the sample preparation protocol used in [20], we propose that the alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study: 2 to 3% of wild-type sequences were modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)Thr as a variable modification demonstrated that up to 10% of methionine-containing peptides identified from the data had their corresponding Met to isoThr variant. We can now claim these variants to be isothreonine-containing due to the absence of genetic coding for them in the published genomes.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI-60 deep proteome cell lines [20] at 1% PSM FDR.

Cell line        Identified   Identified peptides   Identified peptides   Both Met > (iso)Thr and
                 peptides     with Met              with Met > Thr        Met sequences identified
COLON_COLO205    52479        6057                  289                   246
RENAL_RXF393     61878        9741                  119                   88
PROSTATE_PC3     50203        5213                  287                   241
NSCLC_H460       39792        2862                  247                   201
CNS_U251         52353        6083                  231                   184
OVAR_SKOV3       48110        4000                  214                   171
MELAN_M14        55169        4924                  347                   289
BREAST_MCF7      53451        4647                  331                   279
LEUK_CCRFCEM     38894        1348                  255                   144

Because the evidence of the Met to isoThr conversion in the open data produced by [20] was indirect, we performed an experiment that mimicked conventional conditions for shotgun analysis using bovine albumin as a model protein. This experiment illustrated the Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many somatically mutated proteins as possible, since they can affect molecular pathways and support the growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation which mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.


Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

References

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.


potential false positive identifications. Here we describe one possible cause of such false identifications: an artifact of sample preparation which mimics a methionine to threonine nucleic acid-encoded variant. The methionine to isothreonine conversion should be taken into consideration for correct interpretation of proteogenomic data.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

The recent introduction of mass spectrometers with high mass accuracy and high resolving power, such as the Orbitrap [1], to proteomics has dramatically improved the capabilities of HPLC-MS/MS-based bottom-up proteome analysis. These advances enabled more than ten thousand tryptic peptides and several thousand protein groups to be identified with high confidence in a single experimental run [2–4]. In these experiments, the LC–MS/MS data are analyzed by a database search using an applicable search engine [5] or a variety of different search engines applied to the same datasets [6]. However, database search generates “the streetlight effect”, i.e., only those peptides may be found which are predicted from the custom genomic database. Thus, a plethora of mass spectra used to be abandoned and not assigned to any peptide sequence. A number of alternative approaches to peptide identification based on de novo sequencing have also been developed [7–11]. A method for accounting for the missing information in the protein database using coding genome polymorphism data has been explored recently [12]. In general, the use of one or more custom DNA and/or mRNA sequence databases for LC–MS/MS data search is becoming a current trend in identifying the encoded variants of amino acid sequence that originate from single amino acid polymorphism or alternative splicing [13]. These areas of research, as well as the studies on genome re-annotation using proteomics data, are often referred to as proteogenomics [14]. Cancer proteogenomics is an especially important case of using customized sequence databases because of the large number of biologically relevant non-synonymous somatic mutations across tumor genomes [15–17].

One of the challenges associated with the large size of a database that combines several genomes is the growing level of false positive identifications, which may constitute a significant portion of proteogenomic results [18]. Moreover, a number of chemical and post-translational modifications mimic single amino acid variants. One of the most obvious examples is spontaneous asparagine and glutamine deamidation, which results in aspartic and glutamic acid residues, respectively [19]. These modifications, occurring both in vivo and in vitro, are indistinguishable from genetically encoded point mutations of Asn to Asp or Gln to Glu in the proteins.

In this work, we describe yet another chemical modification which may produce false positive identifications of peptides in shotgun proteogenomics studies by imitating genetically encoded mutations. This is the in vitro conversion of methionine to isothreonine (homoserine), which mimics a methionine to threonine mutation. Analyzing publicly available data of NCI-60 cancer cell line proteomes [20], we found that the scale of this artifact can be significant enough to affect the interpretation of the results of proteogenomics studies. In order to explore the above conversion and distinguish MS/MS spectra of peptides containing isothreonine or threonine residues at the same locations in the sequences, synthetic peptides were obtained and their mass spectra were analyzed. Bovine serum albumin was used to confirm methionine to isothreonine conversion during conventional sample preparation for shotgun proteome analysis.

2. Materials and methods

2.1. Modified database for search of shotgun proteome data

NCI-60 cell line data were downloaded from the web at http://wzw.tum.de/proteomics/nci60 [20]. The concatenated database, comprising the UniProt complete human proteome database (release from January 2013; 87,638 records) and the colon cancer database generated from available genomic data [21] (127,486 records), was used for peptide identification. The search of the data against the modified database was performed using the X!Tandem, Mascot and Andromeda search engines, using the so-called separate FDR, or one-by-one, approach as described in [16,22].

2.2. Peptide and protein identification and quantification

In order to determine the relative abundances of peptides and proteins, intensity-based label-free quantification was used with a minimum of two unique peptides. All raw files were reprocessed using the MaxQuant package, version 1.5.2.8 [23], which has Andromeda as a database search engine. The MaxQuant analysis included an initial search with a precursor mass tolerance of 20 ppm, the results of which were used for mass recalibration. In the main Andromeda search, the precursor and fragment mass tolerances were set to 6 ppm and 20 ppm, respectively. The search included variable modifications such as methionine oxidation, deamidation of asparagine and glutamine and N-terminal acetylation, as well as carbamidomethyl cysteine as a fixed modification. The minimal peptide length was set to seven amino acids, and a maximum of two missed cleavages was allowed. The false discovery rate (FDR) was set to 1% for both peptide and protein identifications.

Global analysis of M > (iso)T conversion in NCI-60 cell line deep proteomes was performed using the X!Tandem search engine [24], version 2012.10.01.1, and MPscore post-search validation tools [25]. The precursor and fragment mass tolerances for the X!Tandem search were 15 ppm and 0.03 Da, respectively. Carbamidomethylation of C (+57.021464) was applied as a fixed modification, and N-terminal acetylation (+42.010565) was used as a potential modification. Enzyme specificity was set to “trypsin” with a maximum of 2 missed cleavages. The UniProt complete human proteome database was used for all X!Tandem searches. The total number of identified peptides and the number of methionine-containing peptides at 1% FDR at the PSM level were calculated. After that, M > T conversion (−29.99281) was added to the search parameters as a potential modification. The new output files with identified peptides were filtered to leave only peptides containing the M > T conversion. At the final step, these peptides were filtered to 1% FDR for the PSMs.
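The mass shift used for the M > T search parameter, read here as −29.99281 Da, can be checked from the elemental compositions of the two residues. The sketch below is an illustration only: the monoisotopic atomic masses are standard values, the names are arbitrary, and threonine and isothreonine (homoserine) are treated as isomers with the same residue composition, C4H7NO2.

```python
# Monoisotopic atomic masses (Da), standard values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
}

# Residue compositions (amino acid minus one water).
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}
# Threonine and isothreonine (homoserine) are isomers: same composition.
THR_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


def met_to_thr_shift():
    """Mass difference (Da) when a Met residue is replaced by Thr or isoThr."""
    return monoisotopic_mass(THR_RESIDUE) - monoisotopic_mass(MET_RESIDUE)


if __name__ == "__main__":
    print(f"Met -> (iso)Thr mass shift: {met_to_thr_shift():.5f} Da")
```

Because the two isomers share one composition, a database search with this mass shift cannot tell a genetically encoded threonine from chemically produced isothreonine.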

2.3. LC–MS/MS of synthetic peptides with HCD fragmentation

Purified peptides NQVA(isoT)NPTNTVFDAK and NQVATNPTNTVFDAK (JPT Peptides, Berlin, Germany) were separately diluted to a concentration of 1 mg/L in water/acetonitrile/formic acid solution (96:4:0.1, v/v). The peptides were injected into the UltiMate 3000 RSLC nano-system (Dionex, CA, USA).

The peptide solutions (the injection volume and the amount were 0.1 μL and 100 pg, respectively) were then loaded onto a μ-Precolumn (300 μm × 5 mm, 5 μm particle size, 100 Å pore size) column (Santa Clara, CA, USA) at a 5 μL/min flow rate in 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 4% (v/v) acetonitrile. Then peptides were eluted from the analytical column, Acclaim C18 PepMap RSLC (150 mm × 75 μm, 2 μm particle size, 100 Å pore size; Santa Clara, CA, USA), at a 0.3 μL/min flow rate in a gradient of mobile phases: 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in water (mobile phase A) and 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 80% (v/v) acetonitrile (mobile phase B). The gradient used at 0.3 μL/min was as follows: 0–3 min, 4% eluent B; 3–28 min, linear increase to 55% eluent B; 28–31 min, linear increase to 98% eluent B; 31–37 min, washing the column with 98% eluent B; 37–40 min, linear decrease to 4% eluent B; 40–47 min, equilibrating the column with 4% eluent B.
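The gradient program can be written as a list of breakpoints with linear interpolation between them. The following is a minimal sketch, assuming the gradient values read as 4, 55 and 98% eluent B at the times given above; the names GRADIENT and percent_b are illustrative and do not come from any instrument software.

```python
# Gradient breakpoints as (time in min, % eluent B), assuming the
# percentages read as 4, 55 and 98% B.
GRADIENT = [
    (0.0, 4.0),    # start
    (3.0, 4.0),    # end of initial hold
    (28.0, 55.0),  # end of main linear increase
    (31.0, 98.0),  # end of steep increase
    (37.0, 98.0),  # end of column wash
    (40.0, 4.0),   # end of linear decrease
    (47.0, 4.0),   # end of equilibration
]


def percent_b(time_min):
    """Return % eluent B at a given time by linear interpolation."""
    if not GRADIENT[0][0] <= time_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the full range")


if __name__ == "__main__":
    for t in (0, 10, 13, 28, 34, 47):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```

Such a table makes it easy to look up the mobile phase composition at any retention time of interest.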

An Orbitrap Q-Exactive mass spectrometer (Thermo, CA, USA) was used for the experiments. The instrument was operated in positive ionization mode using a targeted-MS/MS mode with 140 K resolution for higher-energy collision dissociation (HCD) MS/MS scans. The AGC target value was set to 5 × 10^5 ions for a maximum ion trap injection time of 60 ms. The isolation window was set to 2 Th, and the normalized collision energy was 23 for both peptides. The target mass of 810.40483 with a charge state of 2+ was acquired from 10 to 13 min using the scheduled targeted-MS/MS mode. The first fixed mass was set to 100 m/z.

2.4. Direct injection MS/MS of synthetic peptides with CID fragmentation

An Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a HESI ion source was operated in positive mode with a spray voltage of 4 kV. Sheath and auxiliary gas pressures were set to 10 and 2 arbitrary units, respectively.

Survey scan MS data (m/z from 100 to 1200) were acquired at a resolving power of 60 K (at m/z 400) at a direct infusion rate of 5 μL/min. For accurate mass measurements, the lock-mass option was activated (m/z 445.120025) for internal recalibration in real time. For all MS or MS/MS data acquired, 100 scans were collected, and the injection time varied from 50 to 500 ms. For acquiring MS/MS spectra, the parent ion mass m/z 888 was isolated in a 2 Th window, and a normalized collision energy of 31 was applied for both peptides.

2.5. Molecular modeling of peptides of interest to predict preferable conformers

Tripeptides ATN and A(isoT)N were considered to predict preferable conformers. Partial atomic charges were calculated by the semi-empirical quantum-mechanical PM3 method [26]. The RandomSearch software of the SYBYL 8.1 package was used to predict peptide conformers. Search parameters were as follows: rotation permitted for all covalent bonds of threonine and asparagine residues except peptide bonds; Tripos force field; number of iterations generating conformers, 1000; energy threshold, 3 kcal/mol; RMS, 0.2 Å; molecule minimization, 500 iterations.

2.6. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine modifications

Samples of bovine serum albumin (A9418, Sigma-Aldrich, USA) were prepared to test three different reduction and alkylation schemes. Each sample contained 50 μg of albumin dissolved in 50 μL of 50 mM tetraethylammonium chloride (TEABC, Sigma-Aldrich, USA) solution. Then 5.5 μL of 5% (w/v) sodium deoxycholate in 50 mM Tris–HCl, pH 8.5, was added to each sample. After mixing, 1.2 μL of 0.5 M sodium dithiothreitol (DTT, Acros Organics, USA) in 50 mM TEABC was added to each sample, up to 10 mM DTT. Sample 1 was then incubated for 20 min at 56 °C, whereas samples 2 and 3 were maintained for 10 min at 95 °C.

After that, 5.5 μL of 0.5 M iodoacetamide (IAA, Sigma-Aldrich, USA) solution in 50 mM TEABC was added to each sample to a final concentration of 50 mM IAA. Samples 1 and 2 were then incubated for 30 min at room temperature in the dark. Sample 3 was incubated for 10 min at 95 °C, also in the dark.
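The nominal reagent concentrations can be checked with a simple dilution calculation. This sketch assumes the volumes read as 50 μL of BSA solution, 5.5 μL of deoxycholate, 1.2 μL of DTT stock and 5.5 μL of IAA stock, with both stocks at 0.5 M; these readings and the function name are assumptions made for illustration.

```python
def final_concentration_mM(stock_mM, added_uL, volume_before_uL):
    """Concentration after adding `added_uL` of stock to an existing volume."""
    return stock_mM * added_uL / (volume_before_uL + added_uL)


def bsa_prep_concentrations():
    """Return (DTT mM, IAA mM) at the moment each reagent is added.

    Assumed volumes: 50 uL BSA solution, 5.5 uL deoxycholate,
    1.2 uL of 0.5 M DTT, then 5.5 uL of 0.5 M IAA.
    """
    volume = 50.0 + 5.5                      # BSA solution + deoxycholate
    dtt = final_concentration_mM(500.0, 1.2, volume)
    volume += 1.2                            # volume after DTT addition
    iaa = final_concentration_mM(500.0, 5.5, volume)
    return dtt, iaa


if __name__ == "__main__":
    dtt_mM, iaa_mM = bsa_prep_concentrations()
    print(f"DTT: {dtt_mM:.1f} mM, IAA: {iaa_mM:.1f} mM")
```

Under these assumptions the calculated values come out close to the nominal 10 mM DTT and 50 mM IAA, which supports reading the added volumes as a few microliters rather than tens of microliters.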

The protein was then digested with trypsin (Trypsin Gold, Promega). The enzyme was added in the ratio 1:50 (w/w), and the resultant mixture was incubated overnight at 37 °C. Enzymatic digestion was terminated by the addition of acetic acid (5% w/v).

Samples were shaken for 30 min at 25 °C (500 rpm) and centrifuged at 9300 g for 15 min at 20 °C (Centrifuge 5415R, Eppendorf). Supernatants were added to the filter unit (30 kDa cutoff, Millipore, USA) and centrifuged at 13,400 g for 20 min at 20 °C. After that, 200 μL of 50% (v/v) formic acid was added to each filter unit, and samples were centrifuged at 13,400 g for 20 min at 20 °C to dryness. Samples were vacuum dried using an Eppendorf Concentrator Plus at 45 °C.

For LC–MS/MS analysis, dried samples were treated similarly to synthetic peptides (see Section 2.3).

Mass spectra were analyzed using the MaxQuant [23] package 1.5.2.8 as described above with minor changes, i.e., the Andromeda search included methionine oxidation and methionine to threonine conversion as variable modifications, and FDR for peptide modifications was set to 1%. Each sample was studied in three technical repeats.


3. Results and discussion

3.1. Methionine-61 of HSC70 chaperone is changed in 8 of 9 NCI-60 cell lines

Out of all NCI-60 cancer cell lines analyzed in [20], the deep proteome data using prefractionation and HCD fragmentation were obtained for nine cell lines, including MCF7 breast cancer, SK-OV-3 ovarian cancer, U251 malignant glioblastoma, RXF-393 renal cancer, COLO205 colon cancer, NCI-H460 lung cancer, M14 malignant melanoma, CCRF-CEM acute lymphocytic leukemia and PC-3 prostate cancer cell lines. These deep proteome data were searched against a sequence database containing additional sequences derived from colon cancer exome data [21] (the information about non-synonymous SNPs was extracted from the Supplementary Material provided by the authors). For each SNP, an additional protein sequence was generated by introducing the corresponding amino acid substitution. The resulting colon cancer database contained 127,486 protein sequences. The search was performed with the Mascot and Andromeda search engines using the separate FDR, or one-by-one, approach as generally described in [16,22].
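Generating a variant sequence for a non-synonymous SNP amounts to a single-residue substitution at a known position. The function below is a hypothetical sketch of that step, not the authors' actual pipeline; it is shown on the HSC70 peptide 57–71, where the `start` argument maps protein numbering onto the peptide.

```python
def apply_substitution(sequence, position, ref, alt, start=1):
    """Return `sequence` with residue `ref` at `position` replaced by `alt`.

    `position` uses protein numbering; `start` is the protein position of
    the first residue of `sequence` (1 for a full-length protein).
    """
    index = position - start
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    if sequence[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + alt + sequence[index + 1:]


if __name__ == "__main__":
    wild_type = "NQVAMNPTNTVFDAK"  # HSC70 residues 57-71
    print(apply_substitution(wild_type, 61, "M", "T", start=57))
```

Checking the reference residue before substitution guards against off-by-one errors in position numbering, which would otherwise silently produce wrong database entries.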

Note that in this part we used a protein sequence database which did not exactly correspond to the cell line genomes, even to the COLO205 line, which originated from colon cancer. It was shown in [27] that there were not many common point mutations between different patients, despite the fact that these mutations were accumulated in common genes such as known proto-oncogenes and suppressors. Surprisingly, one of the mutant peptides was returned by the search engines with a high score in 8 out of 9 cell lines analyzed, in all but the renal cancer cell line. It was the peptide 57(NQVATNPTNTVFDAK)71 bearing an amino acid substitution of methionine to threonine at position 61 of the human heat shock cognate 71 kDa protein (HSC70, gene name HSPA8). Moreover, in 6 out of 9 cell lines the above modified sequence was also identified in the form of the peptide 57(NQVATNPTNTVFDAKR)72 with a missed cleavage.

3.2. Conversion of methionine-61 of HSC70 chaperone is not encoded by nucleic acid

It is also interesting to note that the point mutation that may cause the M61T substitution in HSC70 is rare and was found in only one colon cancer patient of more than two hundred tumor patients studied in [21]. Soon after our finding, the exomes of all NCI-60 cell lines became available [27]. Contrary to our expectations, mutations involving replacement of the M residue by T at position 61 of HSC70 were not found in the exomes. The first possible explanation of M61T peptide identifications in the proteome is contamination of the cell cultures by a foreign protein, e.g., derived from the media or an external cell line such as HeLa. However, after performing a protein BLAST search, we found no identical peptide in any living organism. Moreover, no human genome database contained any traces of the corresponding coding SNP except the above-mentioned colon cancer exome in the COSMIC database [28]. RNA editing may be another explanation for the M61T replacement. In this case, the AUG codon should be converted to ACG, and such U-to-C editing has not been described, as opposed to C-to-U deamination [29]. Thus, we are left with the only hypothesis: the conversion of methionine-61 of the HSC70 chaperone is not encoded genetically, but occurs either post-translationally or during the sample preparation before LC–MS/MS analysis.

Fig. 1 – Exemplary Orbitrap HCD spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein: (A) candidate peptide from shotgun cancer cell line data [20], (B) synthetic NQVA(isoT)NPTNTVFDAK peptide and (C) synthetic NQVATNPTNTVFDAK peptide. All spectra used in this work are shown in Supplementary Tables 1 and 2.

3.3. Methionine to isothreonine conversion during proteome sample processing

The non-genetic origin of the observed methionine conversion allows us to propose that methionine may be converted to isothreonine (homoserine) instead of threonine. Such a reaction is known from oxidative treatment of protein methionine residues by cyanogen bromide. In some cases, instead of peptide bond cleavage and formation of C-terminal homoserine lactone, the methionine residue may be converted to isothreonine [30]. In the search, this transition is considered as a variable modification after cyanogen bromide treatment (see, e.g., www.matrixscience.com/help/enzyme_help.html for the Mascot search engine).

Side reactions during cysteine alkylation with iodoacetamide, used routinely in sample preparation for proteome analysis, may also be responsible for in vitro methionine modifications. As was shown before, methionine alkylation by iodoacetamide with the formation of S-carbamidomethyl methionine and subsequent neutral loss of 2-(methylthio)acetamide during CID may mimic a neutral loss of phosphoric acid in phosphopeptides which contain phosphoserine (pSer) or phosphothreonine (pThr) [31].

Both iodoacetamide and iodoacetate may alkylate methionine at any pH [32]. The resulting sulfonium salts, S-carbamidomethyl methionine or S-carboxymethyl methionine, are converted to homoserine and homoserine lactone at 100 °C and approximately neutral pH [33]. Similar conditions were used in sample preparation before electrophoresis for the cell line proteome datasets of interest [20]. Thus, some degree of methionine-to-isothreonine conversion should be expected.

3.4. Synthetic peptide analysis supporting methionine-61 to isothreonine amino acid conversion in HSC70

To reveal the origin of methionine conversion in the peptide of interest, two synthetic analogs, 57(T61)71 and 57(isoT61)71 from HSC70, were analyzed by tandem mass spectrometry employing HCD fragmentation. The purpose of this analysis was to distinguish between the two peptides using their MS/MS profiles and to compare them to the corresponding MS/MS profiles observed in the cell lines.

The mass spectra attributed to the peptide of interest from the cell line shotgun data and the mass spectra from both synthetic peptides were remarkably similar, as shown in Fig. 1. The only significant difference was observed in the intensity of the peak at m/z 628.306. In three series of 5 LC–MS/MS runs, the intensity of this peak in the spectra of the threonine-containing peptide was relatively low, representing about 3% of the maximal peak intensity (see Figs. 1 and 2 and Supplementary Tables 1 and 2). In the spectra of the isothreonine-containing peptide, the same fragment yields a significantly higher intensity peak, at 30% of the maximal peak. The relative intensity of the corresponding peak in the NCI-60 proteome mass spectra was found to be about 18% (Fig. 2).

Mass spectra of the synthetic peptide by direct injection toOrbitrap MS with CID fragmentation were also compared andthe remarkable similarity was obtained with their HCDprofiles (Supplementary Fig 1) with a higher intensity of thepeak at mz 628306 in the spectra of isothreonine-containingpeptide

The peak at m/z 628.306 represents a b6-ion with the sequence NQVA(T/isoT)N+. Interestingly, the corresponding y10-ion at m/z 992.505, PTNTVFDAKR+, has similar intensity in all three datasets (Fig. 2). These data show that, under the conditions used, the NQVATN+ ion is significantly less stable than its isothreonine-containing isomer. This difference between the isomers may be explained by a difference in hydrogen bond networks in the threonine and isothreonine peptides. It has been shown elsewhere that hydrogen bonding influences the fragmentation process in ion traps, such as by CID, and can even be used to study such bonding in molecules [34]. We further used molecular modeling of peptides to predict the difference in preferable hydrogen bonding (Section 3.5).
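As a quick plausibility check of the b6-ion assignment above, the singly charged b-ion m/z can be recomputed from standard monoisotopic residue masses. This is only a sketch: the mass table and the helper name `b_ion_mz` are ours, not taken from the authors' workflow. Threonine and isothreonine (homoserine) are isomers, so they share one residue mass and cannot be told apart by this number alone.

```python
# Standard monoisotopic residue masses (Da) for the residues in NQVATN.
# Isothreonine (homoserine) is isomeric with threonine, so "T" covers both.
RESIDUE_MASS = {
    "N": 114.04293,
    "Q": 128.05858,
    "V": 99.06841,
    "A": 71.03711,
    "T": 101.04768,
}
PROTON = 1.007276  # mass of a proton, Da


def b_ion_mz(fragment: str, charge: int = 1) -> float:
    """Return the m/z of a b-ion: residue masses plus `charge` protons, over charge."""
    mass = sum(RESIDUE_MASS[aa] for aa in fragment)
    return (mass + charge * PROTON) / charge


# b6 fragment of NQVA(T/isoT)NPTNTVFDAK
print(f"b6 m/z = {b_ion_mz('NQVATN'):.3f}")
```

The value agrees with the reported m/z 628.306 to within a few millidaltons, which is consistent with the b6 assignment for either isomer.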

Thus, the spectra of the NCI-60 proteome data are more similar to those of the isothreonine-containing peptide, thereby supporting the hypothesis of methionine conversion to the said amino acid. However, this evidence is not enough for an unambiguous decision in favor of methionine to isothreonine conversion in shotgun data. This observation should be supported by experimentation with model proteins (see Section 3.7 below).

Fig. 2 – Box-and-whiskers diagrams of relative intensity of the 628.306 Da b6-ion peak (left panel) and the 992.505 Da y10-ion peak (right panel) in Orbitrap spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein (Shotgun) from shotgun cancer cell line data [20], and of the synthetic NQVA(isoT)NPTNTVFDAK (IsoThr) and NQVATNPTNTVFDAK (Thr) peptides. For shotgun experiments, 12 exemplary spectra were taken from 8 cancer cell line deep proteome data [20]. For each synthetic peptide, 3 analytical series were made, each containing 5 LC–MS/MS runs of the same (N = 15). Relative intensity is calculated as a percentage of maximum intensity in each spectrum (Supplementary Tables 1 and 2).

3.5. Molecular modeling to predict differential hydrogen bond networks likely responsible for differential stability of T/isoT peptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimate differential hydrogen bonding of the hydroxyl moieties of threonine or isothreonine residues. For the ATN tripeptide, 210 stable conformers were generated, with internal energies ranging from −8.2 to −3.8 kcal/mol. According to the calculation, the more preferable conformers of this peptide have hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the neighboring alanine residue (Fig. 3A). In total, there were 70 conformers with such hydrogen bonding among all 210 structures. In the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such structures were 64% and 55%, respectively.

For the A(isoT)N tripeptide, 179 stable conformers were generated. Overall, the internal energy calculated for these structures was higher than for the ATN conformers: −2.79 to −0.02 kcal/mol. Preferable conformations of this peptide contained hydrogen bonding between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (Fig. 3B). Such conformations accounted for 36% of the total number of conformers, fewer than the structures in which the isothreonine hydroxyl moiety had no predicted hydrogen bonding (51%). At the same time, formation of this internal hydrogen bond in the isothreonine residue was energetically favorable: in the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such conformers increased to 67% and 89%, respectively.

Among the predicted structures of both peptides, only one state has similar hydrogen bonding of the hydroxyl moiety, namely its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note the hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A) and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).

Thus, the calculations have shown that, in model tripeptides, the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that this difference causes the differential MS/MS spectra after CID or HCD fragmentation. However, defining generic principles of discrimination between threonine- and isothreonine-containing peptides using MS/MS requires further experimentation, which is beyond the scope of the present work.

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In the further analysis, we studied deep proteome data for the 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides (the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage) were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra showed that the abundance of the modified peptide was considerably lower than that of the wild-type peptide. The average intensity of the modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1 to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72) and their wild-type counterparts, 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and the +R72 peptide (57(M61)72), from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35]. Using the various schemes, we aimed to estimate the possible role of heating in methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

Generally, the BSA sequence contains five methionine residues. Of them, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue and cleaved as part of a dipeptide by trypsin. Met-208 belongs to a pentapeptide cleaved by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we could detect the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, methionine-to-isothreonine alternative peptides were detected for all of them (spectra exemplified in Fig. 5). Although without peptide standards we cannot distinguish isothreonine from threonine residues in them, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, when it concerns 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, a level comparable to the ratio of methionine oxidation. Notably, of the three considered methionine residues of BSA, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protects it from attack by a mechanism yet unknown.

Overall, the experiment with BSA treatment was in good correspondence with the assumptions about methionine-to-isothreonine conversion stated above.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of the bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parental mass 1447.523 Da.

3.8. Global analysis of M > (iso)T conversion in deep proteome data for NCI-60 cell lines

M > (iso)T conversions were studied in the deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of peptides with M > (iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if they had a threonine version of the sequence among the wild-type peptides in the database. The fraction of these peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.
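The exclusion rule used in the global analysis (a converted peptide is dropped when its threonine form already exists among the wild-type database peptides) can be sketched as follows. This is an illustrative reconstruction under our own assumptions, not the authors' code: the function names and the toy peptide AAMLK/AATLK are hypothetical, and candidates are given as (wild-type sequence, 0-based index of the converted methionine).

```python
from typing import Iterable, List, Set, Tuple


def thr_version(peptide: str, met_index: int) -> str:
    """Return the peptide with the methionine at `met_index` replaced by threonine."""
    if peptide[met_index] != "M":
        raise ValueError(f"residue {met_index} of {peptide} is not methionine")
    return peptide[:met_index] + "T" + peptide[met_index + 1:]


def keep_unambiguous(
    candidates: Iterable[Tuple[str, int]],
    wild_type_peptides: Set[str],
) -> List[Tuple[str, int]]:
    """Keep only conversions whose Thr form is absent from the wild-type database."""
    return [
        (peptide, idx)
        for peptide, idx in candidates
        if thr_version(peptide, idx) not in wild_type_peptides
    ]


# Toy database: the HSC70 peptide plus a hypothetical pair AAMLK / AATLK.
wild_type = {"NQVAMNPTNTVFDAK", "AAMLK", "AATLK"}
candidates = [("NQVAMNPTNTVFDAK", 4), ("AAMLK", 2)]
kept = keep_unambiguous(candidates, wild_type)
print(kept)
```

In this toy case the HSC70 conversion is kept, while the AAMLK conversion is discarded because AATLK is itself a database peptide and the match would be ambiguous.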

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23].

Columns S1, S2 and S3 refer to Sample 1, Sample 2 and Sample 3.

Methionine   Peptide           Total number of PSMs      Methionine-oxidized          Methionine-to-isothreonine
position     sequence          for peptides of interest  peptides, PSMs (% total)     converted peptides, PSMs (% total)
in BSA                         S1     S2     S3          S1       S2       S3         S1      S2      S3
M111         ETYGDMADCCEK      50     46     53          8 (16)   14 (30)  11 (21)    4 (8)   4 (9)   10 (19)
M469         MPCTEDYLSLILNR    159    183    268         25 (16)  34 (19)  52 (19)    6 (4)   6 (3)   98 (37)
M571         TVMENFVAFVDK      124    113    95          6 (5)    3 (3)    5 (5)      3 (2)   3 (3)   5 (5)

Completely cleaved tryptic peptides were detected predominantly. All data include corresponding peptides with trypsin missed cleavages.
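The percentages in parentheses in Table 1 follow directly from the PSM counts. The sketch below recomputes them as a consistency check; the dictionary layout and the function name are our own choices, while the counts are copied from the table.

```python
# PSM counts from Table 1, ordered as (Sample 1, Sample 2, Sample 3).
TABLE1 = {
    "M111": {"total": (50, 46, 53), "oxidized": (8, 14, 11), "converted": (4, 4, 10)},
    "M469": {"total": (159, 183, 268), "oxidized": (25, 34, 52), "converted": (6, 6, 98)},
    "M571": {"total": (124, 113, 95), "oxidized": (6, 3, 5), "converted": (3, 3, 5)},
}


def percent_of_total(counts, totals):
    """Express each PSM count as a whole-number percentage of the total PSMs."""
    return [round(100 * c / t) for c, t in zip(counts, totals)]


for site, row in TABLE1.items():
    oxidized = percent_of_total(row["oxidized"], row["total"])
    converted = percent_of_total(row["converted"], row["total"])
    print(site, "oxidized %:", oxidized, "Met-to-isoThr %:", converted)
```

The recomputed values reproduce the table, including the 37% conversion for Met-469 in the sample heated during IAA treatment.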

4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using a database derived from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70 in which methionine at position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the genomes of the cell lines themselves, and strong evidence was obtained that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed to the isothreonine residue being produced from methionine in the cell lines.

After examination of the sample preparation protocol used in [20], we propose that alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study: 2 to 3% of wild-type sequences were modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)Thr as a variable modification demonstrated that up to 10% of methionine-containing peptides identified from the data had a corresponding Met to isoThr variant. We can now claim that these variants contain isothreonine, because there is no genetic coding for them in the published genomes.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI-60 deep proteome cell lines [20] at 1% PSM FDR.

Cell line        Identified   Identified peptides   Identified peptides   Both Met > (iso)Thr and
                 peptides     with Met              with Met > Thr        Met sequences identified
COLON_COLO205    52479        6057                  289                   246
RENAL_RXF393     61878        9741                  119                   88
PROSTATE_PC3     50203        5213                  287                   241
NSCLC_H460       39792        2862                  247                   201
CNS_U251         52353        6083                  231                   184
OVAR_SKOV3       48110        4000                  214                   171
MELAN_M14        55169        4924                  347                   289
BREAST_MCF7      53451        4647                  331                   279
LEUK_CCRFCEM     38894        1348                  255                   144

Because the evidence of Met to isoThr conversion in the open data produced by [20] was indirect, we performed an experiment that mimicked conventional conditions for shotgun analysis, using bovine albumin as a model protein. This experiment clearly illustrated Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many as possible of the somatically mutated proteins that can affect molecular pathways and support the growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation, which mimics the nucleic acid-encoded Met to Thr variant, is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.

Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.

identified peptides and the number of methionine-containing peptides at 1% FDR at the PSM level were calculated. After that, M > T conversion (−29.99281 Da) was added to the search parameters as a potential modification. The new output files with identified peptides were filtered to leave only peptides containing the M > T conversion. At the final step, these peptides were filtered to 1% FDR for the PSMs.
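The mass shift entered for this variable modification can be reproduced from the elemental compositions of the two residues. The sketch below assumes standard monoisotopic atomic masses and the usual residue formulas (Met C5H9NOS, Thr C4H7NO2); the variable and function names are ours.

```python
# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}

# Elemental compositions of the amino acid residues (free amino acid minus water).
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}
THR_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}  # same formula for isothreonine


def mono_mass(composition):
    """Sum monoisotopic atomic masses weighted by atom counts."""
    return sum(MONO[element] * count for element, count in composition.items())


delta = mono_mass(THR_RESIDUE) - mono_mass(MET_RESIDUE)
print(f"Met -> Thr mass shift: {delta:.5f} Da")  # -29.99281 Da
```

Because threonine and isothreonine have the same elemental composition, the same shift describes both the genetically encoded substitution and the chemical artifact, which is why the two cannot be separated by precursor mass.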

2.3. LC–MS/MS of synthetic peptides with HCD fragmentation

Purified peptides NQVA(isoT)NPTNTVFDAK and NQVATNPTNTVFDAK (JPT Peptides, Berlin, Germany) were separately diluted to 1 mg/L concentration in a water/acetonitrile/formic acid solution (96:4:0.1, v/v). The peptides were injected into the UltiMate 3000 RSLC nano-system (Dionex, CA, USA).

The peptide solutions (the injection volume and the amount were 0.1 μL and 100 pg, respectively) were then loaded onto a μ-Precolumn (300 μm × 5 mm, 5 μm particle size, 100 Å pore size; Santa Clara, CA, USA) at a 5 μL/min flow rate in 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 4% (v/v) acetonitrile. Then, peptides were eluted from the analytical column, Acclaim C18 PepMap RSLC (150 mm × 75 μm, 2 μm particle size, 100 Å pore size; Santa Clara, CA, USA), at a 0.3 μL/min flow rate in a gradient of mobile phases: 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in water (mobile phase A) and 0.08% (v/v) formic acid and 0.015% (v/v) trifluoroacetic acid in 80% (v/v) acetonitrile (mobile phase B). The gradient used at 0.3 μL/min was as follows: 0–3 min, 4% eluent B; 3–28 min, linear increase to 55% eluent B; 28–31 min, linear increase to 98% eluent B; 31–37 min, washing the column with 98% eluent B; 37–40 min, linear decrease to 4% eluent B; 40–47 min, equilibrating the column with 4% eluent B.
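The gradient program above is a piecewise-linear function of time, which can be written down as a list of breakpoints. The sketch below is only an illustration of that reading of the text; the breakpoint list and the function name `percent_b` are ours and are not instrument method code.

```python
# Gradient breakpoints as (time in min, % eluent B), read from the program above.
GRADIENT = [
    (0.0, 4.0),    # start
    (3.0, 4.0),    # isocratic hold at 4% B
    (28.0, 55.0),  # linear increase to 55% B
    (31.0, 98.0),  # linear increase to 98% B
    (37.0, 98.0),  # column wash at 98% B
    (40.0, 4.0),   # linear decrease to 4% B
    (47.0, 4.0),   # re-equilibration at 4% B
]


def percent_b(t_min: float) -> float:
    """Linearly interpolate the percentage of eluent B at time `t_min` (minutes)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-47 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


for t in (1.0, 15.5, 34.0, 38.5):
    print(f"{t:5.1f} min -> {percent_b(t):5.1f}% B")
```

For example, halfway through the main 3–28 min ramp (15.5 min) the composition is 29.5% B.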

An Orbitrap Q-Exactive mass spectrometer (Thermo, CA, USA) was used for the experiments. The instrument was operated in positive ionization mode using a targeted-MS/MS mode with 140 K resolution for higher-energy collision dissociation (HCD) MS/MS scans. The AGC target value was set to 5 × 10^5 ions for a maximum ion trap injection time of 60 ms. The isolation window was set to 2 Th, and the normalized collision energy was 23% for both peptides. The target mass of 810.40483 with a charge state of 2+ was acquired from 10 to 13 min using the scheduled targeted-MS/MS mode. The first fixed mass was set to 100 m/z.

2.4. Direct injection MS/MS of synthetic peptides with CID fragmentation

An Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a HESI ion source was operated in positive mode with a spray voltage of 4 kV. Sheath and auxiliary gas pressures were set to 10 and 2 arbitrary units, respectively.

Survey scan MS data (m/z from 100 to 1200) were acquired at a resolving power of 60 K (at m/z 400) at a direct infusion rate of 5 μL/min. For accurate mass measurements, the lock-mass option was activated (m/z 445.120025) for internal recalibration in real time. For all MS or MS/MS data acquired, 100 scans were collected, and the injection time varied from 50 to 500 ms. For acquiring MS/MS spectra, the parent ion mass m/z 888 was isolated in a 2 Th window, and a normalized collision energy of 31% was applied for both peptides.

2.5. Molecular modeling of peptides of interest to predict preferable conformers

Tripeptides ATN and A(isoT)N were considered to predict preferable conformers. Partial atomic charges were calculated by the semi-empirical quantum-mechanical PM3 method [26]. The RandomSearch software of the SYBYL 8.1 package was used to predict peptide conformers. Search parameters were as follows: rotation permitted for all covalent bonds of the threonine and asparagine residues except peptide bonds; Tripos force field; number of iterations generating conformers, 1000; energy threshold, 3 kcal/mol; RMS, 0.2 Å; molecule minimization, 500 iterations.

2.6. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine modifications

Samples of bovine serum albumin (A9418, Sigma-Aldrich, USA) were prepared to test three different reduction and alkylation schemes. Each sample contained 50 μg of albumin dissolved in 50 μL of 50 mM tetraethylammonium chloride (TEABC, Sigma-Aldrich, USA) solution. Then, 5.5 μL of 5% (w/v) sodium deoxycholate in 50 mM Tris–HCl, pH 8.5, was added to each sample. After mixing, 1.2 μL of 0.5 M sodium dithiothreitol (DTT, Acros Organics, USA) in 50 mM TEABC was added to each sample, up to 10 mM DTT. Sample 1 was then incubated for 20 min at 56 °C, whereas samples 2 and 3 were maintained for 10 min at 95 °C.

After that, 5.5 μL of 0.5 M iodoacetamide (IAA, Sigma-Aldrich, USA) solution in 50 mM TEABC was added to each sample to a final concentration of 50 mM IAA. Samples 1 and 2 were then incubated for 30 min at room temperature in the dark. Sample 3 was incubated for 10 min at 95 °C, also in the dark.

The protein was then digested with trypsin (Trypsin Gold, Promega). The enzyme was added at a ratio of 1:50 (w/w), and the resultant mixture was incubated overnight at 37 °C. Enzymatic digestion was terminated by the addition of acetic acid (5%, w/v).

Samples were shaken for 30 min at 25 °C (500 rpm) and centrifuged at 9300 g for 15 min at 20 °C (Centrifuge 5415R, Eppendorf). Supernatants were added to a filter unit (30 kDa cutoff, Millipore, USA) and centrifuged at 13400 g for 20 min at 20 °C. After that, 200 μL of 50% (v/v) formic acid was added to each filter unit, and samples were centrifuged at 13400 g for 20 min at 20 °C to dryness. Samples were vacuum dried using an Eppendorf Concentrator Plus at 45 °C.

For LC–MS/MS analysis, dried samples were treated similarly to the synthetic peptides (see Section 2.3).

Mass spectra were analyzed using the MaxQuant [23] package 1.5.2.8 as described above with minor changes, i.e., the Andromeda search included methionine oxidation and methionine to threonine conversion as variable modifications, and the FDR for peptide modifications was set to 1%. Each sample was studied in three technical repeats.

3. Results and discussion

3.1. Methionine-61 of HSC70 chaperone is changed in 8 of 9 NCI-60 cell lines

Out of all NCI-60 cancer cell lines analyzed in [20], deep proteome data using prefractionation and HCD fragmentation were obtained for nine cell lines, including MCF7 breast cancer, SK-OV-3 ovarian cancer, U251 malignant glioblastoma, RXF-393 renal cancer, COLO205 colon cancer, NCI-H460 lung cancer, M14 malignant melanoma, CCRF-CEM acute lymphocytic leukemia, and PC-3 prostate cancer cell lines. These deep proteome data were searched against a sequence database containing additional sequences derived from colon cancer exome data [21] (the information about non-synonymous SNPs was extracted from the Supplementary Material provided by the authors). For each SNP, an additional protein sequence was generated by introducing the corresponding amino acid substitution. The resulting colon cancer database contained 127,486 protein sequences. The search was performed with the Mascot and Andromeda search engines using the separate FDR, or one-by-one, approach, as generally described in [16,22].
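Generating one variant sequence per non-synonymous SNP, as described above, amounts to a single-residue substitution in the reference sequence. A minimal sketch follows, assuming 1-based residue positions; the function name is hypothetical, and for brevity the example operates on the 57-71 peptide of HSC70 (where Met-61 is residue 5) rather than on the full protein sequence.

```python
def apply_substitution(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return `sequence` with the residue at 1-based `position` changed from `ref` to `alt`.

    The reference residue is checked first so that a mismatch between the SNP
    annotation and the protein sequence is reported instead of silently applied.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + alt + sequence[position:]


# Residues 57-71 of HSC70; Met-61 is the 5th residue of this peptide.
wild_type = "NQVAMNPTNTVFDAK"
variant = apply_substitution(wild_type, 61 - 57 + 1, "M", "T")
print(variant)  # NQVATNPTNTVFDAK
```

Appending each such variant to the reference database as a separate entry gives the kind of customized search space used in this section.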

Note that in this part we used a protein sequence database which did not exactly correspond to the cell line genomes, even to the COLO205 line, which originated from colon cancer. It was shown in [27] that there were not many common point mutations between different patients, despite the fact that these mutations were accumulated in common genes, such as known proto-oncogenes and suppressors. Surprisingly, one of the mutant peptides was returned by the search engines with a high score in 8 out of 9 cell lines analyzed, in all but the renal cancer cell line. It was the peptide 57(NQVATNPTNTVFDAK)71, bearing an amino acid substitution of methionine to threonine at position 61 of the human heat shock cognate 71 kDa protein (HSC70, gene name HSPA8). Moreover, in 6 out of 9 cell lines, the above modified sequence was also identified in the form of the peptide 57(NQVATNPTNTVFDAKR)72 with a missed cleavage.

3.2. Conversion of methionine-61 of HSC70 chaperone is not encoded by nucleic acid

It is also interesting to note that the point mutation that may cause the M61T substitution in HSC70 is rare and was found in only one colon cancer patient out of more than two hundred tumor patients studied in [20]. Soon after our finding, the exomes of all NCI-60 cell lines became available [27]. Contrary to our expectations, mutations involving replacement of the M residue by T at position 61 of HSC70 were not found in the exomes. The first possible explanation of M61T peptide identifications in the proteome is contamination of the cell cultures by a foreign protein, e.g., derived from the media or an external cell line such as HeLa. However, after performing a protein BLAST search, we found no identical peptide in any living organism. Moreover, no human genome database contained any traces of the corresponding coding SNP, except the above-mentioned colon cancer exome in the COSMIC database [28]. RNA editing may be another explanation for the M61T replacement. In this case, the AUG codon should be converted to ACG, and such editing has not been described, as opposed to C-to-U deamination [29]. Thus, we are left with only one hypothesis: the conversion of methionine-61 of the HSC70 chaperone is not encoded genetically but occurs either post-translationally or during the sample preparation before LC–MS/MS analysis.

Fig. 1 – Exemplary Orbitrap HCD spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein: (A) candidate peptide from shotgun cancer cell line data [20], (B) synthetic NQVA(isoT)NPTNTVFDAK peptide, and (C) synthetic NQVATNPTNTVFDAK peptide. All spectra used in this work are shown in Supplementary Tables 1 and 2.

3.3. Methionine to isothreonine conversion during proteome sample processing

The non-genetic origin of the observed methionine conversion suggests that methionine may be converted to isothreonine (homoserine) instead of threonine. Such a reaction is known from the oxidative treatment of protein methionine residues by cyanogen bromide. In some cases, instead of peptide bond cleavage and formation of a C-terminal homoserine lactone, the methionine residue may be converted to isothreonine [30]. In the search, this transition is considered as a variable modification after cyanogen bromide treatment (see, e.g., www.matrixscience.com/help/enzyme_help.html for the Mascot search engine).

Side reactions during cysteine alkylation with iodoacetamide, used routinely in sample preparation for proteome analysis, may also be responsible for in vitro methionine modifications. As was shown before, methionine alkylation by iodoacetamide with formation of S-carbamidomethyl methionine and subsequent neutral loss of 2-(methylthio)acetamide during CID may mimic a neutral loss of phosphoric acid in phosphopeptides which contain phosphoserine (pSer) or phosphothreonine (pThr) [31].

Both iodoacetamideand iodoacetatemayalkylatemethionineat any pH [32] The resulting sulfonium salts S-carbamidomethylmethionine or S-carboxymethyl methionine go to homoserineand homoserine lactone at 100 degC and approximately neutral pH[33] Similar conditions were used in sample preparation beforeelectrophoresis for the cell line proteome datasets of interest [20]Thus some degree of methionine-to-isothreonine conversionshould be expected

34 Synthetic peptide analysis supporting methionine-61 toisothreonine amino acid conversion in HSC70

To reveal the origin of methionine conversion in the peptide ofinterest two synthetic analogs of 57(T61)71 and 57(isoT61)71from HSC70 were analyzed by tandem mass spectrometryemploying HCD fragmentation The purpose of this analysiswas distinguishing between the two peptides using their MSMSprofiles and comparing them to the corresponding MSMSprofiles observed in the cell lines

The mass spectra attributed to the peptide of interest from the cell line shotgun data and the mass spectra from both synthetic peptides were remarkably similar, as shown in Fig. 1. The only significant difference was observed in the intensity of the peak at m/z 628.306. In three series of 5 LC–MS/MS runs, the intensity of this peak in the spectra of the threonine-containing peptide was relatively low, representing about 3% of the maximal peak intensity (see Figs. 1, 2 and Supplementary Tables 1, 2). In the spectra of the isothreonine-containing peptide, the same fragment yielded a significantly higher peak, at 30% of the maximal peak. The relative intensity of the corresponding peak in the NCI-60 proteome mass spectra was found to be about 18% (Fig. 2).

Mass spectra of the synthetic peptides obtained by direct injection into the Orbitrap MS with CID fragmentation were also compared. They were remarkably similar to the HCD profiles (Supplementary Fig. 1), with a higher intensity of the peak at m/z 628.306 in the spectra of the isothreonine-containing peptide.

The peak at m/z 628.306 represents a b6-ion with the sequence NQVA(T/isoT)N+. Interestingly, the corresponding y10-ion at m/z 992.505, PTNTVFDAKR+, has similar intensity in all three datasets (Fig. 2). These data show that, under the conditions used, the NQVATN+ ion is significantly less stable than its isothreonine-containing isomer. This difference between the isomers may be explained by a difference in hydrogen bond networks in threonine and isothreonine peptides. It has been shown elsewhere that hydrogen bonding influences the fragmentation process in ion traps, such as by CID, and can even be used to study such bonding in molecules [34]. We further used molecular modeling of peptides to predict the difference in preferable hydrogen bonding (Section 3.5).
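The fragment masses discussed above can be checked with a short calculation. The sketch below assumes standard monoisotopic residue masses (not taken from the article) and computes singly protonated b- and y-ion m/z values for the 57-71 peptide. Under these assumptions the N-terminal fragment NQVATN falls near m/z 628.305, and the complementary C-terminal fragment PTNTVFDAK falls near m/z 992.505.

```python
# Monoisotopic residue masses in Da (standard values, assumed; not from the article).
RESIDUE = {
    "A": 71.037114, "D": 115.026943, "F": 147.068414, "K": 128.094963,
    "N": 114.042927, "P": 97.052764, "Q": 128.058578, "T": 101.047679,
    "V": 99.068414,
}
PROTON = 1.007276  # mass of a proton
WATER = 18.010565  # mass of H2O

def b_ion_mz(fragment):
    """m/z of a singly protonated b-ion: residue masses plus one proton."""
    return sum(RESIDUE[aa] for aa in fragment) + PROTON

def y_ion_mz(fragment):
    """m/z of a singly protonated y-ion: residue masses plus water plus one proton."""
    return sum(RESIDUE[aa] for aa in fragment) + WATER + PROTON

peptide = "NQVATNPTNTVFDAK"      # threonine form; the isoT form has the same mass
b6 = b_ion_mz(peptide[:6])       # N-terminal fragment NQVATN
y_rest = y_ion_mz(peptide[6:])   # complementary C-terminal fragment PTNTVFDAK
print(f"b6 (NQVATN): {b6:.3f}")
print(f"y-ion (PTNTVFDAK): {y_rest:.3f}")
```

Since threonine and isothreonine are isobaric, both calculated values apply to either isomer. Only the relative intensity of the b6 peak distinguishes the two peptides.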

Thus, the spectra of the NCI-60 proteome data are more similar to those of the isothreonine-containing peptide, thereby supporting the hypothesis of methionine conversion to this amino acid. However, this evidence is not sufficient for an unambiguous decision in favor of methionine-to-isothreonine conversion in the shotgun data. This observation should be supported by experiments with model proteins (see Section 3.7 below).

Fig. 2 – Box-and-whiskers diagrams of relative intensity of the 628.306 Da b6-ion peak (left panel) and the 992.505 Da y10-ion peak (right panel) in Orbitrap spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein (Shotgun) from shotgun cancer cell line data [20], synthetic NQVA(isoT)NPTNTVFDAK (IsoThr) and NQVATNPTNTVFDAK peptides (Thr). For shotgun experiments, 12 exemplary spectra were taken from 8 cancer cell line deep proteome data [20]. For each synthetic peptide, 3 analytical series were made, each containing 5 LC–MS/MS runs of the same (N = 15). Relative intensity is calculated as a percentage of maximum intensity in each spectrum (Supplementary Tables 1 and 2).

3.5. Molecular modeling to predict differential hydrogen bond networks likely responsible for differential stability of the T/isoT peptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimate differential hydrogen bonding of the hydroxyl moieties of threonine or isothreonine residues. For the ATN tripeptide, 210 stable conformers were generated, with internal energies ranging from −8.2 to −3.8 kcal/mol. According to the calculation, the more preferable conformers of this peptide have hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the neighboring alanine residue (Fig. 3A). In total, there were 70 conformers with such hydrogen bonding among all 210 structures. In the 25th and 10th percentiles of conformers with minimal estimated internal energy, the ratios of such structures were 64% and 55%, respectively.

For the A(isoT)N tripeptide, 179 stable conformers were generated. Overall, the internal energy calculated for these structures was higher than for the ATN conformers: −2.79 to −0.02 kcal/mol. Preferable conformations of this peptide contained hydrogen bonding between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (Fig. 3B). Such conformations accounted for 36% of the total number of conformers, fewer than the structures in which the isothreonine hydroxyl moiety had no predicted hydrogen bonding (51%). At the same time, formation of this internal hydrogen bond in the isothreonine residue was energetically favorable: in the 25th and 10th percentiles of conformers with minimal estimated internal energy, the ratios of such conformers increased to 67% and 89%, respectively.

Among the predicted structures of both peptides, only one state has similar hydrogen bonding of the hydroxyl moiety, namely its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note the hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A) and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).

Thus, the calculations have shown that, in the model tripeptides, the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that this difference causes the differential MS/MS spectra after CID or HCD fragmentation. However, defining generic principles of discrimination between threonine- and isothreonine-containing peptides using MS/MS requires further experimentation, which is beyond the scope of the present work.

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In the further analysis, we studied the deep proteome data for the 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides (the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage) were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra showed that the abundance of the modified peptide was considerably lower than that of the wild-type peptide. The average intensity of the modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1 to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72) and their wild-type counterparts 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and +R72 peptide (57(M61)72) from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure that was routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35]. Using the various schemes, we aimed to estimate the possible role of heating in methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

Generally, the BSA sequence contains five methionine residues. Of them, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue and cleaved as a part of a dipeptide by trypsin. Met-208 belongs to a pentapeptide cleaved by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we could detect the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, methionine-to-isothreonine alternative peptides were detected for all of them (spectra exemplified in Fig. 5). Although without peptide standards we cannot confirm isothreonine vs. threonine residues in them, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, when it concerns 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, a level comparable to the ratio of methionine oxidation. Notably, of the three considered methionine residues of BSA, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protected it from attack by a mechanism yet unknown.

Overall, the experiment with BSA treatment was in good correspondence with the assumptions about methionine-to-isothreonine conversion stated above.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parental mass 1447.523 Da.

3.8. Global analysis of M(iso)T conversion in deep proteome data for NCI-60 cell lines

M(iso)T conversions were studied in the deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of peptides with M(iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if they had a threonine version of the sequence among wild-type peptides in the database. The fraction of these peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.
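The summary figures for the global analysis can be recomputed from the per-cell-line counts of Table 2. The sketch below is illustrative only. The article does not state how the averages were formed, and the pairing of the last two table columns with cell lines is an assumption made while reading the flattened table. Under this reading, the medians across the nine cell lines come out close to the summary values quoted in Section 3.8.

```python
from statistics import median

# Rows of Table 2: cell line, identified peptides, peptides with Met,
# peptides with Met > (iso)Thr, peptides with both forms identified.
# Assigning the last two columns to cell lines row by row is an assumption.
TABLE2 = [
    ("COLON_COLO205", 52479, 6057, 289, 246),
    ("RENAL_RXF393", 61878, 9741, 119, 88),
    ("PROSTATE_PC3", 50203, 5213, 287, 241),
    ("NSCLC_H460", 39792, 2862, 247, 201),
    ("CNS_U251", 52353, 6083, 231, 184),
    ("OVAR_SKOV3", 48110, 4000, 214, 171),
    ("MELAN_M14", 55169, 4924, 347, 289),
    ("BREAST_MCF7", 53451, 4647, 331, 279),
    ("LEUK_CCRFCEM", 38894, 1348, 255, 144),
]

# Per-cell-line values, summarized by the median across cell lines.
median_identified = median(row[1] for row in TABLE2)
median_met_pct = median(100 * row[2] / row[1] for row in TABLE2)
median_conv_pct = median(100 * row[3] / row[2] for row in TABLE2)
median_both_pct = median(100 * row[4] / row[3] for row in TABLE2)

print(f"identified peptides: {median_identified}")
print(f"Met-containing: {median_met_pct:.1f}%")
print(f"Met > (iso)Thr among Met-containing: {median_conv_pct:.1f}%")
print(f"both forms identified: {median_both_pct:.1f}%")
```

The median is used here because a single outlying cell line (LEUK_CCRFCEM in this reading) would otherwise dominate the mean conversion percentage.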

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 (S1) was treated at ambient temperature without heating; samples 2 and 3 (S2, S3) were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23].

Methionine   Peptide          Total number of PSMs    Methionine-oxidized         Methionine-to-isothreonine
position     sequence         for peptides of         peptides, PSMs (% total)    converted peptides, PSMs (% total)
in BSA                        interest
                              S1     S2     S3        S1       S2       S3        S1      S2      S3
M111         ETYGDMADCCEK     50     46     53        8 (16)   14 (30)  11 (21)   4 (8)   4 (9)   10 (19)
M469         MPCTEDYLSLILNR   159    183    268       25 (16)  34 (19)  52 (19)   6 (4)   6 (3)   98 (37)
M571         TVMENFVAFVDK     124    113    95        6 (5)    3 (3)    5 (5)     3 (2)   3 (3)   5 (5)

Completely cleaved tryptic peptides were detected predominantly. All data include the corresponding peptides with trypsin missed cleavages.
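The percentages in Table 1 follow directly from the spectral counts. As a minimal sketch, the snippet below recomputes them from the PSM numbers given in the table. The dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Spectral counts from Table 1, each tuple listing samples 1, 2 and 3.
TABLE1 = {
    "M111": {"total": (50, 46, 53), "oxidized": (8, 14, 11), "converted": (4, 4, 10)},
    "M469": {"total": (159, 183, 268), "oxidized": (25, 34, 52), "converted": (6, 6, 98)},
    "M571": {"total": (124, 113, 95), "oxidized": (6, 3, 5), "converted": (3, 3, 5)},
}

def percent_of_total(counts, totals):
    """Share of PSMs, in whole percent, for each sample."""
    return [round(100 * count / total) for count, total in zip(counts, totals)]

for position, row in TABLE1.items():
    oxidized = percent_of_total(row["oxidized"], row["total"])
    converted = percent_of_total(row["converted"], row["total"])
    print(position, "oxidized %:", oxidized, "Met-to-isoThr %:", converted)
```

The unheated sample 1 gives the lowest conversion shares, and the largest value is the M469 peptide in the heated sample 3.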


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using a database derived from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70 in which the methionine in position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the cell line genomes themselves, and strong evidence was obtained that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed to the isothreonine residue produced from methionine in the cell lines.

After examination of the sample preparation protocol used in [20], we propose that alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study: 2 to 3% of wild-type sequences were modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)Thr as a variable modification demonstrated that up to 10% of the methionine-containing peptides identified from the data had a corresponding Met to isoThr variant. We can now claim that these variants are isothreonine-containing, owing to the absence of genetic coding for them in the published genomes.
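One simple way to take this artifact into account in a proteogenomic workflow is to flag every variant peptide that differs from its wild-type counterpart only by Met-to-Thr replacements. The sketch below is a hypothetical helper, not part of the published pipeline. The function names are illustrative, and the rule assumes that the wild-type and variant peptides are already aligned and of equal length.

```python
def met_to_thr_positions(wild_type, variant):
    """Return 0-based positions where the variant has T in place of M.

    Returns None if the peptides differ in length or in any other way.
    """
    if len(wild_type) != len(variant):
        return None
    positions = []
    for index, (wt_residue, var_residue) in enumerate(zip(wild_type, variant)):
        if wt_residue == var_residue:
            continue
        if wt_residue == "M" and var_residue == "T":
            positions.append(index)
        else:
            return None
    return positions

def is_possible_artifact(wild_type, variant):
    """True when Met > (iso)Thr conversion alone explains the variant peptide."""
    return bool(met_to_thr_positions(wild_type, variant))

print(is_possible_artifact("NQVAMNPTNTVFDAK", "NQVATNPTNTVFDAK"))
```

A flagged peptide is not necessarily an artifact. It is a candidate that needs support from genomic data before it is reported as a genetically encoded variant.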

Owing to the indirect evidence of Met to isoThr conversion in the open data produced by [20], we carried out an experiment that mimicked conventional conditions for shotgun analysis, using bovine albumin as a model protein. This experiment clearly illustrated Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many as possible of the somatically mutated proteins which can affect molecular pathways and provide growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation which mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI-60 deep proteome cell lines [20] at 1% PSM FDR.

Cell line        Identified   Identified    Identified peptides   Both Met > (iso)Thr and
                 peptides     peptides      with Met > Thr        Met sequences identified
                              with Met
COLON_COLO205    52479        6057          289                   246
RENAL_RXF393     61878        9741          119                   88
PROSTATE_PC3     50203        5213          287                   241
NSCLC_H460       39792        2862          247                   201
CNS_U251         52353        6083          231                   184
OVAR_SKOV3       48110        4000          214                   171
MELAN_M14        55169        4924          347                   289
BREAST_MCF7      53451        4647          331                   279
LEUK_CCRFCEM     38894        1348          255                   144

Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.


3. Results and discussion

3.1. Methionine-61 of HSC70 chaperone is changed in 8 of 9 NCI-60 cell lines

Out of all NCI-60 cancer cell lines analyzed in [20], deep proteome data using prefractionation and HCD fragmentation were obtained for nine cell lines, including the MCF7 breast cancer, SK-OV-3 ovarian cancer, U251 malignant glioblastoma, RXF-393 renal cancer, COLO205 colon cancer, NCI-H460 lung cancer, M14 malignant melanoma, CCRF-CEM acute lymphocytic leukemia, and PC-3 prostate cancer cell lines. These deep proteome data were searched against a sequence database containing additional sequences derived from colon cancer exome data [21] (the information about non-synonymous SNPs was extracted from the Supplementary Material provided by the authors). For each SNP, an additional protein sequence was generated by introducing the corresponding amino acid substitution. The resulting colon cancer database contained 127,486 protein sequences. The search was performed with the Mascot and Andromeda search engines using the separate FDR, or one-by-one, approach as generally described in [16,22].
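The database construction step described above can be illustrated with a short sketch: a single amino acid substitution is introduced into a protein sequence, and the result is digested in silico. This is not the authors' pipeline. The function names are hypothetical, only residues 57-72 of HSC70 quoted in this article are used instead of the full protein, and the cleavage rule (after K or R, but not before P) is a common simplification of trypsin specificity.

```python
import re

def apply_substitution(sequence, position, ref, alt, offset=1):
    """Return sequence with the residue at `position` changed from ref to alt.

    `offset` is the residue number of the first character of `sequence`.
    """
    index = position - offset
    if sequence[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {sequence[index]}")
    return sequence[:index] + alt + sequence[index + 1:]

def tryptic_peptides(sequence):
    """Cleave after K or R, except before P (simplified trypsin rule)."""
    return [piece for piece in re.split(r"(?<=[KR])(?!P)", sequence) if piece]

# Residues 57-72 of HSC70 as quoted in the text; the rest of the protein is omitted.
fragment = "NQVAMNPTNTVFDAKR"
variant = apply_substitution(fragment, 61, "M", "T", offset=57)
print(tryptic_peptides(variant))
```

Adding the variant sequence as a separate database entry lets the search engine match the substituted peptide, while the reference check guards against applying an SNP to the wrong position.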

Note that in this part we used a protein sequence database which did not exactly correspond to the cell line genomes, even to the COLO205 line, which originated from colon cancer. It was shown in [27] that there were not many common point mutations between different patients, despite the fact that these mutations were accumulated in common genes, such as known proto-oncogenes and suppressors. Surprisingly, one of the mutant peptides was returned by the search engines with a high score in 8 out of 9 cell lines analyzed, in all but the renal cancer cell line. It was the peptide 57(NQVATNPTNTVFDAK)71, bearing an amino acid substitution of methionine to threonine in position 61 of the human heat shock cognate 71 kDa protein (HSC70, gene name HSPA8). Moreover, in 6 out of 9 cell lines the above modified sequence was also identified in the form of the peptide 57(NQVATNPTNTVFDAKR)72 with a missed cleavage.

3.2. Conversion of methionine-61 of HSC70 chaperone is not encoded by nucleic acid

It is also interesting to note that the point mutation that maycause M61T substitution in HSC70 is rare and found in onlyone colon cancer patient of more than two hundred tumorpatients studied in [20] Soon after our finding the exomes ofall NCI-60 cell lines become available [27] Contrary to ourexpectations mutations involving replacement of M residueby T in the position 61 of HSC70 were not found in theexomes The first possible explanation of M61T peptideidentifications in the proteome is the contamination of thecell cultures by a foreign protein eg derived from the mediaor external cell line such as HeLa However after performinga protein BLAST search we found no identical peptide in any

Fig 1 ndash Exemplary Orbitrap HCD spectra of 57(NQVA(TisoT)NPTNpeptide from shotgun cancer cell line data [20] synthetic NQVA(NQVATNPTNTVFDAK peptide (C) All spectra used in this work a

living organism Moreover no human genome databasecontained any traces of corresponding coding SNP exceptthe above-mentioned colon cancer exome in the COSMICdatabase [28] RNA editing may be another explanation forthe M61T replacement In this case the AUG codon should beconverted to ACG and such editing is not described asopposed to C-to-U deamination [29] Thus we are left withthe only hypothesis the conversion of methionine-61 of theHSC70 chaperone is not encoded genetically but occurseither post-translationally or during the sample preparationbefore LCndashMSMS analysis

33 Methionine to isothreonine conversion during proteomesample processing

Non-genetic origin of the observed methionine conversionallows proposing that methionine may be converted toisothreonine (homoserine) instead of threonine Such a reactionis known from oxidative treatment of protein methionineresidues by cyanogen bromide In some cases instead of peptidebond cleavage and formation of C-terminal homoserine lactonethemethionine residuemay be converted to isothreonine [30] Inthe search this transition is considered as a variablemodificationafter cyanogen bromide treatment (see eg wwwmatrixsciencecomhelpenzyme_helphtml for the Mascot search engine)

Side reactions during cysteine alkylationwith iodoacetamideused routinely in sample preparation for the proteomeanalysis may also be responsible for in vitro methioninemodifications As itwas shownbeforemethionine alkylation byiodoacetamide with a formation of S-carbamidomethyl methi-onine and subsequent neutral loss of 2-(methylthio)acetamideduring CID may mimic a neutral loss of phosphoric acid inphosphopeptides which contain phosphoserine (pSer) orphosphothreonine (pThr) [31]

Both iodoacetamideand iodoacetatemayalkylatemethionineat any pH [32] The resulting sulfonium salts S-carbamidomethylmethionine or S-carboxymethyl methionine go to homoserineand homoserine lactone at 100 degC and approximately neutral pH[33] Similar conditions were used in sample preparation beforeelectrophoresis for the cell line proteome datasets of interest [20]Thus some degree of methionine-to-isothreonine conversionshould be expected

34 Synthetic peptide analysis supporting methionine-61 toisothreonine amino acid conversion in HSC70

To reveal the origin of methionine conversion in the peptide ofinterest two synthetic analogs of 57(T61)71 and 57(isoT61)71from HSC70 were analyzed by tandem mass spectrometryemploying HCD fragmentation The purpose of this analysiswas distinguishing between the two peptides using their MSMSprofiles and comparing them to the corresponding MSMSprofiles observed in the cell lines

The mass-spectra attributed to the peptide of interestfrom the cell line shotgun data and the mass-spectra from

TVFDAK)71 peptide of human HSC70 protein (A) candidateisoT)NPTNTVFDAK peptide (B) and syntheticre shown in Supplementary Tables 1 and 2

174 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

both synthetic peptides were remarkably similar as shownin Fig 1 The only significant difference was observed inthe intensity of the peak at mz 628306 In three series of 5LCndashMSMS runs the intensity of this peak in the spectra ofthreonine-containing peptide was relatively low repre-senting about 3 of the maximal peak intensity (see Figs 12 and Supplementary Tables 12) In the spectra ofisothreonine-containing peptide the same fragment yieldssignificantly higher intensity peak of 30 of the maximalpeak Relative intensity of the corresponding peak in theNCI-60 proteome mass-spectra was found to be about 18(Fig 2)

Mass spectra of the synthetic peptide by direct injection toOrbitrap MS with CID fragmentation were also compared andthe remarkable similarity was obtained with their HCDprofiles (Supplementary Fig 1) with a higher intensity of thepeak at mz 628306 in the spectra of isothreonine-containingpeptide

The peak at mz 628306 represents a b6-ion with asequence of NQVA(TisoT)N+ Interestingly the correspondingy10-ion at mz 992505 PTNTVFDAKR+ has similar intensityin all three datasets (Fig 2) These data show that underthe conditions used the NQVATN+ ion is significantly lessstable compared with its isothreonine-containing isomerThis difference between the isomers may be explained bydifference in hydrogen bond networks in threonine andisothreonine peptides It has been shown elsewhere thathydrogen bonding influenced fragmentation process in iontraps such as by CID and even could be used to study suchbonding inmolecules [34] We further usedmolecularmodelingof peptides to predict difference in preferable hydrogen bonding(Section 35)

Thus the spectra of the NCI-60 proteome data are moresimilar to the isothreonine-containing peptide therebysupporting the hypothesis of methionine conversion to thesaid amino acid However this evidence is not enough forunambiguous decision towards methionine to isothreonine

Fig 2 ndash Box-and-whiskers diagrams of relative intensity of 6283(right panel) in Orbitrap spectra of 57(NQVA(TisoT)NPTNTVFDAKcancer cell line data [20] synthetic NQVA(isoT)NPTNTVFDAK (Isoexperiments 12 exemplary spectra were taken from 8 cancer ceanalytical series were made each containing 5 LCndashMSMS runs opercentage of maximum intensity in each spectrum (Supplemen

conversion in shotgun data This observation should besupported by experimentation with model proteins (seeSection 37 below)

35 Molecular modeling to predict differential hydrogen bondnetworks likely responsible for differential stability of TisoTpeptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimatedifferential hydrogen bonding of hydroxyl moieties of threo-nine or isothreonine residues For ATN tripeptide 210 stableconformers were generated which have a range of internalenergy minus82 to minus38 kcalmol According to the calculationmore preferable conformers of this peptide have hydrogenbonding between the hydroxyl moiety of threonine andthe backbone oxygen atom of neighboring alanine residue(Fig 3A) In total there were 70 conformers with suchhydrogen bonding among all 210 structures In 25th and10th percentiles of conformers with minimal estimatedinternal energy ratios of such structures were 64 and 55respectively

For A(isoT)N tripeptide 179 stable conformers were gener-ated Overall an internal energy calculated for these structureswas higher than for ATN conformers minus279 to minus002 kcalmolPreferable conformations of this peptide contained hydrogenbonding between the hydroxyl moiety of isothreonine andthe backbone secondary amine of the same residue (Fig 3B)Such conformations reached 36 of total number of con-formers being inferior to structures where isothreoninehydroxyl moiety had no hydrogen bonding predicted (51)At the same time such a formation of internal hydrogenbond in isothreonine residue had benefit in energy In 25thand 10th percentiles of conformers with minimal estimatedinternal energy ratios of such conformers increased to 67and 89 respectively

Among the predicted structures of both peptides, only one state has similar hydrogen bonding of the hydroxyl moiety, namely its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A) and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).

Thus, the calculations have shown that, in the model tripeptides, the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that this difference causes the differential MS/MS spectra after CID or HCD fragmentation. However, defining generic principles of discrimination between threonine- and isothreonine-containing peptides using MS/MS requires further experimentation, which is out of the scope of the present work.

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In the further analysis, we studied the deep proteome data for the 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides (the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage) were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra showed that the abundance of the modified peptide was considerably lower than that of the wild-type peptide. The average intensity of the modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1 to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72) and their wild-type counterparts 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and +R72 peptide (57(M61)72) from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure that was routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35]. Using the various schemes, we aimed to estimate the possible role of heating in methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

Generally, the BSA sequence contains five methionine residues. Of them, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue and cleaved as a part of a dipeptide by trypsin. Met-208 belongs to a pentapeptide cleaved by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we could detect the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, methionine-to-isothreonine alternative peptides were detected for all of them (spectra exemplified in Fig. 5). Although without peptide standards we cannot confirm isothreonine versus threonine residues in them, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, where it affects 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, a level comparable to the ratio of methionine oxidation. Notably, of the three considered methionine residues of BSA, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protected it from attack by a mechanism yet unknown.

Overall, the experiment with BSA treatment was in good correspondence with the assumptions about methionine-to-isothreonine conversion stated above.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of the bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parental mass 1447.523 Da.

3.8. Global analysis of M(iso)T conversion in deep proteome data for NCI-60 cell lines

M(iso)T conversions were studied in the deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of methionine-containing peptides with M(iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if they had a threonine version of the sequence among the wild-type peptides in the database. The fraction of these peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.
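The summary figures quoted for Table 2 in Section 3.8 can be cross-checked with a few lines of arithmetic. The sketch below takes the counts from Table 2 and computes per-cell-line percentages; the per-cell-line medians come out close to the quoted values, but reading the quoted "averages" as medians is an inference from this arithmetic, not something stated in the text. The dictionary layout and function name are illustrative assumptions.

```python
from statistics import median

# Counts from Table 2: cell line -> (identified peptides, peptides with Met,
# peptides with Met > (iso)Thr, peptides with both forms identified).
TABLE_2 = {
    "COLON_COLO205": (52479, 6057, 289, 246),
    "RENAL_RXF393": (61878, 9741, 119, 88),
    "PROSTATE_PC3": (50203, 5213, 287, 241),
    "NSCLC_H460": (39792, 2862, 247, 201),
    "CNS_U251": (52353, 6083, 231, 184),
    "OVAR_SKOV3": (48110, 4000, 214, 171),
    "MELAN_M14": (55169, 4924, 347, 289),
    "BREAST_MCF7": (53451, 4647, 331, 279),
    "LEUK_CCRFCEM": (38894, 1348, 255, 144),
}


def summarize(table):
    """Return per-cell-line medians of the quantities discussed in Section 3.8."""
    rows = list(table.values())
    return {
        "identified_peptides": median(total for total, _, _, _ in rows),
        "met_percent": median(100 * met / total for total, met, _, _ in rows),
        "converted_percent": median(100 * conv / met for _, met, conv, _ in rows),
        "both_forms_percent": median(100 * both / conv for _, _, conv, both in rows),
    }


summary = summarize(TABLE_2)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```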

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23]. S1, S2 and S3 denote Samples 1, 2 and 3; "Total PSMs" is the total number of PSMs for the peptides of interest.

| Methionine position in BSA | Peptide sequence | Total PSMs, S1 | Total PSMs, S2 | Total PSMs, S3 | Methionine-oxidized peptides, PSMs (% total), S1 | Methionine-oxidized peptides, PSMs (% total), S2 | Methionine-oxidized peptides, PSMs (% total), S3 | Methionine-to-isothreonine converted peptides, PSMs (% total), S1 | Methionine-to-isothreonine converted peptides, PSMs (% total), S2 | Methionine-to-isothreonine converted peptides, PSMs (% total), S3 |
|---|---|---|---|---|---|---|---|---|---|---|
| M111 | ETYGDMADCCEK | 50 | 46 | 53 | 8 (16) | 14 (30) | 11 (21) | 4 (8) | 4 (9) | 10 (19) |
| M469 | MPCTEDYLSLILNR | 159 | 183 | 268 | 25 (16) | 34 (19) | 52 (19) | 6 (4) | 6 (3) | 98 (37) |
| M571 | TVMENFVAFVDK | 124 | 113 | 95 | 6 (5) | 3 (3) | 5 (5) | 3 (2) | 3 (3) | 5 (5) |

Completely cleaved tryptic peptides were detected predominantly. All data include the corresponding peptides with trypsin missed cleavages.
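The percentages in parentheses in Table 1 follow directly from the PSM counts. As a minimal sketch (the dictionary layout and function names are illustrative assumptions), the methionine-to-isothreonine percentages can be recomputed as follows:

```python
# PSM counts from Table 1, in the order Sample 1, Sample 2, Sample 3.
TOTAL_PSMS = {
    "M111": (50, 46, 53),
    "M469": (159, 183, 268),
    "M571": (124, 113, 95),
}
CONVERTED_PSMS = {
    "M111": (4, 4, 10),
    "M469": (6, 6, 98),
    "M571": (3, 3, 5),
}


def percent_of_total(count, total):
    """Percentage of PSMs, rounded to a whole number as in Table 1."""
    return round(100 * count / total)


def conversion_percentages():
    """Met-to-isoThr PSMs as a percentage of total PSMs, per site and sample."""
    return {
        site: tuple(
            percent_of_total(count, total)
            for count, total in zip(CONVERTED_PSMS[site], TOTAL_PSMS[site])
        )
        for site in TOTAL_PSMS
    }


for site, values in conversion_percentages().items():
    print(site, values)
```

The unheated sample (Sample 1) gives the 2 to 8% range quoted in the text, and the largest value, 37%, belongs to M469 in Sample 3.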


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using the database from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70 in which methionine in position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the genomes of the cell lines themselves, and strong evidence was reported that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed to an isothreonine residue produced from methionine in the cell lines.

After examination of the sample preparation protocol used in [20], we propose that alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study: 2 to 3% of the wild-type sequences were modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)Thr as a variable modification demonstrated that up to 10% of the methionine-containing peptides identified from the data had a corresponding Met to isoThr variant. We can now claim these variants to be isothreonine-containing due to the absence of genetic coding for them in the published genomes.

Due to the indirect nature of the evidence for Met to isoThr conversion in the open data produced by [20], we carried out an experiment that mimicked conventional conditions for shotgun analysis using bovine albumin as a model protein. This experiment clearly illustrated Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many somatically mutated proteins as possible, since these can affect molecular pathways and support the growth and survival of tumor cells. However, the number of known cancer-specific protein coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation which mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI-60 deep proteome cell lines [20] at 1% PSM FDR.

| Cell line | Identified peptides | Identified peptides with Met | Identified peptides with Met > Thr | Both Met > (iso)Thr and Met sequences identified |
|---|---|---|---|---|
| COLON_COLO205 | 52479 | 6057 | 289 | 246 |
| RENAL_RXF393 | 61878 | 9741 | 119 | 88 |
| PROSTATE_PC3 | 50203 | 5213 | 287 | 241 |
| NSCLC_H460 | 39792 | 2862 | 247 | 201 |
| CNS_U251 | 52353 | 6083 | 231 | 184 |
| OVAR_SKOV3 | 48110 | 4000 | 214 | 171 |
| MELAN_M14 | 55169 | 4924 | 347 | 289 |
| BREAST_MCF7 | 53451 | 4647 | 331 | 279 |
| LEUK_CCRFCEM | 38894 | 1348 | 255 | 144 |

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.


Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.


3. Results and discussion

3.1. Methionine-61 of HSC70 chaperone is changed in 8 of 9 NCI-60 cell lines

Out of all the NCI-60 cancer cell lines analyzed in [20], deep proteome data using prefractionation and HCD fragmentation were obtained for nine cell lines: MCF7 breast cancer, SK-OV-3 ovarian cancer, U251 malignant glioblastoma, RXF-393 renal cancer, COLO205 colon cancer, NCI-H460 lung cancer, M14 malignant melanoma, CCRF-CEM acute lymphocytic leukemia and PC-3 prostate cancer. These deep proteome data were searched against a sequence database containing additional sequences derived from colon cancer exome data [21] (the information about non-synonymous SNPs was extracted from the Supplementary Material provided by the authors). For each SNP, an additional protein sequence was generated by introducing the corresponding amino acid substitution. The resulting colon cancer database contained 127,486 protein sequences. The search was performed with the Mascot and Andromeda search engines using the separate FDR, or one-by-one, approach as generally described in [16,22].
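The database construction step described above (one additional protein sequence per non-synonymous SNP) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this work: the function names, the identifier scheme and the toy entry are hypothetical, and the tryptic peptide 57-71 stands in for a full protein sequence.

```python
def apply_substitution(protein_seq, position, ref, alt):
    """Return protein_seq with the residue at 1-based `position` changed from ref to alt."""
    index = position - 1
    if not 0 <= index < len(protein_seq):
        raise ValueError(f"position {position} is outside the sequence")
    if protein_seq[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {protein_seq[index]}"
        )
    return protein_seq[:index] + alt + protein_seq[index + 1:]


def build_variant_database(proteins, snps):
    """Extend a {protein_id: sequence} mapping with one entry per substitution.

    snps: iterable of (protein_id, position, ref, alt) tuples (hypothetical layout).
    """
    database = dict(proteins)  # keep the wild-type sequences
    for protein_id, position, ref, alt in snps:
        variant_id = f"{protein_id}_{ref}{position}{alt}"
        database[variant_id] = apply_substitution(
            proteins[protein_id], position, ref, alt
        )
    return database


# Toy example: the peptide NQVAMNPTNTVFDAK used as a stand-in "protein",
# so the methionine sits at position 5 here rather than at position 61.
toy_proteins = {"toy_peptide": "NQVAMNPTNTVFDAK"}
toy_database = build_variant_database(toy_proteins, [("toy_peptide", 5, "M", "T")])
print(toy_database)
```

Checking the reference residue before substituting guards against coordinate mismatches between the SNP table and the protein sequences.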

Note that in this part we used a protein sequence database which did not exactly correspond to the cell line genomes, even to that of the COLO205 line, which originated from colon cancer. It was shown in [27] that there were not many common point mutations between different patients, despite the fact that these mutations accumulated in common genes such as known proto-oncogenes and suppressors. Surprisingly, one of the mutant peptides was returned by the search engines with a high score in 8 out of the 9 cell lines analyzed, that is, in all but the renal cancer cell line. It was the peptide 57(NQVATNPTNTVFDAK)71, bearing an amino acid substitution of methionine to threonine in position 61 of the human heat shock cognate 71 kDa protein (HSC70, gene name HSPA8). Moreover, in 6 out of 9 cell lines the above modified sequence was also identified in the form of the peptide 57(NQVATNPTNTVFDAKR)72 with a missed cleavage.

3.2. Conversion of methionine-61 of HSC70 chaperone is not encoded by nucleic acid

Fig. 1 – Exemplary Orbitrap HCD spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein: (A) candidate peptide from shotgun cancer cell line data [20], (B) synthetic NQVA(isoT)NPTNTVFDAK peptide and (C) synthetic NQVATNPTNTVFDAK peptide. All spectra used in this work are shown in Supplementary Tables 1 and 2.

It is also interesting to note that the point mutation that may cause the M61T substitution in HSC70 is rare and was found in only one colon cancer patient of more than two hundred tumor patients studied in [21]. Soon after our finding, the exomes of all NCI-60 cell lines became available [27]. Contrary to our expectations, mutations involving replacement of the M residue by T in position 61 of HSC70 were not found in the exomes. The first possible explanation of the M61T peptide identifications in the proteome is contamination of the cell cultures by a foreign protein, e.g., one derived from the media or from an external cell line such as HeLa. However, after performing a protein BLAST search, we found no identical peptide in any living organism. Moreover, no human genome database contained any traces of the corresponding coding SNP, except the above-mentioned colon cancer exome in the COSMIC database [28]. RNA editing may be another explanation for the M61T replacement. In this case, the AUG codon should be converted to ACG, and such editing has not been described, as opposed to C-to-U deamination [29]. Thus, we are left with the only hypothesis: the conversion of methionine-61 of the HSC70 chaperone is not encoded genetically but occurs either post-translationally or during the sample preparation before LC–MS/MS analysis.
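The codon argument above can be checked by enumerating every single-base change of the methionine codon. The sketch below uses the DNA alphabet (ATG for AUG) and the four threonine codons of the standard genetic code; the function names are illustrative.

```python
THR_CODONS = {"ACA", "ACC", "ACG", "ACT"}  # threonine, standard genetic code
MET_CODON = "ATG"                          # methionine (AUG in RNA)


def single_base_changes(codon):
    """Yield (1-based position, old base, new base, mutated codon) for every point change."""
    for index, old_base in enumerate(codon):
        for new_base in "ACGT":
            if new_base != old_base:
                mutated = codon[:index] + new_base + codon[index + 1:]
                yield index + 1, old_base, new_base, mutated


def met_to_thr_changes():
    """Point changes of the Met codon that produce a Thr codon."""
    return [
        change for change in single_base_changes(MET_CODON)
        if change[3] in THR_CODONS
    ]


print(met_to_thr_changes())
```

The only hit is a T-to-C change at the second codon position (U-to-C at the RNA level), giving ACG. This is the reverse of the C-to-U deamination mentioned in the text.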

3.3. Methionine to isothreonine conversion during proteome sample processing

The non-genetic origin of the observed methionine conversion allows us to propose that methionine may be converted to isothreonine (homoserine) instead of threonine. Such a reaction is known from the oxidative treatment of protein methionine residues with cyanogen bromide. In some cases, instead of peptide bond cleavage and formation of a C-terminal homoserine lactone, the methionine residue may be converted to isothreonine [30]. In database searches, this transition is considered as a variable modification after cyanogen bromide treatment (see, e.g., www.matrixscience.com/help/enzyme_help.html for the Mascot search engine).

Side reactions during cysteine alkylation with iodoacetamide, used routinely in sample preparation for proteome analysis, may also be responsible for in vitro methionine modifications. As was shown before, methionine alkylation by iodoacetamide with formation of S-carbamidomethyl methionine and subsequent neutral loss of 2-(methylthio)acetamide during CID may mimic a neutral loss of phosphoric acid in phosphopeptides which contain phosphoserine (pSer) or phosphothreonine (pThr) [31].

Both iodoacetamide and iodoacetate may alkylate methionine at any pH [32]. The resulting sulfonium salts, S-carbamidomethyl methionine or S-carboxymethyl methionine, convert to homoserine and homoserine lactone at 100 °C and approximately neutral pH [33]. Similar conditions were used in sample preparation before electrophoresis for the cell line proteome datasets of interest [20]. Thus, some degree of methionine-to-isothreonine conversion should be expected.
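The mass consequence of this conversion can be worked out from elemental compositions. The sketch below assumes standard monoisotopic atomic masses and the residue formulas C5H9NOS for methionine and C4H7NO2 for homoserine (the same composition as threonine, which is why a database search cannot tell the two apart by mass). It compares the computed shift with the parental masses given in the caption of Fig. 5 (1477.516 and 1447.523 Da). Names are illustrative.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 6 decimals).
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "S": 31.972071}

MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}  # methionine residue
HSE_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}          # homoserine (isothreonine) or threonine residue


def formula_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def met_to_isothr_shift():
    """Mass change when a methionine residue is replaced by homoserine."""
    return formula_mass(HSE_RESIDUE) - formula_mass(MET_RESIDUE)


observed_shift = 1447.523 - 1477.516  # parental masses quoted for Fig. 5
print(f"calculated shift: {met_to_isothr_shift():.4f} Da")
print(f"observed shift:   {observed_shift:.3f} Da")
```

The calculated shift of about −29.99 Da agrees with the difference between the two BSA peptide masses to within a few mDa.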

3.4. Synthetic peptide analysis supporting methionine-61 to isothreonine amino acid conversion in HSC70

To reveal the origin of the methionine conversion in the peptide of interest, two synthetic analogs, 57(T61)71 and 57(isoT61)71 from HSC70, were analyzed by tandem mass spectrometry employing HCD fragmentation. The purpose of this analysis was to distinguish between the two peptides using their MS/MS profiles and to compare them to the corresponding MS/MS profiles observed in the cell lines.

The mass spectra attributed to the peptide of interest from the cell line shotgun data and the mass spectra from both synthetic peptides were remarkably similar, as shown in Fig. 1. The only significant difference was observed in the intensity of the peak at m/z 628.306. In three series of 5 LC–MS/MS runs, the intensity of this peak in the spectra of the threonine-containing peptide was relatively low, representing about 3% of the maximal peak intensity (see Figs. 1, 2 and Supplementary Tables 1, 2). In the spectra of the isothreonine-containing peptide, the same fragment yields a significantly more intense peak, about 30% of the maximal peak. The relative intensity of the corresponding peak in the NCI-60 proteome mass spectra was found to be about 18% (Fig. 2).

Mass spectra of the synthetic peptides obtained by direct injection into the Orbitrap MS with CID fragmentation were also compared, and a remarkable similarity to their HCD profiles was observed (Supplementary Fig. 1), with a higher intensity of the peak at m/z 628.306 in the spectra of the isothreonine-containing peptide.

The peak at m/z 628.306 represents a b6-ion with the sequence NQVA(T/isoT)N+. Interestingly, the corresponding y10-ion at m/z 992.505, PTNTVFDAKR+, has similar intensity in all three datasets (Fig. 2). These data show that, under the conditions used, the NQVATN+ ion is significantly less stable than its isothreonine-containing isomer. This difference between the isomers may be explained by a difference in the hydrogen bond networks of the threonine and isothreonine peptides. It has been shown elsewhere that hydrogen bonding influences the fragmentation process in ion traps, such as by CID, and could even be used to study such bonding in molecules [34]. We further used molecular modeling of the peptides to predict the difference in preferable hydrogen bonding (Section 3.5).
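The assignment of the m/z 628.306 peak to the b6-ion can be checked with a short calculation. The sketch below assumes standard monoisotopic residue masses and the usual convention that a singly charged b-ion has the summed residue masses plus one proton. The one-letter code "X" for isothreonine is a hypothetical placeholder introduced only for this example.

```python
PROTON = 1.007276  # proton mass, Da

# Monoisotopic residue masses (Da) for the residues occurring in the b6 fragment.
RESIDUE_MASS = {
    "N": 114.042927,
    "Q": 128.058578,
    "V": 99.068414,
    "A": 71.037114,
    "T": 101.047679,
}
# Isothreonine (homoserine) has the same elemental composition as threonine,
# so the placeholder "X" (an assumption of this sketch) gets the same mass.
RESIDUE_MASS["X"] = RESIDUE_MASS["T"]


def b_ion_mz(fragment, charge=1):
    """m/z of a b-ion: summed residue masses plus `charge` protons, divided by charge."""
    neutral = sum(RESIDUE_MASS[residue] for residue in fragment)
    return (neutral + charge * PROTON) / charge


print(f"b6 of NQVATN:      {b_ion_mz('NQVATN'):.3f}")
print(f"b6 of NQVA(isoT)N: {b_ion_mz('NQVAXN'):.3f}")
```

Both isomers give the same calculated m/z, within about 1 mDa of the 628.306 value quoted in the text, so only the peak intensity, not its position, can distinguish threonine from isothreonine.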

Thus, the spectra of the NCI-60 proteome data are more similar to those of the isothreonine-containing peptide, thereby supporting the hypothesis of methionine conversion to the said amino acid. However, this evidence is not enough for an unambiguous decision in favor of methionine to isothreonine conversion in the shotgun data. This observation should be supported by experimentation with model proteins (see Section 3.7).

Fig. 2 – Box-and-whiskers diagrams of the relative intensity of the 628.306 Da b6-ion peak (left panel) and the 992.505 Da y10-ion peak (right panel) in Orbitrap spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein from shotgun cancer cell line data [20] (Shotgun), and of the synthetic NQVA(isoT)NPTNTVFDAK (IsoThr) and NQVATNPTNTVFDAK (Thr) peptides. For shotgun experiments, 12 exemplary spectra were taken from the 8 cancer cell line deep proteome data [20]. For each synthetic peptide, 3 analytical series were made, each containing 5 LC–MS/MS runs of the same (N = 15). Relative intensity is calculated as a percentage of maximum intensity in each spectrum (Supplementary Tables 1 and 2).

35 Molecular modeling to predict differential hydrogen bondnetworks likely responsible for differential stability of TisoTpeptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimatedifferential hydrogen bonding of hydroxyl moieties of threo-nine or isothreonine residues For ATN tripeptide 210 stableconformers were generated which have a range of internalenergy minus82 to minus38 kcalmol According to the calculationmore preferable conformers of this peptide have hydrogenbonding between the hydroxyl moiety of threonine andthe backbone oxygen atom of neighboring alanine residue(Fig 3A) In total there were 70 conformers with suchhydrogen bonding among all 210 structures In 25th and10th percentiles of conformers with minimal estimatedinternal energy ratios of such structures were 64 and 55respectively

For A(isoT)N tripeptide 179 stable conformers were gener-ated Overall an internal energy calculated for these structureswas higher than for ATN conformers minus279 to minus002 kcalmolPreferable conformations of this peptide contained hydrogenbonding between the hydroxyl moiety of isothreonine andthe backbone secondary amine of the same residue (Fig 3B)Such conformations reached 36 of total number of con-formers being inferior to structures where isothreoninehydroxyl moiety had no hydrogen bonding predicted (51)At the same time such a formation of internal hydrogenbond in isothreonine residue had benefit in energy In 25thand 10th percentiles of conformers with minimal estimatedinternal energy ratios of such conformers increased to 67and 89 respectively

Among predicted structures of both peptides the only statehas similar hydrogen bonding of hydroxyl moiety namely

06 Da b6-ion peak (left panel) and 992505 Da y10-ion peak)71 peptide of human HSC70 protein (Shotgun) from shotgunThr) and NQVATNPTNTVFDAK peptides (Thr) For shotgunll line deep proteome data [20] For each synthetic peptide 3f the same (N = 15) Relative intensity is calculated as atary Tables 1 and 2)

Fig 3 ndash Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy Hydrogennitrogen and oxygen atoms are shown in light blue deep blue and red respectively Hydrogen bonds are depicted as dashedyellow lines N-terminal alanine residue is to the left in each panel Note hydrogen bonding between the hydroxyl moiety ofthreonine and the backbone oxygen atom of alanine residue (A) and between the hydroxyl moiety of isothreonine and thebackbone secondary amine of the same residue (B)

175J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

its bond with the backbone secondary amine of asparagineHowever such a structure formed in 16 and 8 of all cases inthreonine and isothreonine peptide respectively All possiblepredicted hydrogen bonds in tripeptides of interest are listed inSupplementary Table 3

Thus the calculationshave shown that inmodel tripeptideshydrogen bonding networks of threonine and isothreonineweredramatically different It may be further hypothesized that suchdifference causes differential MSMS spectra after CID or HCDfragmentation However to define generic principles of discrim-ination between threonine and isothreonine-containing pep-tides using MSMS further experimentation is needed which isout of the scope of the present work

36 Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modificationin cancer cell line shotgun data

In the further analysis we studied deep proteome data for 9cell line to find whether the wild-type 57(M61)71 peptide of

Fig 4 ndash Label-free quantitation data of 57(NQVA(isoT)NPTNTVFDAmissed cleavage (57(isoT61)72) and their wild-type counterparts(57(M61)71) from shotgun deep proteome data of 8 NCI-60 cancer(wwwmaxquantorg [23]) Relative intensities are shown in log s

HSC70 was present in those cell lines Indeed in all sampleswhere M-to-isoT peptide was found the wild-type peptides57-71 tryptic peptide and 57-72 peptide with missed cleavagewere also observed Label-free quantitative analysis of relativecontent of both variants in the spectra has shown that theabundance of the modified peptide was considerably lowerthan the abundance of the wild-type peptide The averageintensity of modified M-to-isoT peptides amounted to only23 (standard deviation 13 CI 11 to 43) of such intensityfor the wild-type peptide (Fig 4)

37 Bovine serum albumin treatment as a model of samplepreparation for shotgun proteomics to confirm methionine toisothreonine conversion

Samples of bovine serum albumin (BSA) were prepared totest three different reduction and alkylation schemes Thealbumin was treated by sodium deoxycholate to mimic aprocedure that was routinely used in our lab to extractproteins from cells before shotgun proteome analysis [35]

K)71 variant peptide (57(isoT61)71) its +R72 peptide with one 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and +R72 peptidecell lines [20] Data were obtained using MaxQuant softwarecale

176 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

Using these schemes, we aimed to estimate the possible role of heating in methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

The BSA sequence contains five methionine residues. Of these, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue, so trypsin would release it as part of a dipeptide. Met-208 belongs to a pentapeptide released by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we detected the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, methionine-to-isothreonine alternative peptides were detected for all of them (spectra exemplified in Fig. 5). Although we cannot distinguish isothreonine from threonine residues in these peptides without peptide standards, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, when it affects 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, a level comparable to that of methionine oxidation. Notably, of the three methionine residues considered, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protects it from attack by a mechanism as yet unknown.

Overall, the BSA experiment was in good agreement with the assumptions about methionine-to-isothreonine conversion stated above.

3.8. Global analysis of M(iso)T conversion in deep proteome data for NCI-60 cell lines

M(iso)T conversions were studied in deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of peptides with M(iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if a threonine version of the sequence was present among the wild-type peptides in the database. The fraction of such peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of the bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parental mass 1447.523 Da.
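The two parental masses quoted in the Fig. 5 caption can be checked from standard monoisotopic residue masses. The sketch below is illustrative only: the function name `peptide_mass`, the lowercase `t` used as a stand-in symbol for isothreonine, and the assumption that both cysteines carry the carbamidomethyl group added by IAA are ours and are not spelled out in this form in the text.

```python
# Monoisotopic residue masses (Da) for the residues of ETYGDMADCCEK.
RESIDUE_MASS = {
    "E": 129.04259,
    "T": 101.04768,
    "Y": 163.06333,
    "G": 57.02146,
    "D": 115.02694,
    "M": 131.04049,
    "A": 71.03711,
    "K": 128.09496,
    # Assumption: cysteine is carbamidomethylated (+57.02146 Da) by IAA.
    "C": 103.00919 + 57.02146,
    # Hypothetical symbol: 't' stands for isothreonine (homoserine),
    # which is isomeric with threonine and has the same residue mass.
    "t": 101.04768,
}
WATER = 18.01056  # one water per linear peptide


def peptide_mass(sequence: str) -> float:
    """Return the neutral monoisotopic mass of a peptide in Da."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


intact = peptide_mass("ETYGDMADCCEK")
converted = peptide_mass("ETYGDtADCCEK")
print(f"intact:    {intact:.3f} Da")
print(f"converted: {converted:.3f} Da")
print(f"shift:     {converted - intact:.3f} Da")
```

Under these assumptions the two computed masses agree with the caption values to within 0.001 Da, and their difference of about 29.993 Da is the mass shift that a Met to (iso)Thr variable modification has to encode in a database search.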

Table 1 – Spectral counts for methionine-containing peptides of the bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three BSA samples were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23]. S1 to S3 denote samples 1 to 3; values in parentheses are percentages of the total number of PSMs.

| Methionine position in BSA | Peptide sequence | Total PSMs, S1 | Total PSMs, S2 | Total PSMs, S3 | Met-oxidized PSMs (% total), S1 | Met-oxidized PSMs (% total), S2 | Met-oxidized PSMs (% total), S3 | Met-to-isoThr PSMs (% total), S1 | Met-to-isoThr PSMs (% total), S2 | Met-to-isoThr PSMs (% total), S3 |
|---|---|---|---|---|---|---|---|---|---|---|
| M111 | ETYGDMADCCEK | 50 | 46 | 53 | 8 (16) | 14 (30) | 11 (21) | 4 (8) | 4 (9) | 10 (19) |
| M469 | MPCTEDYLSLILNR | 159 | 183 | 268 | 25 (16) | 34 (19) | 52 (19) | 6 (4) | 6 (3) | 98 (37) |
| M571 | TVMENFVAFVDK | 124 | 113 | 95 | 6 (5) | 3 (3) | 5 (5) | 3 (2) | 3 (3) | 5 (5) |

Completely cleaved tryptic peptides were detected predominantly. All data include the corresponding peptides with trypsin missed cleavages.
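The percentages in Table 1 follow directly from the spectral counts. As a minimal sketch, the snippet below recomputes the Met-to-isoThr percentages as converted PSMs divided by total PSMs; the dictionary layout and the names `TABLE1`, `percent` and `conversion` are hypothetical and chosen only for illustration.

```python
# Spectral counts from Table 1.
# Each tuple is (total PSMs, Met-oxidized PSMs, Met-to-isoThr PSMs)
# for samples 1, 2 and 3, in that order.
TABLE1 = {
    "M111": [(50, 8, 4), (46, 14, 4), (53, 11, 10)],
    "M469": [(159, 25, 6), (183, 34, 6), (268, 52, 98)],
    "M571": [(124, 6, 3), (113, 3, 3), (95, 5, 5)],
}


def percent(part: int, total: int) -> int:
    """Share of `part` in `total`, rounded to a whole percent."""
    return round(100 * part / total)


# Met-to-isoThr conversion per methionine position and sample.
conversion = {
    position: [percent(converted, total) for total, _, converted in rows]
    for position, rows in TABLE1.items()
}

for position, values in conversion.items():
    print(position, values)
```

The unheated sample 1 gives 2 to 8% conversion across the three positions, and the maximum of 37% comes from M469 in heated sample 3, matching the ranges quoted in the text.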


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines against a database derived from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70 in which the methionine at position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the genomes of these cell lines, and strong evidence was reported that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data for synthetic peptides with threonine or isothreonine (homoserine) residues pointed to an isothreonine residue produced from methionine in the cell lines.

After examining the sample preparation protocol used in [20], we propose that alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for this reaction in the datasets under study: 2 to 3% of wild-type sequences were modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)Thr as a variable modification demonstrated that up to 10% of the methionine-containing peptides identified from the data had a corresponding Met to isoThr variant. We can now claim that these variants are isothreonine-containing because of the absence of genetic coding for them in the published genomes.

Because the evidence for Met to isoThr conversion in the open data produced by [20] was indirect, we performed an experiment that mimicked conventional conditions for shotgun analysis, using bovine albumin as a model protein. This experiment clearly illustrated Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many somatically mutated proteins as possible, since they can affect molecular pathways and promote the growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of a proteome data search. In this report, we demonstrated that an artifact of sample preparation that mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI-60 deep proteome cell lines [20] at 1% PSM FDR.

| Cell line | Identified peptides | Identified peptides with Met | Identified peptides with Met > Thr | Both Met > (iso)Thr and Met sequences identified |
|---|---|---|---|---|
| COLON_COLO205 | 52479 | 6057 | 289 | 246 |
| RENAL_RXF393 | 61878 | 9741 | 119 | 88 |
| PROSTATE_PC3 | 50203 | 5213 | 287 | 241 |
| NSCLC_H460 | 39792 | 2862 | 247 | 201 |
| CNS_U251 | 52353 | 6083 | 231 | 184 |
| OVAR_SKOV3 | 48110 | 4000 | 214 | 171 |
| MELAN_M14 | 55169 | 4924 | 347 | 289 |
| BREAST_MCF7 | 53451 | 4647 | 331 | 279 |
| LEUK_CCRFCEM | 38894 | 1348 | 255 | 144 |
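The summary figures quoted in Section 3.8 can be related to the per-cell-line counts of Table 2 with a short calculation. The sketch below rests on two assumptions of ours: the assignment of the flattened Met > Thr counts to cell lines in row order, and the use of the median as the "average". The text does not say which statistic was used; the median is chosen here because, under the first assumption, it reproduces the quoted values. The names `TABLE2` and `pct` are hypothetical.

```python
from statistics import median

# Per cell line: (identified peptides, peptides with Met,
#                 peptides with Met > (iso)Thr, both forms identified).
# Assumption: the last two columns are paired with cell lines in row order.
TABLE2 = {
    "COLON_COLO205": (52479, 6057, 289, 246),
    "RENAL_RXF393": (61878, 9741, 119, 88),
    "PROSTATE_PC3": (50203, 5213, 287, 241),
    "NSCLC_H460": (39792, 2862, 247, 201),
    "CNS_U251": (52353, 6083, 231, 184),
    "OVAR_SKOV3": (48110, 4000, 214, 171),
    "MELAN_M14": (55169, 4924, 347, 289),
    "BREAST_MCF7": (53451, 4647, 331, 279),
    "LEUK_CCRFCEM": (38894, 1348, 255, 144),
}


def pct(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


rows = list(TABLE2.values())

# Share of identified peptides that contain methionine.
met_share = median(pct(met, total) for total, met, _, _ in rows)
# Share of Met-containing peptides with a Met > (iso)Thr counterpart.
conv_share = median(pct(conv, met) for _, met, conv, _ in rows)
# Share of converted peptides whose unmodified form was also identified.
both_share = median(pct(both, conv) for _, _, conv, both in rows)

print(f"Met-containing peptides: {met_share:.1f}%")
print(f"Met > (iso)Thr peptides: {conv_share:.1f}%")
print(f"both forms identified:   {both_share:.1f}%")
```

With these assumptions the three medians round to about 9%, 5.5% and 81%, in line with the values given in the text. Pooling all nine cell lines instead of taking medians gives slightly different numbers, so the choice of statistic matters when comparing against the published figures.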


Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods. I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.


both synthetic peptides were remarkably similar, as shown in Fig. 1. The only significant difference was observed in the intensity of the peak at m/z 628.306. In three series of 5 LC–MS/MS runs, the intensity of this peak in the spectra of the threonine-containing peptide was relatively low, representing about 3% of the maximal peak intensity (see Figs. 1 and 2 and Supplementary Tables 1 and 2). In the spectra of the isothreonine-containing peptide, the same fragment yielded a significantly more intense peak, about 30% of the maximal peak. The relative intensity of the corresponding peak in the NCI-60 proteome mass spectra was about 18% (Fig. 2).

Mass spectra of the synthetic peptides obtained by direct injection into the Orbitrap MS with CID fragmentation were also compared, and they were remarkably similar to the HCD profiles (Supplementary Fig. 1), with a higher intensity of the peak at m/z 628.306 in the spectra of the isothreonine-containing peptide.

The peak at m/z 628.306 represents a b6-ion with the sequence NQVA(T/isoT)N+. Interestingly, the corresponding y10-ion at m/z 992.505, PTNTVFDAKR+, has similar intensity in all three datasets (Fig. 2). These data show that, under the conditions used, the NQVATN+ ion is significantly less stable than its isothreonine-containing isomer. This difference between the isomers may be explained by a difference in the hydrogen bond networks of the threonine and isothreonine peptides. It has been shown elsewhere that hydrogen bonding influences the fragmentation process in ion traps, such as by CID, and can even be used to study such bonding in molecules [34]. We further used molecular modeling of the peptides to predict differences in preferred hydrogen bonding (Section 3.5).
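The position of the b6-ion discussed above can be reproduced from standard monoisotopic residue masses. The snippet is a minimal sketch, not the authors' procedure: the function `b_ion_mz` and the letter `X` used as a placeholder for isothreonine are hypothetical, and only singly charged b-ions are handled.

```python
# Monoisotopic residue masses (Da) for the residues of the b6 fragment.
RESIDUE_MASS = {
    "N": 114.04293,
    "Q": 128.05858,
    "V": 99.06841,
    "A": 71.03711,
    "T": 101.04768,
    # Hypothetical symbol: 'X' stands for isothreonine (homoserine),
    # an isomer of threonine with an identical residue mass.
    "X": 101.04768,
}
PROTON = 1.00728


def b_ion_mz(fragment: str) -> float:
    """m/z of a singly charged b-ion: sum of residue masses plus a proton."""
    return sum(RESIDUE_MASS[residue] for residue in fragment) + PROTON


thr_b6 = b_ion_mz("NQVATN")
isothr_b6 = b_ion_mz("NQVAXN")
print(f"b6 (Thr):   {thr_b6:.3f}")
print(f"b6 (isoThr): {isothr_b6:.3f}")
```

Both isomers give the same calculated value, within a few mDa of the reported m/z 628.306, which illustrates why the two variants cannot be separated by fragment mass and why the text relies on the relative intensity of this peak instead.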

Thus, the spectra of the NCI-60 proteome data are more similar to those of the isothreonine-containing peptide, thereby supporting the hypothesis of methionine conversion to this amino acid. However, this evidence is not sufficient for an unambiguous decision in favor of methionine to isothreonine conversion in shotgun data. This observation should be supported by experimentation with model proteins (see Section 3.7 below).

Fig. 2 – Box-and-whiskers diagrams of the relative intensity of the 628.306 Da b6-ion peak (left panel) and the 992.505 Da y10-ion peak (right panel) in Orbitrap spectra of the 57(NQVA(T/isoT)NPTNTVFDAK)71 peptide of human HSC70 protein (Shotgun) from shotgun cancer cell line data [20], and of the synthetic NQVA(isoT)NPTNTVFDAK (IsoThr) and NQVATNPTNTVFDAK (Thr) peptides. For shotgun experiments, 12 exemplary spectra were taken from 8 cancer cell line deep proteome data [20]. For each synthetic peptide, 3 analytical series were made, each containing 5 LC–MS/MS runs of the same (N = 15). Relative intensity is calculated as a percentage of maximum intensity in each spectrum (Supplementary Tables 1 and 2).

3.5. Molecular modeling to predict differential hydrogen bond networks likely responsible for differential stability of the T/isoT peptide fragment during HCD and CID fragmentation

Tripeptides ATN and A(isoT)N were modeled to estimate differential hydrogen bonding of the hydroxyl moieties of threonine and isothreonine residues. For the ATN tripeptide, 210 stable conformers were generated, with internal energies ranging from −8.2 to −3.8 kcal/mol. According to the calculation, the more preferable conformers of this peptide have a hydrogen bond between the hydroxyl moiety of threonine and the backbone oxygen atom of the neighboring alanine residue (Fig. 3A). In total, 70 of the 210 structures had such hydrogen bonding. In the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such structures were 64% and 55%, respectively.

For the A(isoT)N tripeptide, 179 stable conformers were generated. Overall, the internal energy calculated for these structures was higher than for the ATN conformers: −2.79 to −0.02 kcal/mol. Preferable conformations of this peptide contained a hydrogen bond between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (Fig. 3B). Such conformations accounted for 36% of all conformers, fewer than the structures in which the isothreonine hydroxyl moiety had no predicted hydrogen bonding (51%). At the same time, formation of this internal hydrogen bond in the isothreonine residue was energetically favorable: in the 25th and 10th percentiles of conformers with minimal estimated internal energy, the proportions of such conformers increased to 67% and 89%, respectively.

Among the predicted structures of both peptides, only one state has similar hydrogen bonding of the hydroxyl moiety, namely its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note the hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A) and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).

Thus, the calculations have shown that, in the model tripeptides, the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that this difference causes the differential MS/MS spectra after CID or HCD fragmentation. However, defining generic principles for discriminating between threonine- and isothreonine-containing peptides using MS/MS requires further experimentation, which is beyond the scope of the present work.

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In further analysis, we studied deep proteome data for the 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides (the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage) were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra showed that the abundance of the modified peptide was considerably lower than that of the wild-type peptide. The average intensity of the modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1 to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72), and their wild-type counterparts 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and +R72 peptide (57(M61)72) from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35].

176 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

Using various schemes we would like to estimate possible roleof heating during methionine modification by iodoacetamide(IAA) BSA sample 1 was treated by dithiotreitol (DTT) andIAA at ambient temperature without heating samples 2 and3 were heated to 95 degC before and during IAA treatment

Generally BSA sequence contains five methionine resi-dues Of themMet-1 is cleaved during natural processing andin addition is followed by arginine residue and cleaved as apart of dipeptide by trypsin Met-208 belongs to pentapeptidecleaved by trypsin and has low chance to be detected byshotgun analysis In accordance to these assumptions wecould detect the rest of three methionine-containing BSApeptides in all three samples analyzed As expected for all ofthem methionine-to-isothreonine alternative peptides weredetected (spectra exemplified at Fig 5) Despite withoutpeptide standards we cannot confirm isothreonine vsthreonine residues in them the only possible source of theirgeneration in these samples is IAA treatment Table 1summarizes spectral counts for all peptides of interestincluding intact methionine-containing peptides as well asmethionine-oxidized and methionine-to-isothreonine con-verted species The data on methionine modification inalbumin yields some interesting observations First Met-to-isoThr conversion by IAA is a common event even withoutheating when it concerns 2 to 8 of all methionine residuesSecond the short heating to 95 degC sometimes used insample preparation can increase the ratio of Met-to-isoThr

Fig 5 ndash Differences in MSMS spectra between the intact and mepeptide 106(ETYGDMADCCEK)117 (A) Part of the MSMS spectru106(ETYGDMADCCEK)117 with parental mass 1477516 Da (B) Paalbumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parent

conversion up to 37 of all methionines the level beingcomparable to the ratio of methionine oxidation Notably ofthree considered methionine residues of BSA Met-571 wasless prone both to oxidation by oxygen and the modificationto isothreonine by IAA It could be caused by the chemicalenvironment of this residue which protected it from attack bymechanism yet unknown

Overall the experiment with BSA treatment was in thegood corresponding with the assumptions about methionine-to-isothreonine conversion stated above

38 Global analysis of М(iso)Т conversion in deep proteomedata for NCI-60 cell line

M(iso)T conversions were studied in deep proteome data of 9cell lines The results are presented in Table 2 On average52000 peptides were identified and 9 of these peptidescontained methionine The average percentage of peptideswith M(iso)T conversion was 55 and both non-modifiedand modified versions were identified for 81 of themPeptides with this modification were not considered if theyhad a threonine version of sequence among wild-type peptidesin the database The fraction of these peptides was near 1(data not shown) Thus under experimental conditions used in[20] methionine to threonine conversion was a frequent eventand could potentially influence the results of proteogenomicanalysis

thionine-converted forms of bovine serum albumin trypticm of the unmodified bovine serum albumin tryptic peptidert of the MSMS spectrum of the modified bovine serumal mass 1447523 Da

Table 1 ndash Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest including intactmethionine-oxidized and methionine-to-isothreonine converted species All three samples of BSA are treated bydithiotreitol and iodoacetamide trypsin digested Sample 1 was treated at ambient temperature without heating sample2 and 3 were heated to 95 degC before and during IAA treatment Data are obtained by MaxQuant [23] free software

Methi-oninepositionin BSA

Peptidesequence

Total number ofPSMs for peptides

of interest

Methionine-oxidizedpeptides PSMs

( total)

Methionine-to-isothreonineconverted peptides

PSMs ( total)

Sample 1 Sample 2 Sample 3 Sample 1 Sample 2 Sample 3 Sample 1 Sample 2 Sample 3

M111 ETYGDMADCCEK 50 46 53 8 (16) 14 (30) 11 (21) 4 (8) 4 (9) 10 (19)M469 MPCTEDYLSLILNR 159 183 268 25 (16) 34 (19) 52 (19) 6 (4) 6 (3) 98 (37)M571 TVMENFVAFVDK 124 113 95 6 (5) 3 (3) 5 (5) 3 (2) 3 (3) 5 (5)

Completely cleaved trypsin peptides were detected predominantly All data include corresponding peptides with trypsin missed cleavages

177J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

4 Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell linesusing the database from cancer genomes returned high-confident identification of a tryptic peptide of the abundantmolecular chaperone HSC70 in which methionine in theposition 61was converted to threonine or isothreonine residuesNo traces of such genetic alteration were found in the own cellline genomes and strong evidencewas reported that this changeisnot encodedby thenucleic acidMoreover the LCndashMSMSdataof the synthetic peptides with threonine or isothreonine(homoserine) residues pointed at the isothreonine residueproduced frommethionine in the cell lines

After examination of the sample preparation protocol usedin the paper [20] we propose that the alkylation of methionineresidue by iodacetamide followed by homoserine formationafter sample boiling is responsible for the effect We foundsignificant yield for the reaction in the datasets under studyThere were 2 to 3 of wild-type sequences modified in theHSC70 protein

Global analysis of spectra for the cancer cell lines usingMet to (iso)-Thr as a variablemodificationdemonstrated thatupto 10 of methionine-containing peptides identified from thedata had their corresponding Met to iso-Thr variant Now wecan claim these variants to be isothreonine-containing dueto the absence of genetic coding for them in the publishedgenomes

Table 2 ndash The number of total identified peptides peptides withthe 9 NCI60 deep proteome cell lines [20] at 1 PSM FDR

Cell line Identifiedpeptides

Identified peptideswith Met

COLON_COLO205 52479 6057RENAL_RXF393 61878 9741PROSTATE_PC3 50203 5213NSCLC_H460 39792 2862CNS_U251 52353 6083OVAR_SKOV3 48110 4000MELAN_M14 55169 4924BREAST_MCF7 53451 4647LEUK_CCRFCEM 38894 1348

Due to indirect evidence of the Met to iso-Thr conversionin open data produced by [20] we have managed anexperiment that mimicked conventional conditions for shot-gun analysis using the bovine albumin as a model proteinThis experiment perfectly illustrated a fact of Met to isoThrconversion with yield up to 37 when the sample was heated

Cancer proteogenomics is an emerging field pushed bygrowing availability of tumor genome sequencing It is espe-cially important to detect as many as possible somaticallymutated proteins which can affect molecular pathways andprovide growth and survival of tumor cells However thenumber of known cancer specific protein coding variants is solarge that false positive variant peptide identifications are themost probable outcome of the proteome data search In thisreport we demonstrated that an artifact of sample preparationwhich mimics the Met to Thr nucleic acid-encoded variant isone of the sources of false positive hits during peptideidentification Methionine to isothreonine conversion should betaken into account in the corresponding workflows especially ifthey involve a sample heating after iodoacetamide treatment

Supplementary data to this article can be found online athttpdxdoiorg101016jjprot201503003

Transparency document

The Transparency document associated with this article canbe found in the online version

methionine and peptides withMet gt (iso)Thr conversion for

Identifiedpeptides with Met gt Thr

Both Met gt (iso)Thr andMet sequences identified

289 246119 88287 241247 201231 184214 171347 289331 279255 144

178 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

Acknowledgments

The work was supported by the Russian Scientific Fund grant14-15-00395

R E F E R E N C E S

[1] Makarov A Electrostatic axially harmonic orbital trapping ahigh-performance technique of mass analysis Anal Chem200072(6)1156ndash62

[2] PirmoradianM BudamguntaH Chingin K Zhang B et al Rapidand deep human proteome analysis by single-dimensionshotgun proteomics Mol Cell Proteomics 2013123330ndash8

[3] Nagaraj N Kulak NA Cox J Neuhauser N Mayr K Hoerning Oet al System-wide perturbation analysis with nearlycomplete coverage of the yeast proteome by single-shot ultraHPLC runs on a bench top Orbitrap Mol Cell Proteomics 201211(3) [M111013722]

[4] Hebert AS Richards AL Bailey DJ Ulbrich A Coughlin EEWestphall MS et al The one hour yeast proteome Mol CellProteomics 201413(1)339ndash47

[5] Nesvizhskii AI Protein identification by tandem massspectrometry and sequence database searching MethodsMol Biol 200736787ndash119

[6] Shteynberg D Nesvizhskii AI Moritz RL Deutsch EWCombining results of multiple search engines in proteomicsMol Cell Proteomics 2013122383ndash93

[7] Jeong K Kim S Pevzner PA UniNovo a universal tool for denovo peptide sequencing Bioinformatics 2013291953ndash62

[8] Pan C Park BH McDonald WH Carey PA Banfield JFVerBerkmoes NC et al A high-throughput de novo sequencingapproach for shotgun proteomics using high-resolutiontandemmass spectrometry BMC Bioinformatics 201011118

[9] Chi H Chen H He K Wu L Yang B Sun RX et al pNovo+ denovo peptide sequencing using complementary HCD and ETDtandem mass spectra J Proteome Res 201312(2)615ndash25

[10] Allmer J Algorithms for the de novo sequencing of peptidesfrom tandem mass spectra Expert Rev Proteomics 20118(5)645ndash57

[11] Hughes C Ma B Lajoie GA De novo sequencing methods inproteomics Methods Mol Biol 2010604105ndash21

[12] Bunger MK Cargile BJ Sevinsky JR Deyanova E Yates NAHendrickson RC et al Detection and validation ofnon-synonymous coding SNPs from orthogonal analysis ofshotgun proteomics data J Proteome Res 200762331ndash40

[13] Menon R Im H Zhang EY Wu SL Chen R Snyder M et alDistinct splice variants and pathway enrichment in thecell-line models of aggressive human breast cancersubtypes J Proteome Res 201413(1)212ndash27

[14] Nesvizhskii AI Proteogenomics concepts applications andcomputational strategies Nat Methods 201411(11)1114ndash25

[15] Helmy M Sugiyama N Tomita M Ishihama YOnco-proteogenomics a novel approach to identifycancer-specific mutations combining proteomics andtranscriptome deep sequencing Genome Biol 201011(Suppl 1)17


Fig. 3 – Exemplary models of Ala-Thr-Asn (A) and Ala-isoThr-Asn (B) tripeptides with minimal internal energy. Hydrogen, nitrogen and oxygen atoms are shown in light blue, deep blue and red, respectively. Hydrogen bonds are depicted as dashed yellow lines. The N-terminal alanine residue is to the left in each panel. Note hydrogen bonding between the hydroxyl moiety of threonine and the backbone oxygen atom of the alanine residue (A) and between the hydroxyl moiety of isothreonine and the backbone secondary amine of the same residue (B).


its bond with the backbone secondary amine of asparagine. However, such a structure formed in 16% and 8% of all cases in the threonine and isothreonine peptides, respectively. All possible predicted hydrogen bonds in the tripeptides of interest are listed in Supplementary Table 3.

Thus, the calculations have shown that in model tripeptides the hydrogen bonding networks of threonine and isothreonine were dramatically different. It may be further hypothesized that such a difference causes differential MS/MS spectra after CID or HCD fragmentation. However, to define generic principles of discrimination between threonine- and isothreonine-containing peptides using MS/MS, further experimentation is needed, which is out of the scope of the present work.

3.6. Quantitative analysis of genetically encoded 57(NQVAMNPTNTVFDAK)71 and its M61(isoT) modification in cancer cell line shotgun data

In the further analysis, we studied deep proteome data for 9 cell lines to find whether the wild-type 57(M61)71 peptide of HSC70 was present in those cell lines. Indeed, in all samples where the M-to-isoT peptide was found, the wild-type peptides, namely the 57-71 tryptic peptide and the 57-72 peptide with a missed cleavage, were also observed. Label-free quantitative analysis of the relative content of both variants in the spectra has shown that the abundance of the modified peptide was considerably lower than the abundance of the wild-type peptide. The average intensity of modified M-to-isoT peptides amounted to only 2.3% (standard deviation 1.3%, CI 1.1% to 4.3%) of the intensity of the wild-type peptide (Fig. 4).

Fig. 4 – Label-free quantitation data of the 57(NQVA(isoT)NPTNTVFDAK)71 variant peptide (57(isoT61)71), its +R72 peptide with one missed cleavage (57(isoT61)72) and their wild-type counterparts, 57(NQVAMNPTNTVFDAK)71 (57(M61)71) and the +R72 peptide (57(M61)72), from shotgun deep proteome data of 8 NCI-60 cancer cell lines [20]. Data were obtained using MaxQuant software (www.maxquant.org [23]). Relative intensities are shown in log scale.

3.7. Bovine serum albumin treatment as a model of sample preparation for shotgun proteomics to confirm methionine to isothreonine conversion

Samples of bovine serum albumin (BSA) were prepared to test three different reduction and alkylation schemes. The albumin was treated with sodium deoxycholate to mimic a procedure that was routinely used in our lab to extract proteins from cells before shotgun proteome analysis [35]. Using various schemes, we aimed to estimate the possible role of heating during methionine modification by iodoacetamide (IAA). BSA sample 1 was treated with dithiothreitol (DTT) and IAA at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment.

Generally, the BSA sequence contains five methionine residues. Of them, Met-1 is cleaved during natural processing and, in addition, is followed by an arginine residue and cleaved as a part of a dipeptide by trypsin. Met-208 belongs to a pentapeptide cleaved by trypsin and has a low chance of being detected by shotgun analysis. In accordance with these assumptions, we could detect the remaining three methionine-containing BSA peptides in all three samples analyzed. As expected, methionine-to-isothreonine alternative peptides were detected for all of them (spectra exemplified in Fig. 5). Although we cannot confirm isothreonine vs. threonine residues in them without peptide standards, the only possible source of their generation in these samples is IAA treatment. Table 1 summarizes spectral counts for all peptides of interest, including intact methionine-containing peptides as well as methionine-oxidized and methionine-to-isothreonine converted species. The data on methionine modification in albumin yield some interesting observations. First, Met-to-isoThr conversion by IAA is a common event even without heating, when it concerns 2 to 8% of all methionine residues. Second, the short heating to 95 °C sometimes used in sample preparation can increase the ratio of Met-to-isoThr conversion up to 37% of all methionines, a level comparable to the ratio of methionine oxidation. Notably, of the three considered methionine residues of BSA, Met-571 was less prone both to oxidation by oxygen and to modification to isothreonine by IAA. This could be caused by the chemical environment of this residue, which protected it from attack by a mechanism yet unknown.

Overall, the experiment with BSA treatment was in good correspondence with the assumptions about methionine-to-isothreonine conversion stated above.

3.8. Global analysis of M(iso)T conversion in deep proteome data for NCI-60 cell lines

M(iso)T conversions were studied in deep proteome data of 9 cell lines. The results are presented in Table 2. On average, 52,000 peptides were identified, and 9% of these peptides contained methionine. The average percentage of peptides with M(iso)T conversion was 5.5%, and both non-modified and modified versions were identified for 81% of them. Peptides with this modification were not considered if they had a threonine version of the sequence among wild-type peptides in the database. The fraction of these peptides was near 1% (data not shown). Thus, under the experimental conditions used in [20], methionine to threonine conversion was a frequent event and could potentially influence the results of proteogenomic analysis.

Fig. 5 – Differences in MS/MS spectra between the intact and methionine-converted forms of bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117. (A) Part of the MS/MS spectrum of the unmodified bovine serum albumin tryptic peptide 106(ETYGDMADCCEK)117 with parental mass 1477.516 Da. (B) Part of the MS/MS spectrum of the modified bovine serum albumin tryptic peptide 106(ETYGD(isoT)ADCCEK)117 with parental mass 1447.523 Da.
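The mass difference between the two parental masses quoted in the Fig. 5 caption can be checked against the elemental change expected for a methionine to isothreonine (homoserine) conversion. The following is a minimal sketch, assuming the caption masses read as 1477.516 Da and 1447.523 Da and assuming that the side chain changes from -CH2-CH2-S-CH3 to -CH2-CH2-OH; the dictionary and function names are illustrative only and are not part of the original analysis.

```python
# Standard monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "S": 31.97207117}

# Assumed net elemental change for Met -> isoThr (homoserine):
# the side chain loses C, 2 H and S, and gains one O.
MET_TO_ISOTHR = {"C": -1, "H": -2, "S": -1, "O": +1}


def delta_mass(change):
    """Monoisotopic mass shift for a dict of element -> count change."""
    return sum(MONO[element] * count for element, count in change.items())


shift = delta_mass(MET_TO_ISOTHR)

# Parental masses as read from the Fig. 5 caption (panels A and B).
intact_mass = 1477.516
converted_mass = 1447.523
observed = converted_mass - intact_mass

print(f"expected shift: {shift:.4f} Da")
print(f"observed shift: {observed:.3f} Da")
```

Because threonine and isothreonine have the same elemental composition, the same shift applies to a genuine Met to Thr substitution, which is why the two cannot be separated by precursor mass alone.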

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant software [23].

| Methionine position in BSA | Peptide sequence | Total number of PSMs for peptides of interest (Sample 1 / Sample 2 / Sample 3) | Methionine-oxidized peptides, PSMs (% total) (Sample 1 / Sample 2 / Sample 3) | Methionine-to-isothreonine converted peptides, PSMs (% total) (Sample 1 / Sample 2 / Sample 3) |
|---|---|---|---|---|
| M111 | ETYGDMADCCEK | 50 / 46 / 53 | 8 (16) / 14 (30) / 11 (21) | 4 (8) / 4 (9) / 10 (19) |
| M469 | MPCTEDYLSLILNR | 159 / 183 / 268 | 25 (16) / 34 (19) / 52 (19) | 6 (4) / 6 (3) / 98 (37) |
| M571 | TVMENFVAFVDK | 124 / 113 / 95 | 6 (5) / 3 (3) / 5 (5) | 3 (2) / 3 (3) / 5 (5) |

Completely cleaved tryptic peptides were detected predominantly. All data include corresponding peptides with trypsin missed cleavages.
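The percentages in Table 1 follow directly from the PSM counts. As a minimal sketch, the snippet below recomputes them from counts transcribed from the table; the data structure and the helper name `percent_of_total` are illustrative assumptions, and the values are rounded to whole percent as in the table.

```python
# PSM counts transcribed from Table 1, one value per sample (1, 2, 3).
TABLE1 = {
    "M111": {"total": (50, 46, 53), "oxidized": (8, 14, 11), "isoThr": (4, 4, 10)},
    "M469": {"total": (159, 183, 268), "oxidized": (25, 34, 52), "isoThr": (6, 6, 98)},
    "M571": {"total": (124, 113, 95), "oxidized": (6, 3, 5), "isoThr": (3, 3, 5)},
}


def percent_of_total(counts, totals):
    """Share of PSMs per sample, rounded to whole percent."""
    return [round(100 * c / t) for c, t in zip(counts, totals)]


for site, row in TABLE1.items():
    oxidized = percent_of_total(row["oxidized"], row["total"])
    converted = percent_of_total(row["isoThr"], row["total"])
    print(site, "oxidized %:", oxidized, "Met-to-isoThr %:", converted)
```

Run on the transcribed counts, the unheated sample 1 gives 2 to 8% conversion across the three sites, and the maximum of 37% is reached for M469 in sample 3.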


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using the database from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70, in which methionine in position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the genomes of the cell lines themselves, and strong evidence was reported that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed at the isothreonine residue produced from methionine in the cell lines.

After examination of the sample preparation protocol used in the paper [20], we propose that the alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study. There were 2 to 3% of wild-type sequences modified in the HSC70 protein.

Global analysis of spectra for the cancer cell lines using Met to (iso)-Thr as a variable modification demonstrated that up to 10% of methionine-containing peptides identified from the data had their corresponding Met to iso-Thr variant. Now we can claim these variants to be isothreonine-containing due to the absence of genetic coding for them in the published genomes.
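The pairing of Met sequences with their Met to (iso)Thr counterparts can be illustrated with a short sketch. It assumes that a search engine reports the converted residue as T, since threonine and isothreonine are isobaric, and it flags any identified peptide that equals a single M to T substitution of another identified peptide. The function names and the toy peptide list are hypothetical and are not taken from the original workflow.

```python
def met_to_thr_variants(peptide):
    """Yield every sequence obtained by replacing one M with T."""
    for i, residue in enumerate(peptide):
        if residue == "M":
            yield peptide[:i] + "T" + peptide[i + 1:]


def flag_possible_artifacts(identified):
    """Map each identified M->T variant to its identified Met counterpart."""
    identified = set(identified)
    flagged = {}
    for wild_type in identified:
        for variant in met_to_thr_variants(wild_type):
            if variant in identified:
                flagged[variant] = wild_type
    return flagged


# Toy input: the HSC70 57-71 peptide, its M61T form, and an unrelated peptide.
peptides = ["NQVAMNPTNTVFDAK", "NQVATNPTNTVFDAK", "TVMENFVAFVDK"]
print(flag_possible_artifacts(peptides))
```

A flagged variant is only a candidate artifact; whether it is genetically encoded still has to be decided from the corresponding genome data.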

Due to indirect evidence of the Met to iso-Thr conversion in open data produced by [20], we carried out an experiment that mimicked conventional conditions for shotgun analysis, using bovine albumin as a model protein. This experiment clearly illustrated the Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field pushed by the growing availability of tumor genome sequencing. It is especially important to detect as many somatically mutated proteins as possible, since they can affect molecular pathways and provide growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation which mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI60 deep proteome cell lines [20] at 1% PSM FDR.

| Cell line | Identified peptides | Identified peptides with Met | Identified peptides with Met > Thr | Both Met > (iso)Thr and Met sequences identified |
|---|---|---|---|---|
| COLON_COLO205 | 52479 | 6057 | 289 | 246 |
| RENAL_RXF393 | 61878 | 9741 | 119 | 88 |
| PROSTATE_PC3 | 50203 | 5213 | 287 | 241 |
| NSCLC_H460 | 39792 | 2862 | 247 | 201 |
| CNS_U251 | 52353 | 6083 | 231 | 184 |
| OVAR_SKOV3 | 48110 | 4000 | 214 | 171 |
| MELAN_M14 | 55169 | 4924 | 347 | 289 |
| BREAST_MCF7 | 53451 | 4647 | 331 | 279 |
| LEUK_CCRFCEM | 38894 | 1348 | 255 | 144 |
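The fractions discussed for Table 2 can be recomputed from the counts. The sketch below pools the nine cell lines; it assumes that the last two columns are assigned to the cell lines in the order in which they appear in the table, and the names `TABLE2` and `pooled_percentages` are illustrative. Pooled values need not coincide exactly with the averages quoted in the text, because the text does not state how the averaging was done.

```python
# Counts transcribed from Table 2:
# (identified peptides, peptides with Met, Met > Thr peptides, both forms identified)
TABLE2 = {
    "COLON_COLO205": (52479, 6057, 289, 246),
    "RENAL_RXF393": (61878, 9741, 119, 88),
    "PROSTATE_PC3": (50203, 5213, 287, 241),
    "NSCLC_H460": (39792, 2862, 247, 201),
    "CNS_U251": (52353, 6083, 231, 184),
    "OVAR_SKOV3": (48110, 4000, 214, 171),
    "MELAN_M14": (55169, 4924, 347, 289),
    "BREAST_MCF7": (53451, 4647, 331, 279),
    "LEUK_CCRFCEM": (38894, 1348, 255, 144),
}


def pooled_percentages(table):
    """Return pooled (Met share, conversion share, both-forms share) in percent."""
    total, met, converted, both = (sum(col) for col in zip(*table.values()))
    return (100 * met / total, 100 * converted / met, 100 * both / converted)


met_pct, conv_pct, both_pct = pooled_percentages(TABLE2)
print(f"peptides with Met:            {met_pct:.1f}% of identified peptides")
print(f"Met > Thr variants:           {conv_pct:.1f}% of Met-containing peptides")
print(f"variants with Met form found: {both_pct:.1f}% of Met > Thr variants")
```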


Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods. I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.

176 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

Using various schemes we would like to estimate possible roleof heating during methionine modification by iodoacetamide(IAA) BSA sample 1 was treated by dithiotreitol (DTT) andIAA at ambient temperature without heating samples 2 and3 were heated to 95 degC before and during IAA treatment

Generally BSA sequence contains five methionine resi-dues Of themMet-1 is cleaved during natural processing andin addition is followed by arginine residue and cleaved as apart of dipeptide by trypsin Met-208 belongs to pentapeptidecleaved by trypsin and has low chance to be detected byshotgun analysis In accordance to these assumptions wecould detect the rest of three methionine-containing BSApeptides in all three samples analyzed As expected for all ofthem methionine-to-isothreonine alternative peptides weredetected (spectra exemplified at Fig 5) Despite withoutpeptide standards we cannot confirm isothreonine vsthreonine residues in them the only possible source of theirgeneration in these samples is IAA treatment Table 1summarizes spectral counts for all peptides of interestincluding intact methionine-containing peptides as well asmethionine-oxidized and methionine-to-isothreonine con-verted species The data on methionine modification inalbumin yields some interesting observations First Met-to-isoThr conversion by IAA is a common event even withoutheating when it concerns 2 to 8 of all methionine residuesSecond the short heating to 95 degC sometimes used insample preparation can increase the ratio of Met-to-isoThr

Fig 5 ndash Differences in MSMS spectra between the intact and mepeptide 106(ETYGDMADCCEK)117 (A) Part of the MSMS spectru106(ETYGDMADCCEK)117 with parental mass 1477516 Da (B) Paalbumin tryptic peptide 106(ETYGDisoTADCCEK)117 with parent

conversion up to 37 of all methionines the level beingcomparable to the ratio of methionine oxidation Notably ofthree considered methionine residues of BSA Met-571 wasless prone both to oxidation by oxygen and the modificationto isothreonine by IAA It could be caused by the chemicalenvironment of this residue which protected it from attack bymechanism yet unknown

Overall the experiment with BSA treatment was in thegood corresponding with the assumptions about methionine-to-isothreonine conversion stated above

38 Global analysis of М(iso)Т conversion in deep proteomedata for NCI-60 cell line

M(iso)T conversions were studied in deep proteome data of 9cell lines The results are presented in Table 2 On average52000 peptides were identified and 9 of these peptidescontained methionine The average percentage of peptideswith M(iso)T conversion was 55 and both non-modifiedand modified versions were identified for 81 of themPeptides with this modification were not considered if theyhad a threonine version of sequence among wild-type peptidesin the database The fraction of these peptides was near 1(data not shown) Thus under experimental conditions used in[20] methionine to threonine conversion was a frequent eventand could potentially influence the results of proteogenomicanalysis

thionine-converted forms of bovine serum albumin trypticm of the unmodified bovine serum albumin tryptic peptidert of the MSMS spectrum of the modified bovine serumal mass 1447523 Da

Table 1 ndash Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest including intactmethionine-oxidized and methionine-to-isothreonine converted species All three samples of BSA are treated bydithiotreitol and iodoacetamide trypsin digested Sample 1 was treated at ambient temperature without heating sample2 and 3 were heated to 95 degC before and during IAA treatment Data are obtained by MaxQuant [23] free software

Methi-oninepositionin BSA

Peptidesequence

Total number ofPSMs for peptides

of interest

Methionine-oxidizedpeptides PSMs

( total)

Methionine-to-isothreonineconverted peptides

PSMs ( total)

Sample 1 Sample 2 Sample 3 Sample 1 Sample 2 Sample 3 Sample 1 Sample 2 Sample 3

M111 ETYGDMADCCEK 50 46 53 8 (16) 14 (30) 11 (21) 4 (8) 4 (9) 10 (19)M469 MPCTEDYLSLILNR 159 183 268 25 (16) 34 (19) 52 (19) 6 (4) 6 (3) 98 (37)M571 TVMENFVAFVDK 124 113 95 6 (5) 3 (3) 5 (5) 3 (2) 3 (3) 5 (5)

Completely cleaved trypsin peptides were detected predominantly All data include corresponding peptides with trypsin missed cleavages

177J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

4 Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell linesusing the database from cancer genomes returned high-confident identification of a tryptic peptide of the abundantmolecular chaperone HSC70 in which methionine in theposition 61was converted to threonine or isothreonine residuesNo traces of such genetic alteration were found in the own cellline genomes and strong evidencewas reported that this changeisnot encodedby thenucleic acidMoreover the LCndashMSMSdataof the synthetic peptides with threonine or isothreonine(homoserine) residues pointed at the isothreonine residueproduced frommethionine in the cell lines

After examination of the sample preparation protocol usedin the paper [20] we propose that the alkylation of methionineresidue by iodacetamide followed by homoserine formationafter sample boiling is responsible for the effect We foundsignificant yield for the reaction in the datasets under studyThere were 2 to 3 of wild-type sequences modified in theHSC70 protein

Global analysis of spectra for the cancer cell lines usingMet to (iso)-Thr as a variablemodificationdemonstrated thatupto 10 of methionine-containing peptides identified from thedata had their corresponding Met to iso-Thr variant Now wecan claim these variants to be isothreonine-containing dueto the absence of genetic coding for them in the publishedgenomes

Table 2 ndash The number of total identified peptides peptides withthe 9 NCI60 deep proteome cell lines [20] at 1 PSM FDR

Cell line Identifiedpeptides

Identified peptideswith Met

COLON_COLO205 52479 6057RENAL_RXF393 61878 9741PROSTATE_PC3 50203 5213NSCLC_H460 39792 2862CNS_U251 52353 6083OVAR_SKOV3 48110 4000MELAN_M14 55169 4924BREAST_MCF7 53451 4647LEUK_CCRFCEM 38894 1348

Due to indirect evidence of the Met to iso-Thr conversionin open data produced by [20] we have managed anexperiment that mimicked conventional conditions for shot-gun analysis using the bovine albumin as a model proteinThis experiment perfectly illustrated a fact of Met to isoThrconversion with yield up to 37 when the sample was heated

Cancer proteogenomics is an emerging field pushed bygrowing availability of tumor genome sequencing It is espe-cially important to detect as many as possible somaticallymutated proteins which can affect molecular pathways andprovide growth and survival of tumor cells However thenumber of known cancer specific protein coding variants is solarge that false positive variant peptide identifications are themost probable outcome of the proteome data search In thisreport we demonstrated that an artifact of sample preparationwhich mimics the Met to Thr nucleic acid-encoded variant isone of the sources of false positive hits during peptideidentification Methionine to isothreonine conversion should betaken into account in the corresponding workflows especially ifthey involve a sample heating after iodoacetamide treatment

Supplementary data to this article can be found online athttpdxdoiorg101016jjprot201503003

Transparency document

The Transparency document associated with this article canbe found in the online version

methionine and peptides withMet gt (iso)Thr conversion for

Identifiedpeptides with Met gt Thr

Both Met gt (iso)Thr andMet sequences identified

289 246119 88287 241247 201231 184214 171347 289331 279255 144

178 J O U R N A L O F P R O T E O M I C S 1 2 0 ( 2 0 1 5 ) 1 6 9 ndash 1 7 8

Acknowledgments

The work was supported by the Russian Scientific Fund grant14-15-00395

R E F E R E N C E S

[1] Makarov A Electrostatic axially harmonic orbital trapping ahigh-performance technique of mass analysis Anal Chem200072(6)1156ndash62

[2] PirmoradianM BudamguntaH Chingin K Zhang B et al Rapidand deep human proteome analysis by single-dimensionshotgun proteomics Mol Cell Proteomics 2013123330ndash8

[3] Nagaraj N Kulak NA Cox J Neuhauser N Mayr K Hoerning Oet al System-wide perturbation analysis with nearlycomplete coverage of the yeast proteome by single-shot ultraHPLC runs on a bench top Orbitrap Mol Cell Proteomics 201211(3) [M111013722]

[4] Hebert AS Richards AL Bailey DJ Ulbrich A Coughlin EEWestphall MS et al The one hour yeast proteome Mol CellProteomics 201413(1)339ndash47

[5] Nesvizhskii AI Protein identification by tandem massspectrometry and sequence database searching MethodsMol Biol 200736787ndash119

[6] Shteynberg D Nesvizhskii AI Moritz RL Deutsch EWCombining results of multiple search engines in proteomicsMol Cell Proteomics 2013122383ndash93

[7] Jeong K Kim S Pevzner PA UniNovo a universal tool for denovo peptide sequencing Bioinformatics 2013291953ndash62

[8] Pan C Park BH McDonald WH Carey PA Banfield JFVerBerkmoes NC et al A high-throughput de novo sequencingapproach for shotgun proteomics using high-resolutiontandemmass spectrometry BMC Bioinformatics 201011118

[9] Chi H Chen H He K Wu L Yang B Sun RX et al pNovo+ denovo peptide sequencing using complementary HCD and ETDtandem mass spectra J Proteome Res 201312(2)615ndash25

[10] Allmer J Algorithms for the de novo sequencing of peptidesfrom tandem mass spectra Expert Rev Proteomics 20118(5)645ndash57

[11] Hughes C Ma B Lajoie GA De novo sequencing methods inproteomics Methods Mol Biol 2010604105ndash21

[12] Bunger MK Cargile BJ Sevinsky JR Deyanova E Yates NAHendrickson RC et al Detection and validation ofnon-synonymous coding SNPs from orthogonal analysis ofshotgun proteomics data J Proteome Res 200762331ndash40

[13] Menon R Im H Zhang EY Wu SL Chen R Snyder M et alDistinct splice variants and pathway enrichment in thecell-line models of aggressive human breast cancersubtypes J Proteome Res 201413(1)212ndash27

[14] Nesvizhskii AI Proteogenomics concepts applications andcomputational strategies Nat Methods 201411(11)1114ndash25

[15] Helmy M Sugiyama N Tomita M Ishihama YOnco-proteogenomics a novel approach to identifycancer-specific mutations combining proteomics andtranscriptome deep sequencing Genome Biol 201011(Suppl 1)17

[16] Karpova MA Karpov DS Ivanov MV Pyatnitskiy MAChernobrovkin AL Lobas AA et al Exome-drivencharacterization of the cancer cell lines at the proteome levelthe NCI-60 case study J Proteome Res 201413(12)5551ndash60

[17] Alfaro JA Sinha A Kislinger T Boutros PCOnco-proteogenomics cancer proteomics joins forces withgenomics Nat Methods 2014111107ndash13

[18] Sheynkman GM Shortreed MR Frey BL Scalf M Smith LMLarge-scale mass spectrometric detection of variant peptidesresulting from nonsynonymous nucleotide differences JProteome Res 201413(1)228ndash40

[19] Hao P Ren Y Alpert AJ Sze SK Detection evaluation andminimization of nonenzymatic deamidation in proteomicsample preparation Mol Cell Proteomics 201110(10)[O111009381]

[20] Moghaddas Gholami A Hahne H Wu Z Auer FJ et al Globalproteome analysis of the NCI-60 cell line panel Cell Rep 20134609ndash20

[21] The Cancer Genome Atlas Network Comprehensivemolecular characterization of human colon and rectal cancerNature 2012487330ndash7

[22] Woo S Cha SW Na S Guest C Liu T Smith RD et alProteogenomic strategies for identification of aberrant cancerpeptides using large-scale next-generation sequencing dataProteomics 201414(23ndash24)2719ndash30

[23] Neuhauser N Nagaraj N McHardy P Zanivan S Scheltema RCox J et al High performance computational analysis oflarge-scale proteome data sets to assess incrementalcontribution to coverage of the human genome J ProteomeRes 201312(6)2858ndash68

[24] Craig R Beavis RC TANDEM matching proteins with tandemmass spectra Bioinformatics 200420(9)1466ndash7

[25] Ivanov MV Levitsky LI Lobas AA Panic T Laskay UumlAMitulovic G et al Empirical multidimensional space forscoring peptide spectrum matches in shotgun proteomics JProteome Res 201413(4)1911ndash20

[26] Stewart JJP Optimization of parameters for semiempiricalmethods I Method J Comput Chem 198910209ndash20

[27] Abaan OD Polley EC Davis SR Zhu YJ Bilke S Walker RLet al The exomes of the NCI-60 panel a genomic resource forcancer biology and systems pharmacology Cancer Res 201373(14)4372ndash82

[28] Forbes SA Bindal N Bamford S Cole C Kok CY Beare D et alCOSMIC mining complete cancer genomes in the Catalogueof Somatic Mutations in Cancer Nucleic Acids Res 201139D945ndash50

[29] Blanc V Davidson NO C-to-U RNA editing mechanismsleading to genetic diversity J Biol Chem 20032781395ndash8

[30] Schroeder WA Shelton JB Shelton JR An examination ofconditions for the cleavage of polypeptide chains withcyanogen bromide Arch Biochem Biophys 1969130(1)551ndash6

[31] Kruumlger R Hung CW Edelson-Averbukh M Lehmann WDIodoacetamide-alkylated methionine can mimic neutral lossof phosphoric acid from phosphopeptides as exemplified bynano-electrospray ionization quadrupole time-of-flightparent ion scanning Rapid Commun Mass Spectrom 200519(12)1709ndash16

[32] Lundblad RL Techniques in protein modification LondonCRC PressTaylor amp Francis Publishing Group 1994

[33] Gundlach HG Moore S Stein WH The reaction of iodoacetatewith methionine J Biol Chem 1959234(7)1761ndash4

[34] Su HF Xue L Li YH Lin SC Wen YM Huang RB et al Probinghydrogen bond energies by mass spectrometry J Am ChemSoc 2013135(16)6122ndash9

[35] Zhou J Zhou T Cao R Liu Z Shen J Chen P et al Evaluationof the application of sodium deoxycholate to proteomicanalysis of rat hippocampal plasma membrane J ProteomeRes 20065(10)2547ndash53

Table 1 – Spectral counts for methionine-containing peptides of bovine serum albumin (BSA) digest, including intact, methionine-oxidized and methionine-to-isothreonine converted species. All three samples of BSA were treated with dithiothreitol and iodoacetamide and digested with trypsin. Sample 1 was treated at ambient temperature without heating; samples 2 and 3 were heated to 95 °C before and during IAA treatment. Data were obtained with the free MaxQuant [23] software.

Methionine                  Total number of PSMs          Methionine-oxidized           Methionine-to-isothreonine
position    Peptide         for peptides of interest      peptides, PSMs (% total)      converted peptides, PSMs (% total)
in BSA      sequence        Sample 1  Sample 2  Sample 3  Sample 1  Sample 2  Sample 3  Sample 1  Sample 2  Sample 3
M111        ETYGDMADCCEK    50        46        53        8 (16)    14 (30)   11 (21)   4 (8)     4 (9)     10 (19)
M469        MPCTEDYLSLILNR  159       183       268       25 (16)   34 (19)   52 (19)   6 (4)     6 (3)     98 (37)
M571        TVMENFVAFVDK    124       113       95        6 (5)     3 (3)     5 (5)     3 (2)     3 (3)     5 (5)

Completely cleaved tryptic peptides were detected predominantly. All data include the corresponding peptides with trypsin missed cleavages.
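The percentages in parentheses in Table 1 can be rechecked from the raw PSM counts. The sketch below is only an illustration: the dictionary layout and the function name `percent_of_total` are assumptions introduced here, and the counts are transcribed from the table as printed.

```python
# PSM counts transcribed from Table 1, one tuple entry per sample (1, 2, 3).
# The data layout and names are illustrative, not part of the original analysis.
TABLE1 = {
    "M111": {"total": (50, 46, 53), "oxidized": (8, 14, 11), "converted": (4, 4, 10)},
    "M469": {"total": (159, 183, 268), "oxidized": (25, 34, 52), "converted": (6, 6, 98)},
    "M571": {"total": (124, 113, 95), "oxidized": (6, 3, 5), "converted": (3, 3, 5)},
}


def percent_of_total(counts, totals):
    """Return each count as a whole-number percentage of its sample total."""
    return [round(100 * c / t) for c, t in zip(counts, totals)]


for site, row in TABLE1.items():
    converted = percent_of_total(row["converted"], row["total"])
    oxidized = percent_of_total(row["oxidized"], row["total"])
    print(site, "Met->isoThr %:", converted, "Met-ox %:", oxidized)
```

Under these assumptions, the largest converted fraction is the M469 peptide in the heated sample 3 (98 of 268 PSMs).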


4. Conclusions

Analysis of deep proteome data for 9 NCI-60 cancer cell lines using the database from cancer genomes returned a high-confidence identification of a tryptic peptide of the abundant molecular chaperone HSC70, in which methionine at position 61 was converted to a threonine or isothreonine residue. No traces of such a genetic alteration were found in the genomes of these cell lines, and strong evidence was reported that this change is not encoded by the nucleic acid. Moreover, the LC–MS/MS data of the synthetic peptides with threonine or isothreonine (homoserine) residues pointed to the isothreonine residue produced from methionine in the cell lines.

After examination of the sample preparation protocol used in the paper [20], we propose that the alkylation of the methionine residue by iodoacetamide, followed by homoserine formation after sample boiling, is responsible for the effect. We found a significant yield for the reaction in the datasets under study: 2 to 3% of wild-type sequences were modified in the HSC70 protein.
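The reason the converted residue is mistaken for threonine can be made concrete with a mass calculation. Homoserine (isothreonine) and threonine share the residue composition C4H7NO2, so they have identical monoisotopic masses, and replacing a methionine residue (C5H9NOS) with either one gives the same precursor mass shift. The sketch below uses standard monoisotopic element masses; the variable and function names are illustrative assumptions.

```python
# Standard monoisotopic masses of the elements involved (Da).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}

# Residue (in-chain) elemental compositions.
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}
# Homoserine (isothreonine) and threonine are isomers: same composition.
HSE_RESIDUE = {"C": 4, "H": 7, "N": 1, "O": 2}


def residue_mass(composition):
    """Sum the monoisotopic masses for an elemental composition."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


# Mass shift a search engine would need for a Met -> (iso)Thr variable modification.
delta = residue_mass(HSE_RESIDUE) - residue_mass(MET_RESIDUE)
print(f"Met residue:  {residue_mass(MET_RESIDUE):.5f} Da")
print(f"Hse residue:  {residue_mass(HSE_RESIDUE):.5f} Da")
print(f"Mass shift:   {delta:.5f} Da")
```

Because the shift is the same for threonine and homoserine, precursor mass alone cannot separate a genetically encoded Met to Thr substitution from the chemical artifact; other evidence, such as the synthetic peptide comparison described above, is needed.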

Global analysis of spectra for the cancer cell lines using Met to (iso)-Thr as a variable modification demonstrated that up to 10% of methionine-containing peptides identified from the data had their corresponding Met to iso-Thr variant. We can now claim these variants to be isothreonine-containing due to the absence of genetic coding for them in the published genomes.
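One way to relate the "up to 10%" figure to the counts in Table 2 is to divide, for each cell line, the number of peptides identified in both the Met and the Met > (iso)Thr form by the number of identified Met-containing peptides. This reading is an assumption made here for illustration, as is the row-by-row alignment of the last two table columns with the cell lines as listed; the names `TABLE2` and `both_fraction` are hypothetical.

```python
# Table 2 counts per cell line:
# (identified peptides, peptides with Met, peptides with Met > Thr,
#  both Met > (iso)Thr and Met sequences identified)
TABLE2 = {
    "COLON_COLO205": (52479, 6057, 289, 246),
    "RENAL_RXF393": (61878, 9741, 119, 88),
    "PROSTATE_PC3": (50203, 5213, 287, 241),
    "NSCLC_H460": (39792, 2862, 247, 201),
    "CNS_U251": (52353, 6083, 231, 184),
    "OVAR_SKOV3": (48110, 4000, 214, 171),
    "MELAN_M14": (55169, 4924, 347, 289),
    "BREAST_MCF7": (53451, 4647, 331, 279),
    "LEUK_CCRFCEM": (38894, 1348, 255, 144),
}


def both_fraction(row):
    """Percent of Met-containing peptides that also have a Met > (iso)Thr variant."""
    _, with_met, _, both = row
    return 100 * both / with_met


fractions = {line: both_fraction(row) for line, row in TABLE2.items()}
for line, pct in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{line:15s} {pct:5.1f}%")
```

Under this reading, the highest ratio is found for LEUK_CCRFCEM (144 of 1348 Met-containing peptides), which is in line with the order of magnitude quoted in the text.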

Table 2 – The number of total identified peptides, peptides with methionine and peptides with Met > (iso)Thr conversion for the 9 NCI60 deep proteome cell lines [20] at 1% PSM FDR.

Cell line      Identified  Identified peptides  Identified peptides  Both Met > (iso)Thr and
               peptides    with Met             with Met > Thr       Met sequences identified
COLON_COLO205  52479       6057                 289                  246
RENAL_RXF393   61878       9741                 119                  88
PROSTATE_PC3   50203       5213                 287                  241
NSCLC_H460     39792       2862                 247                  201
CNS_U251       52353       6083                 231                  184
OVAR_SKOV3     48110       4000                 214                  171
MELAN_M14      55169       4924                 347                  289
BREAST_MCF7    53451       4647                 331                  279
LEUK_CCRFCEM   38894       1348                 255                  144

Because the evidence for the Met to iso-Thr conversion in the open data produced by [20] was indirect, we performed an experiment that mimicked conventional conditions for shotgun analysis, using bovine albumin as a model protein. This experiment clearly demonstrated the Met to isoThr conversion, with a yield of up to 37% when the sample was heated.

Cancer proteogenomics is an emerging field driven by the growing availability of tumor genome sequencing. It is especially important to detect as many somatically mutated proteins as possible, since they can affect molecular pathways and support the growth and survival of tumor cells. However, the number of known cancer-specific protein-coding variants is so large that false positive variant peptide identifications are the most probable outcome of the proteome data search. In this report, we demonstrated that an artifact of sample preparation which mimics the Met to Thr nucleic acid-encoded variant is one of the sources of false positive hits during peptide identification. Methionine to isothreonine conversion should be taken into account in the corresponding workflows, especially if they involve sample heating after iodoacetamide treatment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2015.03.003.

Transparency document

The Transparency document associated with this article can be found in the online version.


Acknowledgments

The work was supported by the Russian Scientific Fund, grant 14-15-00395.

REFERENCES

[1] Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem 2000;72(6):1156–62.

[2] Pirmoradian M, Budamgunta H, Chingin K, Zhang B, et al. Rapid and deep human proteome analysis by single-dimension shotgun proteomics. Mol Cell Proteomics 2013;12:3330–8.

[3] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11(3) [M111.013722].

[4] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2014;13(1):339–47.

[5] Nesvizhskii AI. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007;367:87–119.

[6] Shteynberg D, Nesvizhskii AI, Moritz RL, Deutsch EW. Combining results of multiple search engines in proteomics. Mol Cell Proteomics 2013;12:2383–93.

[7] Jeong K, Kim S, Pevzner PA. UniNovo: a universal tool for de novo peptide sequencing. Bioinformatics 2013;29:1953–62.

[8] Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, VerBerkmoes NC, et al. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 2010;11:118.

[9] Chi H, Chen H, He K, Wu L, Yang B, Sun RX, et al. pNovo+: de novo peptide sequencing using complementary HCD and ETD tandem mass spectra. J Proteome Res 2013;12(2):615–25.

[10] Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2011;8(5):645–57.

[11] Hughes C, Ma B, Lajoie GA. De novo sequencing methods in proteomics. Methods Mol Biol 2010;604:105–21.

[12] Bunger MK, Cargile BJ, Sevinsky JR, Deyanova E, Yates NA, Hendrickson RC, et al. Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data. J Proteome Res 2007;6:2331–40.

[13] Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, et al. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2014;13(1):212–27.

[14] Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2014;11(11):1114–25.

[15] Helmy M, Sugiyama N, Tomita M, Ishihama Y. Onco-proteogenomics: a novel approach to identify cancer-specific mutations combining proteomics and transcriptome deep sequencing. Genome Biol 2010;11(Suppl 1):17.

[16] Karpova MA, Karpov DS, Ivanov MV, Pyatnitskiy MA, Chernobrovkin AL, Lobas AA, et al. Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study. J Proteome Res 2014;13(12):5551–60.

[17] Alfaro JA, Sinha A, Kislinger T, Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2014;11:1107–13.

[18] Sheynkman GM, Shortreed MR, Frey BL, Scalf M, Smith LM. Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences. J Proteome Res 2014;13(1):228–40.

[19] Hao P, Ren Y, Alpert AJ, Sze SK. Detection, evaluation and minimization of nonenzymatic deamidation in proteomic sample preparation. Mol Cell Proteomics 2011;10(10) [O111.009381].

[20] Moghaddas Gholami A, Hahne H, Wu Z, Auer FJ, et al. Global proteome analysis of the NCI-60 cell line panel. Cell Rep 2013;4:609–20.

[21] The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.

[22] Woo S, Cha SW, Na S, Guest C, Liu T, Smith RD, et al. Proteogenomic strategies for identification of aberrant cancer peptides using large-scale next-generation sequencing data. Proteomics 2014;14(23–24):2719–30.

[23] Neuhauser N, Nagaraj N, McHardy P, Zanivan S, Scheltema R, Cox J, et al. High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome. J Proteome Res 2013;12(6):2858–68.

[24] Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004;20(9):1466–7.

[25] Ivanov MV, Levitsky LI, Lobas AA, Panic T, Laskay ÜA, Mitulovic G, et al. Empirical multidimensional space for scoring peptide spectrum matches in shotgun proteomics. J Proteome Res 2014;13(4):1911–20.

[26] Stewart JJP. Optimization of parameters for semiempirical methods I. Method. J Comput Chem 1989;10:209–20.

[27] Abaan OD, Polley EC, Davis SR, Zhu YJ, Bilke S, Walker RL, et al. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology. Cancer Res 2013;73(14):4372–82.

[28] Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945–50.

[29] Blanc V, Davidson NO. C-to-U RNA editing: mechanisms leading to genetic diversity. J Biol Chem 2003;278:1395–8.

[30] Schroeder WA, Shelton JB, Shelton JR. An examination of conditions for the cleavage of polypeptide chains with cyanogen bromide. Arch Biochem Biophys 1969;130(1):551–6.

[31] Krüger R, Hung CW, Edelson-Averbukh M, Lehmann WD. Iodoacetamide-alkylated methionine can mimic neutral loss of phosphoric acid from phosphopeptides as exemplified by nano-electrospray ionization quadrupole time-of-flight parent ion scanning. Rapid Commun Mass Spectrom 2005;19(12):1709–16.

[32] Lundblad RL. Techniques in protein modification. London: CRC Press/Taylor & Francis Publishing Group; 1994.

[33] Gundlach HG, Moore S, Stein WH. The reaction of iodoacetate with methionine. J Biol Chem 1959;234(7):1761–4.

[34] Su HF, Xue L, Li YH, Lin SC, Wen YM, Huang RB, et al. Probing hydrogen bond energies by mass spectrometry. J Am Chem Soc 2013;135(16):6122–9.

[35] Zhou J, Zhou T, Cao R, Liu Z, Shen J, Chen P, et al. Evaluation of the application of sodium deoxycholate to proteomic analysis of rat hippocampal plasma membrane. J Proteome Res 2006;5(10):2547–53.
