Plasma Proteome Database as a resource for proteomics research: 2014 update
Vishalakshi Nanjappa12, Joji Kurian Thomas12, Arivusudar Marimuthu1, Babylakshmi Muthusamy13, Aneesha Radhakrishnan14, Rakesh Sharma5, Aafaque Ahmad Khan1, Lavanya Balakrishnan16, Nandini A. Sahasrabuddhe1, Satwant Kumar1, Binit Nitinbhai Jhaveri7, Kaushal Vinaykumar Sheth7, Ramesh Kumar Khatana8, Patrick G. Shaw9, Srinivas Manda Srikanth13, Premendu P. Mathur4, Subramanian Shankar10, Dindagur Nagaraja11, Rita Christopher5, Suresh Mathivanan12, Rajesh Raju1, Ravi Sirdeshmukh1, Aditi Chatterjee14, Richard J. Simpson12, H. C. Harsha1, Akhilesh Pandey13141516 and T. S. Keshava Prasad123
1 Institute of Bioinformatics, International Technology Park, Bangalore 560 066, Karnataka, India
2 Amrita School of Biotechnology, Amrita University, Kollam 690 525, Kerala, India
3 Centre of Excellence in Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry 605 014, India
4 Department of Biochemistry and Molecular Biology, Pondicherry University, Puducherry 605014, India
5 Department of Neurochemistry, National Institute of Mental Health and Neurosciences, Bangalore 560 022, Karnataka, India
6 Department of Biotechnology, Kuvempu University, Shankaraghatta 577 451, Karnataka, India
7 Government Medical College, Bhavnagar 364 001, Gujarat, India
8 Mahatma Gandhi Institute of Medical Sciences, Sevagram, Wardha 442012, Maharashtra, India
9 The Department of Environmental Health Sciences, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
10 Department of Internal Medicine, Armed Forces Medical College, Pune 411 040, Maharashtra, India
11 Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore 560 022, Karnataka, India
12 Department of Biochemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria 3084, Australia
13 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
14 Department of Biological Chemistry, Johns Hopkins University, Baltimore, MD 21205, USA
15 Department of Oncology, Johns Hopkins University, Baltimore, MD 21205, USA
16 Department of Pathology, Johns Hopkins University, Baltimore, MD 21205, USA
Received August 16, 2013; Revised November 10, 2013; Accepted November 11, 2013
ABSTRACT
Plasma Proteome Database (PPD; http://www.plasmaproteomedatabase.org) was initially described in the year 2005 as a part of the Human Proteome Organization's (HUPO's) pilot initiative on the Human Plasma Proteome Project. Since then, improvements in proteomic technologies and increased throughput have led to identification of a large number of novel plasma proteins. To keep up with this increase in data, we have significantly enriched the proteomic information in PPD. This database currently contains information on 10 546 proteins detected in serum/plasma, of which 3784 have been reported in two or more studies. The latest version of the database also incorporates mass spectrometry-derived data, including experimentally verified proteotypic peptides used for multiple reaction monitoring assays. Other novel features include published plasma/serum concentrations for 1278 proteins, along with a separate category of plasma-derived extracellular vesicle proteins. As plasma proteins have become a major thrust in the field of biomarkers, we have enabled a batch-based query, designated Plasma Proteome Explorer, which lets users screen a list of proteins or peptides against known plasma proteins to assess the novelty of their data set. We believe that PPD will facilitate both clinical and basic research by serving as a comprehensive reference of plasma proteins in humans and accelerate biomarker discovery and translation efforts.

To whom correspondence should be addressed. Tel: +410 502 6662; Fax: +410 502 7544; Email: pandey@jhmi.edu
Correspondence may also be addressed to T. S. Keshava Prasad. Tel: +080 28416140; Fax: +28416132; Email: keshav@ibioinformatics.org
Present address: Dindagur Nagaraja, Dharwad Institute of Mental Health and Neuro Sciences (DIMHANS), Dharwad, Karnataka, India.

Published online 3 December 2013. Nucleic Acids Research, 2014, Vol. 42, Database issue, D959–D965. doi:10.1093/nar/gkt1251

© The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
INTRODUCTION
The plasma proteome represents an important subproteome, as it harbors proteins secreted from almost all tissues (1,2). In addition to classical blood proteins, plasma contains proteins secreted by various cells, glands and tissues, along with proteins derived from commensal and infectious organisms and parasites residing inside the body (1). The plasma proteome comprises 22 highly abundant proteins, including albumin, immunoglobulins, transferrin and haptoglobin, which make up 99% of total protein abundance in plasma. The remaining fraction is composed of proteins of much lower abundance, including proteolytically cleaved protein fragments (3). This wide dynamic range of protein abundance, greater than 10 orders of magnitude, makes the plasma proteome challenging to analyze. Plasma proteins also undergo various types of post-translational modifications, such as glycosylation, which add to its complexity (4,5). An additional level of heterogeneity is brought about by inter- and intra-individual variation due to age, gender and genetic factors. All these factors make plasma the most complex and diverse body fluid, and its analysis poses a challenge for the proteomic community.

Although plasma is not straightforward to analyze, it is the most investigated body fluid in clinical diagnostics. Components of plasma, including circulating tumor cells, cell-free RNA and DNA, metabolites, electrolytes and proteins, are all considered as molecular markers for early detection of diseases, disease monitoring and prognosis (6–9). When compared with other body fluids such as cerebrospinal fluid, gastric juice, bile and synovial fluid, plasma is more readily accessible and requires a simple collection procedure (1,10,11). Thus, detection and quantitation of endogenous or foreign antigens, or of antibodies directed against these antigens, in plasma could be used to determine the physiological and pathological states of an individual (4). Several plasma or serum proteins have already been identified as potential biomarkers of diseases, including cardiovascular diseases, autoimmune diseases, infectious diseases and neurological disorders (12–19).

Owing to the importance of plasma proteins, several proteomic efforts have been carried out to explore human plasma proteins. In 2002, Anderson and Anderson compiled immunoassay- and 2D gel electrophoresis-based investigations of the plasma proteome and reported the presence of 289 proteins (1). Adkins et al. used two different separation techniques followed by mass spectrometry (MS) for characterization of proteins from depleted serum and reported 490 proteins (20). The Human Proteome Organization (HUPO) initiated a pilot phase of a major community initiative, the Human Plasma Proteome Project (HPPP), in 2002 to determine the human plasma or serum protein constituents (21). With the involvement of 35 laboratories across the globe, this led to the identification of 3020 plasma proteins (22). As a part of this initiative, we developed a web-based resource called the Plasma Proteome Database (PPD) (23). Recent incorporation of depletion strategies to remove high-abundance proteins, and of multiple fractionation strategies coupled to LC-MS/MS approaches and high-resolution MS, has resulted in a substantial increase in the number of proteins identified from plasma. For example, a study by Liu et al. coupled two different fractionation methods to MS and identified 9087 proteins in plasma, which is the largest data set on plasma proteins reported thus far (24). Farrah et al. analyzed raw MS/MS data from plasma/serum submitted to the Proteomics Identifications Database (PRIDE) and the Human Plasma PeptideAtlas using the Trans-Proteomic Pipeline and reported a set of 1929 high-confidence proteins (25–27). We systematically documented these newly described human plasma proteins and made them available to the biomedical community through the updated version of PPD.
RESULTS
Documentation of plasma proteins in PPD
The plasma proteome occupies an important niche at the interface of proteomics, diagnostics and medicine. To enable sharing of data across laboratories involved in HPPP, we developed the PPD as a web-based resource for plasma proteins (23). To keep up with the exponential increase in plasma proteome data published in recent times, we systematically documented newly described human plasma proteins and curated information pertaining to them in PPD. As a first step, we carried out a PubMed-based literature search for groups that have extensively contributed to plasma or serum proteomics. Additional PubMed searches were carried out to fetch scientific articles by using keywords pertaining to plasma proteins. The information from articles was annotated in PPD by experienced researchers at the Institute of Bioinformatics (IOB), along with clinical investigators from collaborating clinical centers in India. Currently, PPD contains 10 546 proteins linked to 509 scientific articles. As the experimental data were obtained from diverse experimental platforms and published in the form of peer-reviewed research articles, we did not attempt any post hoc determination of false positives in the database. Instead, to increase the confidence of identification of proteins in plasma or serum, we have included the number of articles reporting the detection of each protein in plasma or serum. This is listed as 'Total number of studies' under 'Experimental evidence' in the molecule page of the respective protein, as shown in Figure 1. In this database, the detection of 12 proteins in plasma is supported by >50 publications and that of a further 167 proteins is supported by >10 publications. Of the remaining 10 367 proteins, 2199 are supported by two publications, whereas 6762 proteins are supported by a single publication. A histogram depicting the distribution of proteins with corresponding number of publications is shown in Figure 2. In the current update, we have included peptide data for proteins identified from MS-based studies, which is one of the newly added features of PPD. A comparison of previous and current versions of PPD, provided in Table 1, shows a 3-fold increase of protein information in this 2014 update. From our analysis, we observed that 4668 and 1972 proteins were unique to PPD when compared with data from other publicly available data resources on plasma or serum proteins: the Sys-BodyFluid and Healthy Human Individual's Integrated Plasma Proteome (HIP2), respectively (28,29). The data present in the Sys-BodyFluid database were annotated from 15 articles, whereas the HIP2 database contained plasma proteome data cataloged from 4 articles. In contrast to other databases that provide plasma/serum protein catalogs generated from a limited number of studies, PPD includes data annotated from 509 peer-reviewed publications.
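The evidence counts quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of PPD itself: the variable names are our own, and the count of proteins with three to ten supporting publications is derived here under the assumption that the two high-evidence groups (>50 and >10 publications) do not overlap, which is what the stated remainder of 10 367 implies.

```python
# Counts quoted in the text for the 2014 update of PPD.
total_proteins = 10546
over_50_pubs = 12     # detection supported by >50 publications
over_10_pubs = 167    # detection supported by >10 publications (not in the group above)
two_pubs = 2199       # supported by exactly two publications
single_pub = 6762     # supported by a single publication

# Proteins left after removing the two high-evidence groups.
remaining = total_proteins - over_50_pubs - over_10_pubs

# Proteins reported in two or more studies (figure quoted in the abstract).
two_or_more = total_proteins - single_pub

# Derived, not stated in the text: proteins with 3 to 10 supporting publications.
three_to_ten = remaining - two_pubs - single_pub


def count_by_evidence(protein_to_pmids):
    """Tally proteins by number of distinct supporting articles.

    protein_to_pmids maps a protein identifier to an iterable of PubMed IDs;
    this input structure is an assumption made for illustration.
    """
    tally = {}
    for pmids in protein_to_pmids.values():
        n = len(set(pmids))
        tally[n] = tally.get(n, 0) + 1
    return tally


print(remaining, two_or_more, three_to_ten)
```

The remainder and the two-or-more figure agree with the numbers given in the text and in the abstract, respectively.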
In addition to documentation of plasma proteins and peptide information for each protein in PPD, we have provided external links to the Human Protein Reference Database (HPRD) (30), NetPath (31), Entrez Gene (32) and UniProt (33) (Figure 1). For the benefit of users, a collection of important published research articles and reviews describing high-throughput investigations on the plasma/serum proteome has been provided. We are seeking participation of the scientific community by inviting investigators to submit relevant articles for inclusion in PPD and have created a new portal in PPD to submit such articles. To facilitate engagement with the community, there is a 'feedback' option, which allows users to submit their comments or suggestions. Finally, we have provided an option to download PPD data in XML and Microsoft Excel formats (http://www.plasmaproteomedatabase.org/download).
Plasma concentration of proteins
An important step in plasma biomarker discovery is the relative quantification of proteins in healthy and disease conditions. Determination of relative and absolute concentration of plasma proteins is helpful for monitoring disease predisposition, onset, diagnosis and progression. Several studies have been carried out in this direction. An early effort was reported by Anderson and Anderson in 2002, where they cataloged the plasma concentration range for 70 proteins from published literature (1). Subsequently, in 2005 and 2008, their group documented plasma levels for 177 and 211 proteins, respectively, which were reported to be associated with cardiovascular disease, stroke and cancer based on literature evidence (34,35). Recently, Farrah et al. provided an estimate of plasma levels for 1200 proteins based on spectral counting (26).

We have incorporated plasma protein levels in this update of PPD and documented plasma protein concentrations for 1278 proteins from 276 articles. Availability of this information will be useful for clinicians and researchers alike. To document plasma protein levels, we searched PubMed using 'Gene symbols' OR 'Protein name' OR 'alternate name(s) of the protein' AND 'Plasma level' OR 'Serum level' as queries. We engaged four medical students (S.K., B.N.J., K.V.S. and R.K.K., who are also co-authors) to document the plasma protein levels in normal individuals from the literature. The plasma protein concentration data in PPD can be browsed as shown in Figure 1, along with the method used for the measurement in each case. We also developed a web-based submission portal (http://www.plasmaproteomedatabase.org/protein_concentration) for submission of data pertaining to protein concentration in plasma/serum by the clinical/analytical community. Researchers can submit the information obtained from their study through this portal, and the information will be processed in PPD along with its source. The 'Plasma protein concentration data' file contains a list of plasma proteins sorted by their concentration in plasma/serum. This file can be downloaded at http://www.plasmaproteomedatabase.org/download

Figure 1. Data analysis workflow using Plasma Proteome Explorer and a screenshot of a 'Molecule Page' for Haptoglobin in PPD. When queried using UniProt IDs (as shown in A), the Plasma Proteome Explorer displays two different types of results (shown in B). These correspond to (i) proteins present in PPD, which are hyperlinked to their corresponding molecule pages, and (ii) proteins not present in PPD, which are linked to an external database, UniProt. Clicking the molecule leads the user to the respective molecule page (shown in C). The graphical representation shows domains and motifs found in the protein. The molecule page also displays the alternate localization of the protein, the associated biological process and the molecular function of the protein. In addition, the plasma concentration reported in healthy individuals, along with corresponding PubMed identifiers and any other information regarding presence in plasma EVs, is displayed. Further, MRM data are provided with information on proteotypic peptides, peptide m/z values, charge, collision energy, transitions determined, type of fragment ions and the mass spectrometer used, along with a link to the corresponding publication (shown in D).
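The Boolean pattern used for the PubMed searches can be illustrated with a small helper that assembles one query string per protein. This is only a sketch: the function name, the quoting of terms and the explicit parentheses are our assumptions, since the text gives the OR/AND pattern but not the exact query syntax that was typed into PubMed.

```python
def build_pubmed_query(gene_symbol, protein_name, aliases=()):
    """Assemble a query of the form (names OR ...) AND (plasma level OR serum level).

    Hypothetical helper for illustration; grouping with parentheses is an
    assumption made so that the AND applies to the whole list of names.
    """
    names = [gene_symbol, protein_name, *aliases]
    # Skip empty entries so a missing alias does not produce an empty term.
    name_clause = " OR ".join(f'"{name}"' for name in names if name)
    level_clause = '"plasma level" OR "serum level"'
    return f"({name_clause}) AND ({level_clause})"


# Example with haptoglobin, the protein shown in Figure 1.
print(build_pubmed_query("HP", "haptoglobin"))
```

Grouping the name terms before applying AND avoids relying on the search engine's default operator precedence.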
Incorporation of data from multiple reaction monitoring experiments
Traditionally, enzyme-linked immunosorbent assay and western blotting have been used for protein identification and quantitation in plasma. The success of these conventional methods relies heavily on the availability of specific antibodies against each protein. These methods also provide limited throughput, as each assay can monitor one protein at a time. Recent advances in MS have enabled simultaneous identification and quantitation of hundreds of proteins in plasma in a single experiment (36–38). Targeted approaches like multiple reaction monitoring (MRM) have enabled quantitation of plasma proteins with high specificity as well as increased throughput (36,38).
MRM assays use multiple parameters to identify and quantify specific proteins in complex samples. Proteotypic peptides that uniquely represent a protein are first selected, and assays are developed which take into account the retention time of the peptides, their m/z values and measurement of specific fragment ions (39). This multi-step process provides superior specificity and allows for accurate quantification of protein levels over a wide dynamic range of abundance (36,40). MRM analysis has thus revolutionized biomarker research and is being widely used to determine altered protein levels across various disease phenotypes. A compilation of proteotypic peptides of plasma proteins that generate good signals in MS experiments, along with their transitions, will be immensely beneficial in developing MRM assays. In PPD, we have annotated 591 reported peptides corresponding to 279 proteins for MRM analysis. This includes details of precursor m/z, fragment ion m/z, collision energy, charge of the precursor ion and the instrument used for the analysis (Figure 1).

[Figure 2: histogram. Horizontal axis: number of proteins (bins 1-10, 11-100, 101-1000, 1001-2000, >2000). Vertical axis: number of articles (0 to 500).]

Figure 2. Distribution of proteins annotated in PPD according to number of corresponding publications. The histogram shows that a high number of proteins (8793 proteins) in plasma have been reported only in a single study. Of the 509 articles annotated, 426 were found to support the presence of 10 or fewer proteins in plasma or serum.

Table 1. A comparison of data in the current version of PPD with the initial database in 2005

Data features                                           PPD-2005        PPD-2014   Number of articles annotated
                                                                                   in the current update
Total number of proteins                                3778            10 546     509
Total number of proteins with plasma levels             Not annotated   1278       276
Total number of proteins derived from plasma
  extracellular vesicles                                Not annotated   318        9
Total number of proteins with MRM data                  Not included    279        24
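The MRM fields annotated in PPD (precursor m/z, fragment ion m/z, collision energy, precursor charge and instrument) map naturally onto a small record type. The sketch below is an assumption about how such an entry could be represented locally; the class and field names are ours, not the PPD schema, and the example values are placeholders rather than real assay data. The helper uses the standard relation m/z = (M + z × 1.007276) / z for a peptide of neutral monoisotopic mass M carrying z protons.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz_from_mass(neutral_mass, charge):
    """Return the m/z of a positively charged ion from its neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


@dataclass(frozen=True)
class MrmTransition:
    """One precursor-to-fragment transition; field names are hypothetical."""
    protein: str
    peptide: str
    precursor_mz: float
    precursor_charge: int
    fragment_mz: float
    fragment_ion: str        # e.g. a y- or b-type ion label
    collision_energy: float
    instrument: str


# Placeholder entry to show the shape of a record (not real assay data).
example = MrmTransition(
    protein="EXAMPLE_PROTEIN",
    peptide="PEPTIDEK",
    precursor_mz=mz_from_mass(1000.0, 2),
    precursor_charge=2,
    fragment_mz=500.0,
    fragment_ion="y4",
    collision_energy=20.0,
    instrument="triple quadrupole",
)
print(round(example.precursor_mz, 6))
```

Keeping the record immutable makes it safe to use transitions as dictionary keys or set members when comparing assay lists.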
Proteins identified from extracellular vesicles in plasma
Extracellular vesicles (EVs) are membranous sacs released by both normal and diseased cell types and are also reported in body fluids such as blood, urine, saliva and synovial fluid (41–44). EVs include ectosomes, exosomes and apoptotic bodies. They facilitate communication between cells by transfer of membrane and cytosolic proteins and are also shown to play a role in mediating immune response and antigen presentation (44). EVs are shown to be present in plasma from healthy individuals and from patients with certain cancers, including glioblastoma and colon cancer (45,46). Therefore, identification and characterization of EVs in plasma should aid in disease diagnosis and discovery of biomarkers. Considering the importance of EVs, we have curated and included plasma EV proteins in PPD. There are 318 EV proteins annotated in PPD, and each is linked to ExoCarta and Vesiclepedia, both manually curated resources for proteins, messenger RNA, microRNA and lipids reported from EVs, including exosomes (42,43).
Plasma Proteome Explorer: a batch query interface in PPD
We have enabled a batch query utility called 'Plasma Proteome Explorer', which allows users to query the plasma proteome by providing a list of peptides, gene symbols, Entrez Gene IDs, RefSeq accessions or UniProt IDs. The data analysis workflow of Plasma Proteome Explorer is represented in Figure 1. The results are displayed in a web page with the following features: PPD ID (hyperlinked to the PPD 'molecule page'), gene symbol and Entrez Gene ID.
Plasma Proteome Explorer will enable biologists with limited bioinformatics skills to compare their own experimental data set with the plasma proteome. In a single step, users can differentiate novel plasma proteins and novel MS-derived peptides detected from plasma from known plasma proteins and peptides. This will enable researchers in biomarker development to screen candidate molecules that can be detected in plasma from a larger set of candidate protein biomarkers identified in the discovery phase, and to pursue them further for validation. We believe that Plasma Proteome Explorer will occupy the intermediate niche between the 'biomarker discovery' and 'biomarker validation' phases. We have enabled a separate query tab, which allows querying of proteins based on protein name, gene symbol, Entrez Gene ID, UniProt ID, sample type, experimental method and MS-based platform. In addition, search by protein sequence can be carried out in PPD. The 'Browse' option allows the user to browse proteins categorized on the basis of their protein name, biological process, molecular function, proteins with information on MRM data and plasma exosomal proteins.
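The screening step that the batch query performs, splitting a candidate list into entries already known in plasma and entries that appear novel, can be mimicked locally once a reference list has been downloaded. The following sketch is an illustration under our own assumptions: the function name and the normalization rules are hypothetical and do not describe how the PPD web tool is implemented. P00738 (haptoglobin) and P02768 (serum albumin) are UniProt accessions; X00001 is a made-up placeholder.

```python
def screen_candidates(candidate_ids, known_plasma_ids):
    """Split candidate identifiers into (known, novel) relative to a reference set.

    Identifiers are trimmed and upper-cased, and duplicates are reported once,
    in order of first appearance. These normalization rules are assumptions.
    """
    known_set = {identifier.strip().upper() for identifier in known_plasma_ids}
    known, novel, seen = [], [], set()
    for raw in candidate_ids:
        key = raw.strip().upper()
        if not key or key in seen:
            continue  # skip blanks and repeated identifiers
        seen.add(key)
        (known if key in known_set else novel).append(key)
    return known, novel


reference = {"P00738", "P02768"}  # stand-in for a downloaded list of PPD proteins
query = ["P00738", " p02768 ", "X00001", "P00738", ""]
known, novel = screen_candidates(query, reference)
print("known in plasma:", known)
print("not in reference:", novel)
```

Using a set for the reference list keeps each lookup constant-time, which matters when a discovery-phase candidate list runs to thousands of entries.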
CONCLUSIONS
The main goal of the PPD is to foster research in the area of clinical proteomics, especially the discovery of plasma biomarkers to diagnose and monitor human diseases. We have introduced a number of new features to expand the utility of this database, particularly in the field of biomarker discovery and validation. Plasma Proteome Explorer will prove useful for selecting secreted candidate biomarkers from a larger set of differentially expressed genes/proteins in any disease or perturbed biological condition. As the database develops, we foresee the need for the incorporation of information on disease association and availability of reagents such as enzyme-linked immunosorbent assay kits. We anticipate that PPD will provide a platform to bring together the communities of biomedical researchers engaged in biomarker discovery and validation, along with proteomic researchers developing analytical solutions to overcome problems in plasma proteome research. We foresee that PPD will serve as a centralized reference repository for such a community approach.
ACKNOWLEDGEMENTS
The authors thank the Department of Biotechnology (DBT), Government of India, for research support to the Institute of Bioinformatics, Bangalore. B.M. and A.R. are recipients of Senior Research Fellowships from the Council of Scientific and Industrial Research (CSIR), New Delhi, India. R.S. is a Research Associate supported by DBT. S.M.S. is a recipient of a Senior Research Fellowship from the University Grants Commission (UGC), India. H.C.H. is a Wellcome Trust/DBT India Alliance Early Career Fellow. T.S.K.P. is a recipient of the research grant on 'Development of Infrastructure and a Computational Framework for Analysis of Proteomic Data' from DBT. A.P. is supported by a contract HHSN268201000032C from the National Heart, Lung and Blood Institute and an NIH Roadmap grant 'Technology Center for Networks and Pathways' [U54GM103520].
FUNDING
Funding for open access charge: Waived by Oxford University Press.

Conflict of interest statement. None declared.
REFERENCES
1. Anderson NL and Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 1, 845–867.
2. Zhang H, Liu AY, Loriaux P, Wollscheid B, Zhou Y, Watts JD and Aebersold R (2007) Mass spectrometric detection of tissue proteins in plasma. Mol Cell Proteomics 6, 64–71.
3. Tirumalai RS, Chan KC, Prieto DA, Issaq HJ, Conrads TP and Veenstra TD (2003) Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics 2, 1096–1103.
4. Ping P, Vondriska TM, Creighton CJ, Gandhi TK, Yang Z, Menon R, Kwon MS, Cho SY, Drwal G, Kellmann M et al. (2005) A functional annotation of subproteomes in human plasma. Proteomics 5, 3506–3519.
5. Yang Z, Hancock WS, Chew TR and Bonilla L (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics 5, 3353–3366.
6. Cristofanilli M, Budd GT, Ellis MJ, Stopeck A, Matera J, Miller MC, Reuben JM, Doyle GV, Allard WJ, Terstappen LW et al. (2004) Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med 351, 781–791.
7. Hanash SM, Pitteri SJ and Faca VM (2008) Mining the plasma proteome for cancer biomarkers. Nature 452, 571–579.
8. Kopreski MS, Benko FA, Kwak LW and Gocke CD (1999) Detection of tumor messenger RNA in the serum of patients with malignant melanoma. Clin Cancer Res 5, 1961–1965.
9. Wang BG, Huang HY, Chen YC, Bristow RE, Kassauei K, Cheng CC, Roden R, Sokoll LJ, Chan DW and Shih IeM (2003) Increased plasma DNA integrity in cancer patients. Cancer Res 63, 3966–3968.
10. Lathrop JT, Anderson NL, Anderson NG and Hammond DJ (2003) Therapeutic potential of the plasma proteome. Curr Opin Mol Ther 5, 250–257.
11. Rosenblatt KP, Bryant-Greenwood P, Killian JK, Mehta A, Geho D, Espina V, Petricoin EF III and Liotta LA (2004) Serum proteomics in cancer diagnosis and management. Annu Rev Med 55, 97–112.
12. Agranoff D, Fernandez-Reyes D, Papadopoulos MC, Rojas SA, Herbster M, Loosemore A, Tarelli E, Sheldon J, Schwenk A, Pollok R et al. (2006) Identification of diagnostic markers for tuberculosis by proteomic fingerprinting of serum. Lancet 368, 1012–1021.
13. Bauer JW, Baechler EC, Petri M, Batliwalla FM, Crawford D, Ortmann WA, Espe KJ, Li W, Patel DD, Gregersen PK et al. (2006) Elevated serum levels of interferon-regulated chemokines are biomarkers for active human systemic lupus erythematosus. PLoS Med 3, e491.
14. Berhane BT, Zong C, Liem DA, Huang A, Le S, Edmondson RD, Jones RC, Qiao X, Whitelegge JP, Ping P et al. (2005) Cardiovascular-related proteins identified in human plasma by the HUPO plasma proteome project pilot phase. Proteomics 5, 3520–3530.
15. He QY, Lau GK, Zhou Y, Yuen ST, Lin MC, Kung HF and Chiu JF (2003) Serum biomarkers of hepatitis B virus infected liver inflammation: a proteomic study. Proteomics 3, 666–674.
16. Kim HJ, Cho EH, Yoo JH, Kim PK, Shin JS, Kim MR and Kim CW (2007) Proteome analysis of serum from type 2 diabetics with nephropathy. J Proteome Res 6, 735–743.
17. Liao H, Wu J, Kuhn E, Chin W, Chang B, Jones MD, O'Neil S, Clauser KR, Karl J, Hasler F et al. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum 50, 3792–3803.
18. Sawai S, Umemura H, Mori M, Satoh M, Hayakawa S, Kodera Y, Tomonaga T, Kuwabara S and Nomura F (2010) Serum levels of complement C4 fragments correlate with disease activity in multiple sclerosis: proteomic analysis. J Neuroimmunol 218, 112–115.
19. Zimmermann-Ivol CG, Burkhard PR, Le Floch-Rohr J, Allard L, Hochstrasser DF and Sanchez JC (2004) Fatty acid binding protein as a serum marker for the early diagnosis of stroke: a pilot study. Mol Cell Proteomics 3, 66–72.
20. Adkins JN, Varnum SM, Auberry KJ, Moore RJ, Angell NH, Smith RD, Springer DL and Pounds JG (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics 1, 947–955.
21. Omenn GS (2004) The Human Proteome Organization Plasma Proteome Project pilot phase: reference specimens, technology platform comparisons, and standardized data submissions and analyses. Proteomics 4, 1235–1240.
22. Omenn GS, States DJ, Adamski M, Blackwell TW, Menon R, Hermjakob H, Apweiler R, Haab BB, Simpson RJ, Eddes JS et al. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 5, 3226–3245.
23. Muthusamy B, Hanumanthu G, Suresh S, Rekha B, Srinivas D, Karthick L, Vrushabendra BM, Sharma S, Mishra G, Chatterjee P et al. (2005) Plasma Proteome Database as a resource for proteomics research. Proteomics 5, 3531–3536.
24. Liu X, Valentine SJ, Plasencia MD, Trimpin S, Naylor S and Clemmer DE (2007) Mapping the human plasma proteome by SCX-LC-IMS-MS. J Am Soc Mass Spectrom 18, 1249–1264.
25. Farrah T, Deutsch EW and Aebersold R (2011) Using the human plasma PeptideAtlas to study human plasma proteins. Methods Mol Biol 728, 349–374.
26. Farrah T, Deutsch EW, Omenn GS, Campbell DS, Sun Z, Bletz JA, Mallick P, Katz JE, Malmstrom J, Ossola R et al. (2011) A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Mol Cell Proteomics 10, M110.006353.
27. Vizcaino JA, Cote RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J et al. (2013) The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res 41, D1063–D1069.
28. Li SJ, Peng M, Li H, Liu BS, Wang C, Wu JR, Li YX and Zeng R (2009) Sys-BodyFluid: a systematical database for human body fluid proteome research. Nucleic Acids Res 37, D907–D912.
29. Saha S, Harrison SH, Shen C, Tang H, Radivojac P, Arnold RJ, Zhang X and Chen JY (2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC Med Genomics 1, 12.
30. Prasad TSK, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A et al. (2009) Human protein reference database: 2009 update. Nucleic Acids Res 37, D767–D772.
31. Kandasamy K, Mohan SS, Raju R, Keerthikumar S, Kumar GS, Venugopal AK, Telikicherla D, Navarro JD, Mathivanan S, Pecquet C et al. (2010) NetPath: a public resource of curated signal transduction pathways. Genome Biol 11, R3.
32. Maglott D, Ostell J, Pruitt KD and Tatusova T (2011) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 39, D52–D57.
33. UniProt Consortium (2013) Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res 41, D43–D47.
34. Anderson L (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J Physiol 563, 23–60.
35. Polanski M and Anderson NL (2007) A list of candidate cancer biomarkers for targeted proteomics. Biomark Insights 1, 1–48.
36. Domanski D, Percy AJ, Yang J, Chambers AG, Hill JS, Freue GV and Borchers CH (2012) MRM-based multiplexed quantitation of 67 putative cardiovascular disease biomarkers in human plasma. Proteomics 12, 1222–1243.
37. Huttenhain R, Soste M, Selevsek N, Rost H, Sethi A, Carapito C, Farrah T, Deutsch EW, Kusebauch U, Moritz RL et al. (2012) Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics. Sci Transl Med 4, 142ra194.
38. Percy AJ, Chambers AG, Yang J and Borchers CH (2013) Multiplexed MRM-based quantitation of candidate cancer biomarker proteins in undepleted and non-enriched human plasma. Proteomics 13, 2202–2215.
39. Lange V, Picotti P, Domon B and Aebersold R (2008) Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol 4, 222.
40. Stahl-Zeng J, Lange V, Ossola R, Eckhardt K, Krek W, Aebersold R and Domon B (2007) High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol Cell Proteomics 6, 1809–1817.
41. Caby MP, Lankar D, Vincendeau-Scherrer C, Raposo G and Bonnerot C (2005) Exosomal-like vesicles are present in human blood plasma. Int Immunol 17, 879–887.
42. Kalra H, Simpson RJ, Ji H, Aikawa E, Altevogt P, Askenase P, Bond VC, Borras FE, Breakefield X, Budnik V et al. (2012) Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol 10, e1001450.
43. Mathivanan S, Fahner CJ, Reid GE and Simpson RJ (2012) ExoCarta 2012: database of exosomal proteins, RNA and lipids. Nucleic Acids Res 40, D1241–D1244.
44. Simpson RJ, Lim JW, Moritz RL and Mathivanan S (2009) Exosomes: proteomic insights and diagnostic potential. Expert Rev Proteomics 6, 267–283.
45. Ji H, Greening DW, Barnes TW, Lim JW, Tauro BJ, Rai A, Xu R, Adda C, Mathivanan S, Zhao W et al. (2013) Proteome profiling of exosomes derived from human primary and metastatic colorectal cancer cells reveal differential expression of key metastatic factors and signal transduction components. Proteomics 13, 1672–1686.
46. Shao H, Chung J, Balaj L, Charest A, Bigner DD, Carter BS, Hochberg FH, Breakefield XO, Weissleder R and Lee H (2012) Protein typing of circulating microvesicles allows real-time monitoring of glioblastoma therapy. Nat Med 18, 1835–1840.
Nucleic Acids Research 2014 Vol 42 Database issue D965
thrust in the field of biomarkers we have enabled abatch-based query designated Plasma ProteomeExplorer which will permit the users in screening alist of proteins or peptides against known plasmaproteins to assess novelty of their data set Webelieve that PPD will facilitate both clinical andbasic research by serving as a comprehensive ref-erence of plasma proteins in humans and acceleratebiomarker discovery and translation efforts
INTRODUCTION
The plasma proteome represents an important subproteome, as it harbors proteins secreted from almost all tissues (1,2). In addition to classical blood proteins, plasma contains proteins secreted by various cells, glands and tissues, along with proteins derived from commensal and infectious organisms and parasites residing inside the body (1). The plasma proteome comprises 22 highly abundant proteins, including albumin, immunoglobulins, transferrin and haptoglobin, which make up 99% of total protein abundance in plasma. The remaining fraction is composed of proteins of much lower abundance, including proteolytically cleaved protein fragments (3). This wide dynamic range of protein abundance, greater than 10 orders of magnitude, makes the plasma proteome challenging to analyze. Plasma proteins also undergo various types of post-translational modifications, such as glycosylation, which add to its complexity (4,5). An additional level of heterogeneity is brought about by inter- and intra-individual variation due to age, gender and genetic factors. All these factors make plasma the most complex and diverse body fluid, and its analysis poses a challenge for the proteomic community.
Despite the analysis of plasma not being straightforward from an analytical standpoint, it is the most investigated body fluid in clinical diagnostics. Components of plasma, including circulating tumor cells, cell-free RNA and DNA, metabolites, electrolytes and proteins, are all considered as molecular markers for early detection of diseases, disease monitoring and prognosis (6–9). When compared with other body fluids such as cerebrospinal fluid, gastric juice, bile and synovial fluid, plasma is more readily accessible and requires a simple collection procedure (1,10,11). Thus, detection and quantitation of endogenous or foreign antigens, or antibodies directed against these antigens, in plasma could be used to determine the physiological and pathological states of an individual (4). Several plasma or serum proteins have already been identified as potential biomarkers of diseases, including cardiovascular diseases, autoimmune diseases, infectious diseases and neurological disorders (12–19).
Owing to the importance of plasma proteins, several proteomic efforts have been carried out to explore human plasma proteins. In 2002, Anderson and Anderson compiled immunoassay- and 2D gel electrophoresis-based investigations of the plasma proteome and reported the presence of 289 proteins (1). Adkins et al. used two different separation techniques followed by mass spectrometry (MS) for characterization of proteins from depleted serum and reported 490 proteins (20). The Human Proteome Organization (HUPO) initiated a pilot phase of a major community initiative, the Human Plasma Proteome Project (HPPP), in 2002 to determine the human plasma or serum protein constituents (21). With the involvement of 35 laboratories across the globe, this led to the identification of 3020 plasma proteins (22). As a part of this initiative, we developed a web-based resource called the Plasma Proteome Database (PPD) (23). Recent incorporation of depletion strategies to remove high-abundance proteins, and of multiple fractionation strategies coupled to LC-MS/MS approaches and high-resolution MS, has resulted in a substantial increase in the number of proteins identified from plasma. For example, a study by Liu et al. coupled two different fractionation methods to MS and identified 9087 proteins in plasma, which is the largest data set on plasma proteins reported thus far (24). Farrah et al. analyzed raw MS/MS data from plasma/serum submitted to the Proteomics Identifications Database (PRIDE) and the Human Plasma PeptideAtlas using the Trans-Proteomic Pipeline and reported a set of 1929 high-confidence proteins (25–27). We systematically documented these newly described human plasma proteins and made them available to the biomedical community through the updated version of PPD.
RESULTS
Documentation of plasma proteins in PPD
The plasma proteome occupies an important niche at the interface of proteomics, diagnostics and medicine. To enable sharing of data across laboratories involved in HPPP, we developed the PPD as a web-based resource for plasma proteins (23). To keep up with the exponential increase in plasma proteome data published in recent times, we systematically documented newly described human plasma proteins and curated information pertaining to them in PPD. As a first step, we carried out a PubMed-based literature search for groups that have extensively contributed to plasma or serum proteomics. Additional PubMed searches were carried out to fetch scientific articles using keywords pertaining to plasma proteins. The information from articles was annotated in PPD by experienced researchers at the Institute of Bioinformatics (IOB) along with clinical investigators from collaborating clinical centers in India. Currently, PPD contains 10 546 proteins linked to 509 scientific articles. Because the experimental data were obtained from diverse experimental platforms and published in the form of peer-reviewed research articles, we did not attempt any post hoc determination of false positives in the database. As an alternative, to increase the confidence of identification of proteins in plasma or serum, we have included the number of articles reporting the detection of each protein in plasma or serum. This is listed as ‘Total number of studies’ under ‘Experimental evidence’ in the molecule page of the respective protein, as shown in Figure 1. In this database, the detection of 12 proteins in plasma is supported by >50 publications and that of 167 proteins is supported by >10 publications. Of the remaining 10 367 proteins, 2199 are supported by two publications, whereas 6762 proteins are supported by a single publication. A histogram depicting the distribution of proteins with corresponding number of publications is shown in Figure 2. In the current update, we have included peptide data for proteins identified from MS-based studies, which is one of the newly added features of PPD. A comparison of previous and current versions of PPD, provided in Table 1, shows a nearly 3-fold increase of protein information in this 2014 update (from 3778 to 10 546 proteins). From our analysis, we observed that 4668 and 1972 proteins were unique to PPD when compared with data from other publicly available data resources on plasma or serum proteins, namely Sys-BodyFluid and the Healthy Human Individual’s Integrated Plasma Proteome (HIP2), respectively (28,29). The data present in the Sys-BodyFluid database were annotated from 15 articles, whereas the HIP2 database contained plasma proteome data cataloged from 4 articles. In contrast to other databases that provide plasma/serum protein catalogs generated from a limited number of studies, PPD includes data annotated from 509 peer-reviewed publications.
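The ‘Total number of studies’ count described above amounts to counting distinct articles per protein and then grouping proteins by that count. The following Python sketch illustrates the idea under stated assumptions: the (protein, PubMed ID) pair layout, the function names and the toy identifiers are hypothetical and do not reflect how PPD stores or computes its data, and the ‘3-10’ tier is added only so that every count falls into some bin.

```python
from collections import Counter, defaultdict


def studies_per_protein(annotations):
    """Map each protein to the number of distinct articles reporting it.

    `annotations` is an iterable of (protein_id, pubmed_id) pairs; this
    layout is a hypothetical stand-in for curated records.
    """
    articles = defaultdict(set)
    for protein_id, pubmed_id in annotations:
        articles[protein_id].add(pubmed_id)  # a set ignores repeated pairs
    return {protein: len(pmids) for protein, pmids in articles.items()}


def evidence_tiers(counts):
    """Group proteins by how many articles support their detection."""
    tiers = Counter()
    for n in counts.values():
        if n > 50:
            tiers[">50"] += 1
        elif n > 10:
            tiers["11-50"] += 1
        elif n > 2:
            tiers["3-10"] += 1
        elif n == 2:
            tiers["2"] += 1
        else:
            tiers["1"] += 1
    return dict(tiers)


# Toy records with made-up identifiers, for illustration only.
toy = [("P1", 101), ("P1", 102), ("P1", 102), ("P2", 101),
       ("P3", 103), ("P3", 104), ("P3", 105)]
counts = studies_per_protein(toy)
print(counts)
print(evidence_tiers(counts))
```

Counting distinct articles rather than raw records keeps a protein reported several times within one article from appearing better supported than it is.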
In addition to documentation of plasma proteins and peptide information for each protein in PPD, we have provided external links to the Human Protein Reference Database (HPRD) (30), NetPath (31), Entrez Gene (32) and UniProt (33) (Figure 1). For the benefit of users, a collection of important published research articles and reviews describing high-throughput investigations on the plasma/serum proteome has been provided. We are seeking participation of the scientific community by inviting investigators to submit relevant articles for inclusion in PPD and have created a new portal in PPD to submit such articles. To facilitate engagement with the community, there is a ‘feedback’ option, which allows users to submit their comments or suggestions. Finally, we have provided an option to download PPD data in XML and Microsoft Excel formats (http://www.plasmaproteomedatabase.org/download).
Figure 1. Data analysis workflow using Plasma Proteome Explorer and a screenshot of a ‘Molecule Page’ for Haptoglobin in PPD. When queried using UniProt IDs (as shown in A), the plasma proteome explorer displays two different types of results (shown in B). These correspond to (i) proteins present in PPD, which are hyperlinked to their corresponding molecule pages, and (ii) proteins not present in PPD, which are linked to an external database, UniProt. Clicking the molecule leads the user to the respective molecule page (shown in C). The graphical representation shows domains and motifs found in the protein. The molecule page also displays the alternate localization of the protein, the associated biological process and the molecular function of the protein. In addition, the plasma concentration reported in healthy individuals, along with corresponding PubMed identifiers and any other information regarding presence in plasma EVs, is displayed. Further, MRM data are provided with information on proteotypic peptides, peptide m/z values, charge, collision energy, transitions determined, type of fragment ions and the mass spectrometer used, along with a link to the corresponding publication (shown in D).
Plasma concentration of proteins
An important step in plasma biomarker discovery is the relative quantification of proteins in healthy and disease conditions. Determination of relative and absolute concentration of plasma proteins is helpful for monitoring disease predisposition, onset, diagnosis and progression. Several studies have been carried out in this direction. An early effort was reported by Anderson and Anderson in 2002, where they cataloged plasma concentration ranges for 70 proteins from published literature (1). Subsequently, in 2005 and 2008, their group documented plasma levels for 177 and 211 proteins, respectively, which were reported to be associated with cardiovascular disease, stroke and cancer based on literature evidence (34,35). Recently, Farrah et al. provided an estimate of plasma levels for 1200 proteins based on spectral counting (26).
We have incorporated plasma protein levels in this update of PPD and documented plasma protein concentrations for 1278 proteins from 276 articles. Availability of this information will be useful for clinicians and researchers alike. To document plasma protein levels, we searched PubMed using ‘Gene symbol’ OR ‘Protein name’ OR ‘alternate name(s) of the protein’ AND ‘Plasma level’ OR ‘Serum level’ as queries. We engaged four medical students (S.K., B.N.J., K.V.S. and R.K.K., who are also co-authors) to document the plasma protein levels in normal individuals from the literature. The plasma protein concentration data in PPD can be browsed as shown in Figure 1, along with the method used for the measurement in each case. We also developed a web-based submission portal (http://www.plasmaproteomedatabase.org/protein_concentration) for submission of data pertaining to protein concentration in plasma/serum by the clinical/analytical community. Researchers can submit the information obtained from their study through this portal, and the information will be processed in PPD along with the source of the information. The ‘Plasma protein concentration data’ file contains a list of plasma proteins sorted by their concentration in plasma/serum. This file can be downloaded at http://www.plasmaproteomedatabase.org/download.
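The PubMed query pattern described above can be assembled programmatically for each protein. The sketch below is a minimal illustration, not the procedure actually used for curation: the function name, the phrase quoting and the way the two OR groups are parenthesized are assumptions, since the text does not specify operator grouping or search field tags.

```python
def build_pubmed_query(gene_symbol, protein_name, alternate_names=()):
    """Combine protein identifiers and level terms into one boolean query.

    Assumed grouping: (any name of the protein) AND (plasma or serum level).
    """
    names = [gene_symbol, protein_name, *alternate_names]
    # Quote each term so that multi-word names are searched as phrases.
    name_clause = " OR ".join(f'"{name}"' for name in names if name)
    level_clause = '"plasma level" OR "serum level"'
    return f"({name_clause}) AND ({level_clause})"


# Example terms chosen for illustration only.
query = build_pubmed_query("ALB", "albumin", ["serum albumin"])
print(query)
# ("ALB" OR "albumin" OR "serum albumin") AND ("plasma level" OR "serum level")
```

Explicit parentheses avoid relying on the search engine's default operator precedence, which would otherwise change which terms the AND applies to.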
Incorporation of data from multiple reaction monitoring experiments
Traditionally, enzyme-linked immunosorbent assay and western blotting have been used for protein identification and quantitation in plasma. The success of these conventional methods relies heavily on the availability of specific antibodies against each protein. These methods also provide limited throughput, as each assay can monitor one protein at a time. Recent advances in MS have enabled simultaneous identification and quantitation of hundreds of proteins in plasma in a single experiment (36–38). Targeted approaches like multiple reaction monitoring (MRM) have enabled quantitation of plasma proteins with high specificity as well as increased throughput (36,38).
[Figure 2: histogram; x-axis: Number of proteins (bins 1-10, 11-100, 101-1000, 1001-2000, >2000); y-axis: Number of articles (0-500)]
Figure 2. Distribution of proteins annotated in PPD according to number of corresponding publications. The histogram shows that a high number of proteins (8793 proteins) in plasma have been reported only in a single study. Of the 509 articles annotated, 426 were found to support the presence of ≤10 proteins in plasma or serum.

Table 1. A comparison of data in the current version of PPD with the initial database in 2005

Data features                                                          PPD-2005        PPD-2014   Number of articles annotated in the current update
Total number of proteins                                               3778            10 546     509
Total number of proteins with plasma levels                            Not annotated   1278       276
Total number of proteins derived from plasma extracellular vesicles    Not annotated   318        9
Total number of proteins with MRM data                                 Not included    279        24

MRM assays use multiple parameters to identify and quantify specific proteins in complex samples. Proteotypic peptides that uniquely represent a protein are first selected, and assays are developed that take into account the retention time of the peptides, their m/z values and measurement of specific fragment ions (39). This multi-step process provides superior specificity and allows for accurate quantification of protein levels over a wide dynamic range of abundance (36,40). MRM analysis has thus revolutionized biomarker research and is being widely used to determine altered protein levels across various disease phenotypes. A compilation of proteotypic peptides of plasma proteins that generate good signals in MS experiments, along with their transitions, will be immensely beneficial in developing MRM assays. In PPD, we have annotated 591 reported peptides corresponding to 279 proteins for MRM analysis. This includes details of precursor m/z, fragment ion m/z, collision energy, charge of the precursor ion and instrument used for the analysis (Figure 1).
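The MRM fields listed above map naturally onto a small record type, and the precursor m/z follows from the neutral peptide mass M and charge z as (M + z × proton mass) / z. The Python sketch below illustrates both; the class name, field names and all numeric values in the example are hypothetical placeholders rather than PPD's schema or any measured transition.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def mz_from_neutral_mass(neutral_mass, charge):
    """Return the m/z of a positive ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


@dataclass(frozen=True)
class MRMTransition:
    """One precursor-to-fragment transition (hypothetical field layout)."""
    peptide: str
    precursor_mz: float
    precursor_charge: int
    fragment_ion: str
    fragment_mz: float
    collision_energy: float
    instrument: str


# Placeholder values for illustration; not taken from any publication.
transition = MRMTransition(
    peptide="PEPTIDER",
    precursor_mz=mz_from_neutral_mass(1000.0, 2),
    precursor_charge=2,
    fragment_ion="y5",
    fragment_mz=600.0,
    collision_energy=20.0,
    instrument="hypothetical triple quadrupole",
)
print(round(transition.precursor_mz, 6))
```

A frozen dataclass keeps each annotated transition immutable, which suits reference data that is read far more often than it is changed.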
Proteins identified from extracellular vesicles in plasma
Extracellular vesicles (EVs) are membranous sacs released by both normal and diseased cell types and are also reported in body fluids such as blood, urine, saliva and synovial fluid (41–44). EVs include ectosomes, exosomes and apoptotic bodies. They facilitate communication between cells by transfer of membrane and cytosolic proteins and are also shown to play a role in mediating immune response and antigen presentation (44). EVs have been shown to be present in plasma from healthy individuals and from patients with certain cancers, including glioblastoma and colon cancer (45,46). Therefore, identification and characterization of EVs in plasma should aid in disease diagnosis and discovery of biomarkers. Considering the importance of EVs, we have curated and included plasma EV proteins in PPD. There are 318 EV proteins annotated in PPD, and each is linked to ExoCarta and Vesiclepedia, both manually curated resources for proteins, messenger RNA, microRNA and lipids reported from EVs, including exosomes (42,43).
Plasma Proteome Explorer: a batch query interface in PPD
We have enabled a batch query utility called ‘Plasma Proteome Explorer’, which allows users to query the plasma proteome by providing a list of peptides, gene symbols, Entrez Gene IDs, RefSeq accessions or UniProt IDs. The data analysis workflow of Plasma Proteome Explorer is represented in Figure 1. The results obtained are displayed in a web page with the following features: PPD ID, which is hyperlinked to the PPD ‘molecule page’, gene symbol and Entrez Gene ID.
Plasma Proteome Explorer will enable biologists with limited bioinformatics skills to compare their own experimental data set with the plasma proteome. In a single step, users can differentiate between novel plasma proteins, novel MS-derived peptides detected from plasma, and known plasma proteins and peptides. This will enable researchers in biomarker development to screen candidate molecules that can be detected in plasma from a larger set of candidate protein biomarkers identified in the discovery phase, and to pursue them further for validation. We believe that Plasma Proteome Explorer will occupy the intermediate niche between the ‘biomarker discovery’ and ‘biomarker validation’ phases. We have enabled a separate query tab, which allows querying of proteins based on protein name, gene symbol, Entrez Gene ID, UniProt ID, sample type, experimental method and MS-based platform. In addition, search by protein sequence can be carried out in PPD. The ‘Browse’ option allows the user to browse proteins categorized on the basis of their protein name, biological process, molecular function, proteins with information on MRM data and plasma exosomal proteins.
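The batch comparison that Plasma Proteome Explorer performs can be thought of as splitting a query list into identifiers already present in a reference set and identifiers absent from it. The Python sketch below illustrates that idea only; it is not the code behind the web interface, and the function name, the normalization rules and the accession-like strings are assumptions made for the example.

```python
def screen_against_reference(query_ids, reference_ids):
    """Split query identifiers into those found in the reference set and those not.

    Identifiers are trimmed and upper-cased before comparison; blank entries
    and repeats in the query list are skipped. Input order is preserved.
    """
    reference = {identifier.strip().upper() for identifier in reference_ids}
    known, novel, seen = [], [], set()
    for raw in query_ids:
        identifier = raw.strip().upper()
        if not identifier or identifier in seen:
            continue
        seen.add(identifier)
        (known if identifier in reference else novel).append(identifier)
    return known, novel


# Toy accession-like strings, for illustration only.
reference_set = ["P00738", "P02768"]
user_list = ["p00738", "Q00000 ", "P00738", ""]
known, novel = screen_against_reference(user_list, reference_set)
print("known:", known)   # known: ['P00738']
print("novel:", novel)   # novel: ['Q00000']
```

Normalizing case and whitespace before the lookup matters in practice, because identifier lists pasted from spreadsheets often carry trailing spaces or mixed case.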
CONCLUSIONS
The main goal of the PPD is to foster research in the area of clinical proteomics, especially the discovery of plasma biomarkers to diagnose and monitor human diseases. We have introduced a number of new features to expand the utility of this database, particularly in the field of biomarker discovery and validation. Plasma Proteome Explorer will prove useful for selecting secreted candidate biomarkers from a larger set of differentially expressed genes/proteins in any disease or perturbed biological condition. As the database develops, we foresee the need for the incorporation of information on disease association and availability of reagents such as enzyme-linked immunosorbent assay kits. We anticipate that PPD will provide a platform to bring together the communities of biomedical researchers engaged in biomarker discovery and validation, along with proteomic researchers developing analytical solutions to overcome problems in plasma proteome research. We foresee that PPD will serve as a centralized reference repository for such a community approach.
ACKNOWLEDGEMENTS
The authors thank the Department of Biotechnology (DBT), Government of India, for research support to the Institute of Bioinformatics, Bangalore. B.M. and A.R. are recipients of Senior Research Fellowships from the Council of Scientific and Industrial Research (CSIR), New Delhi, India. R.S. is a Research Associate supported by DBT. S.M.S. is a recipient of a Senior Research Fellowship from the University Grants Commission (UGC), India. H.C.H. is a Wellcome Trust/DBT India Alliance Early Career Fellow. T.S.K.P. is a recipient of the research grant on ‘Development of Infrastructure and a Computational Framework for Analysis of Proteomic Data’ from DBT. A.P. is supported by a contract HHSN268201000032C from the National Heart, Lung and Blood Institute and an NIH Roadmap grant ‘Technology Center for Networks and Pathways’ [U54GM103520].
FUNDING
Funding for open access charge: Waived by Oxford University Press.
Conflict of interest statement. None declared.
REFERENCES
1. Anderson NL and Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics, 1, 845–867.
2. Zhang H, Liu AY, Loriaux P, Wollscheid B, Zhou Y, Watts JD and Aebersold R (2007) Mass spectrometric detection of tissue proteins in plasma. Mol Cell Proteomics, 6, 64–71.
3. Tirumalai RS, Chan KC, Prieto DA, Issaq HJ, Conrads TP and Veenstra TD (2003) Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics, 2, 1096–1103.
4. Ping P, Vondriska TM, Creighton CJ, Gandhi TK, Yang Z, Menon R, Kwon MS, Cho SY, Drwal G, Kellmann M et al. (2005) A functional annotation of subproteomes in human plasma. Proteomics, 5, 3506–3519.
5. Yang Z, Hancock WS, Chew TR and Bonilla L (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics, 5, 3353–3366.
6. Cristofanilli M, Budd GT, Ellis MJ, Stopeck A, Matera J, Miller MC, Reuben JM, Doyle GV, Allard WJ, Terstappen LW et al. (2004) Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med, 351, 781–791.
7. Hanash SM, Pitteri SJ and Faca VM (2008) Mining the plasma proteome for cancer biomarkers. Nature, 452, 571–579.
8. Kopreski MS, Benko FA, Kwak LW and Gocke CD (1999) Detection of tumor messenger RNA in the serum of patients with malignant melanoma. Clin Cancer Res, 5, 1961–1965.
9. Wang BG, Huang HY, Chen YC, Bristow RE, Kassauei K, Cheng CC, Roden R, Sokoll LJ, Chan DW and Shih IeM (2003) Increased plasma DNA integrity in cancer patients. Cancer Res, 63, 3966–3968.
10. Lathrop JT, Anderson NL, Anderson NG and Hammond DJ (2003) Therapeutic potential of the plasma proteome. Curr Opin Mol Ther, 5, 250–257.
11. Rosenblatt KP, Bryant-Greenwood P, Killian JK, Mehta A, Geho D, Espina V, Petricoin EF III and Liotta LA (2004) Serum proteomics in cancer diagnosis and management. Annu Rev Med, 55, 97–112.
12. Agranoff D, Fernandez-Reyes D, Papadopoulos MC, Rojas SA, Herbster M, Loosemore A, Tarelli E, Sheldon J, Schwenk A, Pollok R et al. (2006) Identification of diagnostic markers for tuberculosis by proteomic fingerprinting of serum. Lancet, 368, 1012–1021.
13. Bauer JW, Baechler EC, Petri M, Batliwalla FM, Crawford D, Ortmann WA, Espe KJ, Li W, Patel DD, Gregersen PK et al. (2006) Elevated serum levels of interferon-regulated chemokines are biomarkers for active human systemic lupus erythematosus. PLoS Med, 3, e491.
14. Berhane BT, Zong C, Liem DA, Huang A, Le S, Edmondson RD, Jones RC, Qiao X, Whitelegge JP, Ping P et al. (2005) Cardiovascular-related proteins identified in human plasma by the HUPO plasma proteome project pilot phase. Proteomics, 5, 3520–3530.
15. He QY, Lau GK, Zhou Y, Yuen ST, Lin MC, Kung HF and Chiu JF (2003) Serum biomarkers of hepatitis B virus infected liver inflammation: a proteomic study. Proteomics, 3, 666–674.
16. Kim HJ, Cho EH, Yoo JH, Kim PK, Shin JS, Kim MR and Kim CW (2007) Proteome analysis of serum from type 2 diabetics with nephropathy. J Proteome Res, 6, 735–743.
17. Liao H, Wu J, Kuhn E, Chin W, Chang B, Jones MD, O’Neil S, Clauser KR, Karl J, Hasler F et al. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum, 50, 3792–3803.
18. Sawai S, Umemura H, Mori M, Satoh M, Hayakawa S, Kodera Y, Tomonaga T, Kuwabara S and Nomura F (2010) Serum levels of complement C4 fragments correlate with disease activity in multiple sclerosis: proteomic analysis. J Neuroimmunol, 218, 112–115.
19. Zimmermann-Ivol CG, Burkhard PR, Le Floch-Rohr J, Allard L, Hochstrasser DF and Sanchez JC (2004) Fatty acid binding protein as a serum marker for the early diagnosis of stroke: a pilot study. Mol Cell Proteomics, 3, 66–72.
20. Adkins JN, Varnum SM, Auberry KJ, Moore RJ, Angell NH, Smith RD, Springer DL and Pounds JG (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics, 1, 947–955.
21. Omenn GS (2004) The Human Proteome Organization Plasma Proteome Project pilot phase: reference specimens, technology platform comparisons, and standardized data submissions and analyses. Proteomics, 4, 1235–1240.
22. Omenn GS, States DJ, Adamski M, Blackwell TW, Menon R, Hermjakob H, Apweiler R, Haab BB, Simpson RJ, Eddes JS et al. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics, 5, 3226–3245.
23. Muthusamy B, Hanumanthu G, Suresh S, Rekha B, Srinivas D, Karthick L, Vrushabendra BM, Sharma S, Mishra G, Chatterjee P et al. (2005) Plasma Proteome Database as a resource for proteomics research. Proteomics, 5, 3531–3536.
24. Liu X, Valentine SJ, Plasencia MD, Trimpin S, Naylor S and Clemmer DE (2007) Mapping the human plasma proteome by SCX-LC-IMS-MS. J Am Soc Mass Spectrom, 18, 1249–1264.
25. Farrah T, Deutsch EW and Aebersold R (2011) Using the human plasma PeptideAtlas to study human plasma proteins. Methods Mol Biol, 728, 349–374.
26. Farrah T, Deutsch EW, Omenn GS, Campbell DS, Sun Z, Bletz JA, Mallick P, Katz JE, Malmstrom J, Ossola R et al. (2011) A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Mol Cell Proteomics, 10, M110.006353.
27. Vizcaino JA, Cote RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J et al. (2013) The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res, 41, D1063–D1069.
28. Li SJ, Peng M, Li H, Liu BS, Wang C, Wu JR, Li YX and Zeng R (2009) Sys-BodyFluid: a systematical database for human body fluid proteome research. Nucleic Acids Res, 37, D907–D912.
29. Saha S, Harrison SH, Shen C, Tang H, Radivojac P, Arnold RJ, Zhang X and Chen JY (2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC Med Genomics, 1, 12.
30. Prasad TSK, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A et al. (2009) Human protein reference database–2009 update. Nucleic Acids Res, 37, D767–D772.
31. Kandasamy K, Mohan SS, Raju R, Keerthikumar S, Kumar GS, Venugopal AK, Telikicherla D, Navarro JD, Mathivanan S, Pecquet C et al. (2010) NetPath: a public resource of curated signal transduction pathways. Genome Biol, 11, R3.
32. Maglott D, Ostell J, Pruitt KD and Tatusova T (2011) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res, 39, D52–D57.
33. UniProt Consortium (2013) Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res, 41, D43–D47.
34. Anderson L (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J Physiol, 563, 23–60.
35. Polanski M and Anderson NL (2007) A list of candidate cancer biomarkers for targeted proteomics. Biomark Insights, 1, 1–48.
36. Domanski D, Percy AJ, Yang J, Chambers AG, Hill JS, Freue GV and Borchers CH (2012) MRM-based multiplexed quantitation of 67 putative cardiovascular disease biomarkers in human plasma. Proteomics, 12, 1222–1243.
37. Huttenhain R, Soste M, Selevsek N, Rost H, Sethi A, Carapito C, Farrah T, Deutsch EW, Kusebauch U, Moritz RL et al. (2012) Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics. Sci Transl Med, 4, 142ra194.
38. Percy AJ, Chambers AG, Yang J and Borchers CH (2013) Multiplexed MRM-based quantitation of candidate cancer biomarker proteins in undepleted and non-enriched human plasma. Proteomics, 13, 2202–2215.
39. Lange V, Picotti P, Domon B and Aebersold R (2008) Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol, 4, 222.
40. Stahl-Zeng J, Lange V, Ossola R, Eckhardt K, Krek W, Aebersold R and Domon B (2007) High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol Cell Proteomics, 6, 1809–1817.
41. Caby MP, Lankar D, Vincendeau-Scherrer C, Raposo G and Bonnerot C (2005) Exosomal-like vesicles are present in human blood plasma. Int Immunol, 17, 879–887.
42. Kalra H, Simpson RJ, Ji H, Aikawa E, Altevogt P, Askenase P, Bond VC, Borras FE, Breakefield X, Budnik V et al. (2012) Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol, 10, e1001450.
43. Mathivanan S, Fahner CJ, Reid GE and Simpson RJ (2012) ExoCarta 2012: database of exosomal proteins, RNA and lipids. Nucleic Acids Res, 40, D1241–D1244.
44. Simpson RJ, Lim JW, Moritz RL and Mathivanan S (2009) Exosomes: proteomic insights and diagnostic potential. Expert Rev Proteomics, 6, 267–283.
45. Ji H, Greening DW, Barnes TW, Lim JW, Tauro BJ, Rai A, Xu R, Adda C, Mathivanan S, Zhao W et al. (2013) Proteome profiling of exosomes derived from human primary and metastatic colorectal cancer cells reveal differential expression of key metastatic factors and signal transduction components. Proteomics, 13, 1672–1686.
46. Shao H, Chung J, Balaj L, Charest A, Bigner DD, Carter BS, Hochberg FH, Breakefield XO, Weissleder R and Lee H (2012) Protein typing of circulating microvesicles allows real-time monitoring of glioblastoma therapy. Nat Med, 18, 1835–1840.
Nucleic Acids Research 2014 Vol 42 Database issue D965
that of 167 proteins is supported by gt10 publications Ofthe remaining 10 367 proteins 2199 are supported by twopublications whereas 6762 proteins are supported by asingle publication A histogram depicting the distributionof proteins with corresponding number of publications isshown Figure 2 In the current update we have includedpeptide data for proteins identified from MS-basedstudies which is one of the newly added features ofPPD A comparison of previous and current versionsof PPD provided in Table 1 shows a 3-fold increase ofprotein information in this 2014 update From ouranalysis we observed that 4668 and 1972 proteins wereunique to PPD when compared with data from otherpublicly available data resources on plasma or serumproteinsmdashthe Sys-BodyFluid and Healthy HumanIndividualrsquos Integrated Plasma Proteome (HIP2) respect-ively (2829) The data present in the Sys-BodyFluiddatabase were annotated from 15 articles whereas theHIP2 database contained plasma proteome data catalogedfrom 4 articles In contrast to other databases that provideplasmaserum protein catalogs generated from a limitednumber of studies PPD includes data annotated from509 peer-reviewed publications
In addition to documentation of plasma proteins andpeptide information for each protein in PPD we haveprovided external links to Human Protein ReferenceDatabase (HPRD) (30) NetPath (31) Entrez Gene (32)and UniProt (33) (Figure 1) For the benefit of users acollection of important published research articles andreviews describing high-throughput investigations onplasmaserum proteome has been provided We areseeking participation of the scientific community byinviting investigators to submit relevant articles for inclu-sion in PPD and have created a new portal in PPD tosubmit such articles To facilitate engagement with thecommunity there is a lsquofeedbackrsquo option which allowsusers to submit their comments or suggestions Finallywe have provided an option to download PPD datain XML and Microsoft Excel formats (httpwwwplasmaproteomedatabaseorgdownload)
Plasma concentration of proteins
An important step in plasma biomarker discovery is therelative quantification of proteins in healthy and diseaseconditions Determination of relative and absolute
A
B
D
C
Figure 1 Data analysis workflow using Plasma Proteome Explorer and a screenshot of a lsquoMolecule Pagersquo for Haptoglobin in PPD When queriedusing UniProt IDs (as shown in A) the plasma proteome explorer displays two different type of results (shown in B) These correspond to(i) proteins present in PPD which are hyperlinked to their corresponding molecule pages and (ii) proteins not present in PPD which are linkedto an external database UniProt Clicking the molecule leads the user to the respective molecule page (shown in C) The graphical representationshows domains and motifs found in the protein The molecule page also displays the alternate localization of protein the associated biologicalprocess and molecular function of the protein In addition the plasma concentration reported in healthy individuals along with correspondingPubMed identifiers and any other information regarding presence in plasma EVs is displayed Further MRM data is provided with information onproteotypic peptides peptide mz values charge collision energy transitions determined type of fragment ions and the mass spectrometer usedalong with a link to the corresponding publication (shown in D)
Nucleic Acids Research 2014 Vol 42 Database issue D961
concentration of plasma proteins is helpful for monitoringdisease predisposition onset diagnosis and progressionSeveral studies have been carried out in this directionAn early effort was reported by Anderson and Andersonin 2002 where they cataloged plasma concentration rangefor 70 proteins from published literature (1) Subsequentlyin 2005 and 2008 their group documented plasma levelsfor 177 and 211 proteins respectively which werereported to be associated with cardiovascular diseasestroke and cancer based on literature evidence (3435)Recently Farrah et al provided an estimate for plasmalevels for 1200 proteins based on spectral counting (26)
We have incorporated plasma protein levels in this updateof PPD and documented plasma protein concentrationsfor 1278 proteins from 276 articles Availability of thisinformation will be useful for both clinicians and re-searchers alike To document plasma protein levels wesearched PubMed using Gene symbols OR Protein nameOR alternate name(s) of the protein AND Plasma levelOR Serum level as queries We engaged four medicalstudents (SK BNJ KVS and RKK who are alsoco-authors) to document the plasma protein levels innormal individuals from the literature The plasmaprotein concentration data in PPD can be browsed asshown in Figure 1 along with the method used for themeasurement in each case We also developed a web-based submission portal (httpwwwplasmaproteomedatabaseorgprotein_concentration) for submission ofdata pertaining to protein concentration in plasmaserumby the clinicalanalytical community The researchers cansubmit the information obtained from their study throughthis portal and the information will be processed in PPDalong with the source of the information The lsquoPlasmaprotein concentration datarsquo file contains a list of plasmaproteins sorted based on their concentration in plasmaserum This file can be downloaded at httpwwwplasmaproteomedatabaseorgdownload
Incorporation of data from multiple reaction monitoring experiments
Traditionally, enzyme-linked immunosorbent assay and western blotting have been used for protein identification and quantitation in plasma. The success of these conventional methods relies heavily on the availability of specific antibodies against each protein. These methods also provide limited throughput, as each assay can monitor only one protein at a time. Recent advances in MS have enabled simultaneous identification and quantitation of hundreds of proteins in plasma in a single experiment (36–38). Targeted approaches like multiple reaction monitoring (MRM) have enabled quantitation of plasma proteins with high specificity as well as increased throughput (36,38).
MRM assays use multiple parameters to identify and quantify specific proteins in complex samples. Proteotypic peptides that uniquely represent a protein are first selected, and assays are developed which take into account the retention time of the peptides, their m/z values and measurement of specific fragment ions (39). This multi-step process provides superior specificity and allows for accurate quantification of protein levels over a wide dynamic range of abundance (36,40). MRM analysis has thus revolutionized biomarker research and is being widely used to determine altered protein levels across various disease phenotypes. A compilation of proteotypic peptides of plasma proteins that generate good signals in MS experiments, along with their transitions, will be immensely beneficial in developing MRM assays. In PPD, we have annotated 591 reported peptides corresponding to 279 proteins for MRM analysis. This includes details of precursor m/z, fragment ion m/z, collision energy, charge
Figure 2 plot (image not reproduced). x-axis: Number of proteins (bins 1-10, 11-100, 101-1000, 1001-2000, >2000); y-axis: Number of articles (0 to 500).
Figure 2. Distribution of proteins annotated in PPD according to number of corresponding publications. The histogram shows that a high number of proteins (8793 proteins) in plasma have been reported only in a single study. Of the 509 articles annotated, 426 were found to support the presence of ≤10 proteins in plasma or serum.
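The binning behind a histogram of this kind can be sketched as follows. The bin edges are taken from the axis labels of Figure 2; the function names and the toy input are hypothetical, and nothing here reproduces the actual PPD counts.

```python
from collections import Counter

# Bin edges as read from the Figure 2 axis labels; the last bin is open-ended.
BINS = [(1, 10, "1-10"), (11, 100, "11-100"),
        (101, 1000, "101-1000"), (1001, 2000, "1001-2000")]


def bin_label(protein_count):
    """Return the Figure 2 bin label for the number of proteins in one article."""
    if protein_count < 1:
        raise ValueError("an annotated article must report at least one protein")
    for low, high, label in BINS:
        if low <= protein_count <= high:
            return label
    return ">2000"


def articles_per_bin(proteins_per_article):
    """Count articles in each bin, given a mapping of article ID to protein count."""
    return Counter(bin_label(count) for count in proteins_per_article.values())


if __name__ == "__main__":
    # Toy input with made-up article IDs and counts, for illustration only.
    toy = {"article_a": 3, "article_b": 10, "article_c": 11, "article_d": 2500}
    for label, n_articles in articles_per_bin(toy).items():
        print(label, n_articles)
```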
Table 1. A comparison of data in the current version of PPD with the initial database in 2005

Data features                                                        PPD-2005        PPD-2014   Number of articles annotated in the current update
Total number of proteins                                             3778            10546      509
Total number of proteins with plasma levels                          Not annotated   1278       276
Total number of proteins derived from plasma extracellular vesicles  Not annotated   318        9
Total number of proteins with MRM data                               Not included    279        24
of the precursor ion and instrument used for the analysis (Figure 1).
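As a rough illustration of the fields listed above, one MRM transition could be held in a record like the one below. The class name, field names and example values are all hypothetical and do not reflect PPD's internal schema; the only fixed relationship used is the standard one for a positive ion, m/z = (M + z × proton mass) / z.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # mass of a proton in Da, used for positive-ion m/z


def mz_from_mass(neutral_mass, charge):
    """Return m/z for a positively charged ion: (M + z * proton mass) / z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


@dataclass(frozen=True)
class MrmTransition:
    """One precursor-to-fragment transition (hypothetical record layout)."""
    protein: str
    peptide: str
    precursor_mz: float
    precursor_charge: int
    fragment_ion: str        # ion type label, e.g. "y7"
    fragment_mz: float
    collision_energy: float  # units depend on the instrument
    instrument: str


if __name__ == "__main__":
    # All values below are made up for illustration.
    example = MrmTransition(
        protein="EXAMPLE_PROTEIN",
        peptide="EXAMPLEPEPTIDER",
        precursor_mz=mz_from_mass(1500.0, 2),
        precursor_charge=2,
        fragment_ion="y7",
        fragment_mz=800.4,
        collision_energy=25.0,
        instrument="triple quadrupole (unspecified)",
    )
    print(example)
```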
Proteins identified from extracellular vesicles in plasma
Extracellular vesicles (EVs) are membranous sacs released by both normal and diseased cell types and are also reported in body fluids such as blood, urine, saliva and synovial fluid (41–44). EVs include ectosomes, exosomes and apoptotic bodies. They facilitate communication between cells by transfer of membrane and cytosolic proteins and have also been shown to play a role in mediating immune response and antigen presentation (44). EVs have been shown to be present in plasma from healthy individuals and from patients with certain cancers, including glioblastoma and colon cancer (45,46). Therefore, identification and characterization of EVs in plasma should aid in disease diagnosis and discovery of biomarkers. Considering the importance of EVs, we have curated and included plasma EV proteins in PPD. There are 318 EV proteins annotated in PPD, and each is linked to ExoCarta and Vesiclepedia, both manually curated resources for proteins, messenger RNA, microRNA and lipids reported from EVs, including exosomes (42,43).
Plasma Proteome Explorer: a batch query interface in PPD
We have enabled a batch query utility called ‘Plasma Proteome Explorer’, which allows users to query the plasma proteome by providing a list of peptides, gene symbols, Entrez Gene IDs, RefSeq accessions or UniProt IDs. The data analysis workflow of Plasma Proteome Explorer is represented in Figure 1. The results are displayed in a web page with the following features: PPD ID (hyperlinked to the PPD ‘molecule page’), gene symbol and Entrez Gene ID.
Plasma Proteome Explorer will enable biologists with limited bioinformatics skills to compare their own experimental data set with that of the plasma proteome. In a single step, users can differentiate between novel plasma proteins, novel MS-derived peptides detected from plasma, and known plasma proteins and peptides. This will enable researchers in biomarker development to screen candidate molecules that can be detected in plasma from a larger set of candidate protein biomarkers identified in the discovery phase, to pursue further for validation. We believe that the Plasma Proteome Explorer will occupy the intermediate niche between the ‘biomarker discovery’ and ‘biomarker validation’ phases. We have enabled a separate query tab, which allows querying of proteins based on protein name, gene symbol, Entrez Gene ID, UniProt ID, sample type, experimental method and MS-based platform. In addition, search by protein sequence can be carried out in PPD. The ‘Browse’ option allows the user to browse proteins categorized on the basis of their protein name, biological process and molecular function, as well as proteins with information on MRM data and plasma exosomal proteins.
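The known-versus-novel split that the batch query performs can be sketched as a set lookup. This is a minimal illustration and not the PPD implementation: `classify_ids` is a hypothetical function, and the reference set stands in for whatever identifier list a user has obtained from the PPD download page.

```python
def classify_ids(query_ids, ppd_ids):
    """Split query identifiers into those found in a reference set and those not.

    query_ids: iterable of identifier strings (e.g. UniProt accessions).
    ppd_ids:   set of upper-case identifiers standing in for PPD content.
    Returns (known, novel), each in first-seen order without duplicates.
    """
    known, novel = [], []
    seen = set()
    for raw in query_ids:
        accession = raw.strip().upper()  # normalise whitespace and case
        if not accession or accession in seen:
            continue                     # skip blanks and repeated entries
        seen.add(accession)
        (known if accession in ppd_ids else novel).append(accession)
    return known, novel


if __name__ == "__main__":
    # Toy reference set; "X00001" is a made-up accession for illustration.
    reference = {"P00738", "P02768"}
    known, novel = classify_ids([" p00738 ", "X00001", "P00738"], reference)
    print("known:", known)
    print("novel:", novel)
```

Entries in the first list would correspond to the molecule-page links in Figure 1B, and entries in the second to the links out to UniProt.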
CONCLUSIONS
The main goal of PPD is to foster research in the area of clinical proteomics, especially the discovery of plasma biomarkers to diagnose and monitor human diseases. We have introduced a number of new features to expand the utility of this database, particularly in the field of biomarker discovery and validation. Plasma Proteome Explorer will prove useful for selecting secreted candidate biomarkers from a larger set of differentially expressed genes/proteins in any disease or perturbed biological condition. As the database develops, we foresee the need to incorporate information on disease association and availability of reagents such as enzyme-linked immunosorbent assay kits. We anticipate that PPD will provide a platform to bring together the communities of biomedical researchers engaged in biomarker discovery and validation, along with proteomic researchers developing analytical solutions to overcome problems in plasma proteome research. We foresee that PPD will serve as a centralized reference repository for such a community approach.
ACKNOWLEDGEMENTS
The authors thank the Department of Biotechnology (DBT), Government of India, for research support to the Institute of Bioinformatics, Bangalore. B.M. and A.R. are recipients of Senior Research Fellowships from the Council of Scientific and Industrial Research (CSIR), New Delhi, India. R.S. is a Research Associate supported by DBT. S.M.S. is a recipient of a Senior Research Fellowship from the University Grants Commission (UGC), India. H.C.H. is a Wellcome Trust/DBT India Alliance Early Career Fellow. T.S.K.P. is a recipient of the research grant on ‘Development of Infrastructure and a Computational Framework for Analysis of Proteomic Data’ from DBT. A.P. is supported by a contract, HHSN268201000032C, from the National Heart, Lung and Blood Institute and an NIH Roadmap grant, ‘Technology Center for Networks and Pathways’ [U54GM103520].
FUNDING
Funding for open access charge: Waived by Oxford University Press.
Conflict of interest statement. None declared.
REFERENCES
1. Anderson,N.L. and Anderson,N.G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics, 1, 845–867.
2. Zhang,H., Liu,A.Y., Loriaux,P., Wollscheid,B., Zhou,Y., Watts,J.D. and Aebersold,R. (2007) Mass spectrometric detection of tissue proteins in plasma. Mol. Cell. Proteomics, 6, 64–71.
3. Tirumalai,R.S., Chan,K.C., Prieto,D.A., Issaq,H.J., Conrads,T.P. and Veenstra,T.D. (2003) Characterization of the low molecular weight human serum proteome. Mol. Cell. Proteomics, 2, 1096–1103.
4. Ping,P., Vondriska,T.M., Creighton,C.J., Gandhi,T.K., Yang,Z., Menon,R., Kwon,M.S., Cho,S.Y., Drwal,G., Kellmann,M. et al. (2005) A functional annotation of subproteomes in human plasma. Proteomics, 5, 3506–3519.
5. Yang,Z., Hancock,W.S., Chew,T.R. and Bonilla,L. (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics, 5, 3353–3366.
6. Cristofanilli,M., Budd,G.T., Ellis,M.J., Stopeck,A., Matera,J., Miller,M.C., Reuben,J.M., Doyle,G.V., Allard,W.J., Terstappen,L.W. et al. (2004) Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N. Engl. J. Med., 351, 781–791.
7. Hanash,S.M., Pitteri,S.J. and Faca,V.M. (2008) Mining the plasma proteome for cancer biomarkers. Nature, 452, 571–579.
8. Kopreski,M.S., Benko,F.A., Kwak,L.W. and Gocke,C.D. (1999) Detection of tumor messenger RNA in the serum of patients with malignant melanoma. Clin. Cancer Res., 5, 1961–1965.
9. Wang,B.G., Huang,H.Y., Chen,Y.C., Bristow,R.E., Kassauei,K., Cheng,C.C., Roden,R., Sokoll,L.J., Chan,D.W. and Shih,IeM. (2003) Increased plasma DNA integrity in cancer patients. Cancer Res., 63, 3966–3968.
10. Lathrop,J.T., Anderson,N.L., Anderson,N.G. and Hammond,D.J. (2003) Therapeutic potential of the plasma proteome. Curr. Opin. Mol. Ther., 5, 250–257.
11. Rosenblatt,K.P., Bryant-Greenwood,P., Killian,J.K., Mehta,A., Geho,D., Espina,V., Petricoin,E.F. III and Liotta,L.A. (2004) Serum proteomics in cancer diagnosis and management. Annu. Rev. Med., 55, 97–112.
12. Agranoff,D., Fernandez-Reyes,D., Papadopoulos,M.C., Rojas,S.A., Herbster,M., Loosemore,A., Tarelli,E., Sheldon,J., Schwenk,A., Pollok,R. et al. (2006) Identification of diagnostic markers for tuberculosis by proteomic fingerprinting of serum. Lancet, 368, 1012–1021.
13. Bauer,J.W., Baechler,E.C., Petri,M., Batliwalla,F.M., Crawford,D., Ortmann,W.A., Espe,K.J., Li,W., Patel,D.D., Gregersen,P.K. et al. (2006) Elevated serum levels of interferon-regulated chemokines are biomarkers for active human systemic lupus erythematosus. PLoS Med., 3, e491.
14. Berhane,B.T., Zong,C., Liem,D.A., Huang,A., Le,S., Edmondson,R.D., Jones,R.C., Qiao,X., Whitelegge,J.P., Ping,P. et al. (2005) Cardiovascular-related proteins identified in human plasma by the HUPO plasma proteome project pilot phase. Proteomics, 5, 3520–3530.
15. He,Q.Y., Lau,G.K., Zhou,Y., Yuen,S.T., Lin,M.C., Kung,H.F. and Chiu,J.F. (2003) Serum biomarkers of hepatitis B virus infected liver inflammation: a proteomic study. Proteomics, 3, 666–674.
16. Kim,H.J., Cho,E.H., Yoo,J.H., Kim,P.K., Shin,J.S., Kim,M.R. and Kim,C.W. (2007) Proteome analysis of serum from type 2 diabetics with nephropathy. J. Proteome Res., 6, 735–743.
17. Liao,H., Wu,J., Kuhn,E., Chin,W., Chang,B., Jones,M.D., O'Neil,S., Clauser,K.R., Karl,J., Hasler,F. et al. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum., 50, 3792–3803.
18. Sawai,S., Umemura,H., Mori,M., Satoh,M., Hayakawa,S., Kodera,Y., Tomonaga,T., Kuwabara,S. and Nomura,F. (2010) Serum levels of complement C4 fragments correlate with disease activity in multiple sclerosis: proteomic analysis. J. Neuroimmunol., 218, 112–115.
19. Zimmermann-Ivol,C.G., Burkhard,P.R., Le Floch-Rohr,J., Allard,L., Hochstrasser,D.F. and Sanchez,J.C. (2004) Fatty acid binding protein as a serum marker for the early diagnosis of stroke: a pilot study. Mol. Cell. Proteomics, 3, 66–72.
20. Adkins,J.N., Varnum,S.M., Auberry,K.J., Moore,R.J., Angell,N.H., Smith,R.D., Springer,D.L. and Pounds,J.G. (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell. Proteomics, 1, 947–955.
21. Omenn,G.S. (2004) The Human Proteome Organization Plasma Proteome Project pilot phase: reference specimens, technology platform comparisons, and standardized data submissions and analyses. Proteomics, 4, 1235–1240.
22. Omenn,G.S., States,D.J., Adamski,M., Blackwell,T.W., Menon,R., Hermjakob,H., Apweiler,R., Haab,B.B., Simpson,R.J., Eddes,J.S. et al. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics, 5, 3226–3245.
23. Muthusamy,B., Hanumanthu,G., Suresh,S., Rekha,B., Srinivas,D., Karthick,L., Vrushabendra,B.M., Sharma,S., Mishra,G., Chatterjee,P. et al. (2005) Plasma Proteome Database as a resource for proteomics research. Proteomics, 5, 3531–3536.
24. Liu,X., Valentine,S.J., Plasencia,M.D., Trimpin,S., Naylor,S. and Clemmer,D.E. (2007) Mapping the human plasma proteome by SCX-LC-IMS-MS. J. Am. Soc. Mass Spectrom., 18, 1249–1264.
25. Farrah,T., Deutsch,E.W. and Aebersold,R. (2011) Using the human plasma PeptideAtlas to study human plasma proteins. Methods Mol. Biol., 728, 349–374.
26. Farrah,T., Deutsch,E.W., Omenn,G.S., Campbell,D.S., Sun,Z., Bletz,J.A., Mallick,P., Katz,J.E., Malmstrom,J., Ossola,R. et al. (2011) A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Mol. Cell. Proteomics, 10, M110.006353.
27. Vizcaino,J.A., Cote,R.G., Csordas,A., Dianes,J.A., Fabregat,A., Foster,J.M., Griss,J., Alpi,E., Birim,M., Contell,J. et al. (2013) The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res., 41, D1063–D1069.
28. Li,S.J., Peng,M., Li,H., Liu,B.S., Wang,C., Wu,J.R., Li,Y.X. and Zeng,R. (2009) Sys-BodyFluid: a systematical database for human body fluid proteome research. Nucleic Acids Res., 37, D907–D912.
29. Saha,S., Harrison,S.H., Shen,C., Tang,H., Radivojac,P., Arnold,R.J., Zhang,X. and Chen,J.Y. (2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC Med. Genomics, 1, 12.
30. Prasad,T.S.K., Goel,R., Kandasamy,K., Keerthikumar,S., Kumar,S., Mathivanan,S., Telikicherla,D., Raju,R., Shafreen,B., Venugopal,A. et al. (2009) Human protein reference database - 2009 update. Nucleic Acids Res., 37, D767–D772.
31. Kandasamy,K., Mohan,S.S., Raju,R., Keerthikumar,S., Kumar,G.S., Venugopal,A.K., Telikicherla,D., Navarro,J.D., Mathivanan,S., Pecquet,C. et al. (2010) NetPath: a public resource of curated signal transduction pathways. Genome Biol., 11, R3.
32. Maglott,D., Ostell,J., Pruitt,K.D. and Tatusova,T. (2011) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res., 39, D52–D57.
33. UniProt Consortium. (2013) Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res., 41, D43–D47.
34. Anderson,L. (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J. Physiol., 563, 23–60.
35. Polanski,M. and Anderson,N.L. (2007) A list of candidate cancer biomarkers for targeted proteomics. Biomark. Insights, 1, 1–48.
36. Domanski,D., Percy,A.J., Yang,J., Chambers,A.G., Hill,J.S., Freue,G.V. and Borchers,C.H. (2012) MRM-based multiplexed quantitation of 67 putative cardiovascular disease biomarkers in human plasma. Proteomics, 12, 1222–1243.
37. Huttenhain,R., Soste,M., Selevsek,N., Rost,H., Sethi,A., Carapito,C., Farrah,T., Deutsch,E.W., Kusebauch,U., Moritz,R.L. et al. (2012) Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics. Sci. Transl. Med., 4, 142ra194.
38. Percy,A.J., Chambers,A.G., Yang,J. and Borchers,C.H. (2013) Multiplexed MRM-based quantitation of candidate cancer biomarker proteins in undepleted and non-enriched human plasma. Proteomics, 13, 2202–2215.
39. Lange,V., Picotti,P., Domon,B. and Aebersold,R. (2008) Selected reaction monitoring for quantitative proteomics: a tutorial. Mol. Syst. Biol., 4, 222.
40. Stahl-Zeng,J., Lange,V., Ossola,R., Eckhardt,K., Krek,W., Aebersold,R. and Domon,B. (2007) High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol. Cell. Proteomics, 6, 1809–1817.
41. Caby,M.P., Lankar,D., Vincendeau-Scherrer,C., Raposo,G. and Bonnerot,C. (2005) Exosomal-like vesicles are present in human blood plasma. Int. Immunol., 17, 879–887.
42. Kalra,H., Simpson,R.J., Ji,H., Aikawa,E., Altevogt,P., Askenase,P., Bond,V.C., Borras,F.E., Breakefield,X., Budnik,V. et al. (2012) Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol., 10, e1001450.
43. Mathivanan,S., Fahner,C.J., Reid,G.E. and Simpson,R.J. (2012) ExoCarta 2012: database of exosomal proteins, RNA and lipids. Nucleic Acids Res., 40, D1241–D1244.
44. Simpson,R.J., Lim,J.W., Moritz,R.L. and Mathivanan,S. (2009) Exosomes: proteomic insights and diagnostic potential. Expert Rev. Proteomics, 6, 267–283.
45. Ji,H., Greening,D.W., Barnes,T.W., Lim,J.W., Tauro,B.J., Rai,A., Xu,R., Adda,C., Mathivanan,S., Zhao,W. et al. (2013) Proteome profiling of exosomes derived from human primary and metastatic colorectal cancer cells reveal differential expression of key metastatic factors and signal transduction components. Proteomics, 13, 1672–1686.
46. Shao,H., Chung,J., Balaj,L., Charest,A., Bigner,D.D., Carter,B.S., Hochberg,F.H., Breakefield,X.O., Weissleder,R. and Lee,H. (2012) Protein typing of circulating microvesicles allows real-time monitoring of glioblastoma therapy. Nat. Med., 18, 1835–1840.
Nucleic Acids Research 2014 Vol 42 Database issue D965
of the precursor ion and instrument used for the analysis (Figure 1).
Proteins identified from extracellular vesicles in plasma
Extracellular vesicles (EVs) are membranous sacs released by both normal and diseased cell types and have also been reported in body fluids such as blood, urine, saliva and synovial fluid (41–44). EVs include ectosomes, exosomes and apoptotic bodies. They facilitate communication between cells by transferring membrane and cytosolic proteins and have also been shown to play a role in mediating immune response and antigen presentation (44). EVs have been detected in plasma from healthy individuals and from patients with certain cancers, including glioblastoma and colon cancer (45,46). Therefore, identification and characterization of EVs in plasma should aid in disease diagnosis and the discovery of biomarkers. Considering the importance of EVs, we have curated and included plasma EV proteins in PPD. There are 318 EV proteins annotated in PPD, and each is linked to ExoCarta and Vesiclepedia, both manually curated resources for proteins, messenger RNA, microRNA and lipids reported from EVs, including exosomes (42,43).
Plasma Proteome Explorer: a batch query interface in PPD
We have enabled a batch query utility called 'Plasma Proteome Explorer', which allows users to query the plasma proteome by providing a list of peptides, gene symbols, Entrez Gene IDs, RefSeq accessions or UniProt IDs. The data analysis workflow of Plasma Proteome Explorer is represented in Figure 1. The results are displayed on a web page with the following fields: PPD ID (hyperlinked to the PPD 'molecule page'), gene symbol and Entrez Gene ID.
Plasma Proteome Explorer will enable biologists with limited bioinformatics skills to compare their own experimental data set with the plasma proteome. In a single step, users can differentiate between novel plasma proteins, novel MS-derived peptides detected from plasma, and known plasma proteins and peptides. This will enable researchers in biomarker development to screen, from a larger set of candidate protein biomarkers identified in the discovery phase, those molecules that can be detected in plasma and are worth pursuing for validation. We believe that Plasma Proteome Explorer will occupy the intermediate niche between the 'biomarker discovery' and 'biomarker validation' phases. We have also enabled a separate query tab, which allows querying of proteins based on protein name, gene symbol, Entrez Gene ID, UniProt ID, sample type, experimental method and MS-based platform. In addition, a search by protein sequence can be carried out in PPD. The 'Browse' option allows the user to browse proteins categorized on the basis of their protein name, biological process or molecular function, as well as proteins with information on MRM data and plasma exosomal proteins.
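The known-versus-novel comparison described above can be illustrated with a minimal sketch. It is not the implementation behind Plasma Proteome Explorer, which is not described in this article. The function name `classify_against_reference`, the case-insensitive matching rule and the small reference sets are all assumptions made for illustration, and the identifiers are placeholders rather than an export from PPD.

```python
from typing import Dict, Iterable, List, Set


def classify_against_reference(
    query: Iterable[str], reference: Set[str]
) -> Dict[str, List[str]]:
    """Split query identifiers into those present in or absent from a reference set.

    Identifiers are compared case-insensitively after stripping whitespace.
    Blank entries and duplicates in the query are dropped; input order is kept.
    """
    normalized_reference = {item.strip().upper() for item in reference}
    result: Dict[str, List[str]] = {"known": [], "novel": []}
    seen: Set[str] = set()
    for raw in query:
        key = raw.strip().upper()
        if not key or key in seen:
            continue
        seen.add(key)
        bucket = "known" if key in normalized_reference else "novel"
        result[bucket].append(key)
    return result


# Hypothetical reference sets standing in for database content (illustrative only).
reference_genes = {"ALB", "APOA1", "TF"}
reference_peptides = {"LVNEVTEFAK"}

# A made-up candidate list, as might come out of a discovery experiment.
gene_report = classify_against_reference(["alb", "TF", "GENE_X", "ALB "], reference_genes)
peptide_report = classify_against_reference(["LVNEVTEFAK", "PEPTIDEK"], reference_peptides)
print(gene_report)
print(peptide_report)
```

In this sketch, gene symbols and peptide sequences are checked against separate reference sets, mirroring the distinction the text draws between novel proteins and novel peptides. A real comparison would also need to map between identifier types (Entrez Gene IDs, RefSeq accessions, UniProt IDs), which is omitted here.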
CONCLUSIONS
The main goal of PPD is to foster research in the area of clinical proteomics, especially the discovery of plasma biomarkers to diagnose and monitor human diseases. We have introduced a number of new features to expand the utility of this database, particularly in the field of biomarker discovery and validation. Plasma Proteome Explorer will prove useful for selecting secreted candidate biomarkers from a larger set of differentially expressed genes/proteins in any disease or perturbed biological condition. As the database develops, we foresee the need to incorporate information on disease association and on the availability of reagents such as enzyme-linked immunosorbent assay kits. We anticipate that PPD will provide a platform to bring together the communities of biomedical researchers engaged in biomarker discovery and validation, along with proteomic researchers developing analytical solutions to overcome problems in plasma proteome research. We foresee that PPD will serve as a centralized reference repository for such a community approach.
ACKNOWLEDGEMENTS
The authors thank the Department of Biotechnology (DBT), Government of India, for research support to the Institute of Bioinformatics, Bangalore. B.M. and A.R. are recipients of Senior Research Fellowships from the Council of Scientific and Industrial Research (CSIR), New Delhi, India. R.S. is a Research Associate supported by DBT. S.M.S. is a recipient of a Senior Research Fellowship from the University Grants Commission (UGC), India. H.C.H. is a Wellcome Trust/DBT India Alliance Early Career Fellow. T.S.K.P. is a recipient of the research grant on 'Development of Infrastructure and a Computational Framework for Analysis of Proteomic Data' from DBT. A.P. is supported by a contract, HHSN268201000032C, from the National Heart, Lung and Blood Institute and an NIH Roadmap grant, 'Technology Center for Networks and Pathways' [U54GM103520].
FUNDING
Funding for open access charge: Waived by Oxford University Press.
Conflict of interest statement. None declared.
REFERENCES
1. Anderson NL and Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 1, 845–867.
2. Zhang H, Liu AY, Loriaux P, Wollscheid B, Zhou Y, Watts JD and Aebersold R (2007) Mass spectrometric detection of tissue proteins in plasma. Mol Cell Proteomics 6, 64–71.
3. Tirumalai RS, Chan KC, Prieto DA, Issaq HJ, Conrads TP and Veenstra TD (2003) Characterization of the low molecular weight human serum proteome. Mol Cell Proteomics 2, 1096–1103.
Nucleic Acids Research 2014 Vol 42 Database issue D963
4. Ping P, Vondriska TM, Creighton CJ, Gandhi TK, Yang Z, Menon R, Kwon MS, Cho SY, Drwal G, Kellmann M et al. (2005) A functional annotation of subproteomes in human plasma. Proteomics 5, 3506–3519.
5. Yang Z, Hancock WS, Chew TR and Bonilla L (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics 5, 3353–3366.
6. Cristofanilli M, Budd GT, Ellis MJ, Stopeck A, Matera J, Miller MC, Reuben JM, Doyle GV, Allard WJ, Terstappen LW et al. (2004) Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med 351, 781–791.
7. Hanash SM, Pitteri SJ and Faca VM (2008) Mining the plasma proteome for cancer biomarkers. Nature 452, 571–579.
8. Kopreski MS, Benko FA, Kwak LW and Gocke CD (1999) Detection of tumor messenger RNA in the serum of patients with malignant melanoma. Clin Cancer Res 5, 1961–1965.
9. Wang BG, Huang HY, Chen YC, Bristow RE, Kassauei K, Cheng CC, Roden R, Sokoll LJ, Chan DW and Shih IeM (2003) Increased plasma DNA integrity in cancer patients. Cancer Res 63, 3966–3968.
10. Lathrop JT, Anderson NL, Anderson NG and Hammond DJ (2003) Therapeutic potential of the plasma proteome. Curr Opin Mol Ther 5, 250–257.
11. Rosenblatt KP, Bryant-Greenwood P, Killian JK, Mehta A, Geho D, Espina V, Petricoin EF III and Liotta LA (2004) Serum proteomics in cancer diagnosis and management. Annu Rev Med 55, 97–112.
12. Agranoff D, Fernandez-Reyes D, Papadopoulos MC, Rojas SA, Herbster M, Loosemore A, Tarelli E, Sheldon J, Schwenk A, Pollok R et al. (2006) Identification of diagnostic markers for tuberculosis by proteomic fingerprinting of serum. Lancet 368, 1012–1021.
13. Bauer JW, Baechler EC, Petri M, Batliwalla FM, Crawford D, Ortmann WA, Espe KJ, Li W, Patel DD, Gregersen PK et al. (2006) Elevated serum levels of interferon-regulated chemokines are biomarkers for active human systemic lupus erythematosus. PLoS Med 3, e491.
14. Berhane BT, Zong C, Liem DA, Huang A, Le S, Edmondson RD, Jones RC, Qiao X, Whitelegge JP, Ping P et al. (2005) Cardiovascular-related proteins identified in human plasma by the HUPO plasma proteome project pilot phase. Proteomics 5, 3520–3530.
15. He QY, Lau GK, Zhou Y, Yuen ST, Lin MC, Kung HF and Chiu JF (2003) Serum biomarkers of hepatitis B virus infected liver inflammation: a proteomic study. Proteomics 3, 666–674.
16. Kim HJ, Cho EH, Yoo JH, Kim PK, Shin JS, Kim MR and Kim CW (2007) Proteome analysis of serum from type 2 diabetics with nephropathy. J Proteome Res 6, 735–743.
17. Liao H, Wu J, Kuhn E, Chin W, Chang B, Jones MD, O'Neil S, Clauser KR, Karl J, Hasler F et al. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum 50, 3792–3803.
18. Sawai S, Umemura H, Mori M, Satoh M, Hayakawa S, Kodera Y, Tomonaga T, Kuwabara S and Nomura F (2010) Serum levels of complement C4 fragments correlate with disease activity in multiple sclerosis: proteomic analysis. J Neuroimmunol 218, 112–115.
19. Zimmermann-Ivol CG, Burkhard PR, Le Floch-Rohr J, Allard L, Hochstrasser DF and Sanchez JC (2004) Fatty acid binding protein as a serum marker for the early diagnosis of stroke: a pilot study. Mol Cell Proteomics 3, 66–72.
20. Adkins JN, Varnum SM, Auberry KJ, Moore RJ, Angell NH, Smith RD, Springer DL and Pounds JG (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics 1, 947–955.
21. Omenn GS (2004) The Human Proteome Organization Plasma Proteome Project pilot phase: reference specimens, technology platform comparisons, and standardized data submissions and analyses. Proteomics 4, 1235–1240.
22. Omenn GS, States DJ, Adamski M, Blackwell TW, Menon R, Hermjakob H, Apweiler R, Haab BB, Simpson RJ, Eddes JS et al. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 5, 3226–3245.
23. Muthusamy B, Hanumanthu G, Suresh S, Rekha B, Srinivas D, Karthick L, Vrushabendra BM, Sharma S, Mishra G, Chatterjee P et al. (2005) Plasma Proteome Database as a resource for proteomics research. Proteomics 5, 3531–3536.
24. Liu X, Valentine SJ, Plasencia MD, Trimpin S, Naylor S and Clemmer DE (2007) Mapping the human plasma proteome by SCX-LC-IMS-MS. J Am Soc Mass Spectrom 18, 1249–1264.
25. Farrah T, Deutsch EW and Aebersold R (2011) Using the human plasma PeptideAtlas to study human plasma proteins. Methods Mol Biol 728, 349–374.
26. Farrah T, Deutsch EW, Omenn GS, Campbell DS, Sun Z, Bletz JA, Mallick P, Katz JE, Malmstrom J, Ossola R et al. (2011) A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Mol Cell Proteomics 10, M110.006353.
27. Vizcaino JA, Cote RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J et al. (2013) The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res 41, D1063–D1069.
28. Li SJ, Peng M, Li H, Liu BS, Wang C, Wu JR, Li YX and Zeng R (2009) Sys-BodyFluid: a systematical database for human body fluid proteome research. Nucleic Acids Res 37, D907–D912.
29. Saha S, Harrison SH, Shen C, Tang H, Radivojac P, Arnold RJ, Zhang X and Chen JY (2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC Med Genomics 1, 12.
30. Prasad TSK, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A et al. (2009) Human protein reference database – 2009 update. Nucleic Acids Res 37, D767–D772.
31. Kandasamy K, Mohan SS, Raju R, Keerthikumar S, Kumar GS, Venugopal AK, Telikicherla D, Navarro JD, Mathivanan S, Pecquet C et al. (2010) NetPath: a public resource of curated signal transduction pathways. Genome Biol 11, R3.
32. Maglott D, Ostell J, Pruitt KD and Tatusova T (2011) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 39, D52–D57.
33. UniProt Consortium (2013) Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res 41, D43–D47.
34. Anderson L (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J Physiol 563, 23–60.
35. Polanski M and Anderson NL (2007) A list of candidate cancer biomarkers for targeted proteomics. Biomark Insights 1, 1–48.
36. Domanski D, Percy AJ, Yang J, Chambers AG, Hill JS, Freue GV and Borchers CH (2012) MRM-based multiplexed quantitation of 67 putative cardiovascular disease biomarkers in human plasma. Proteomics 12, 1222–1243.
37. Huttenhain R, Soste M, Selevsek N, Rost H, Sethi A, Carapito C, Farrah T, Deutsch EW, Kusebauch U, Moritz RL et al. (2012) Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics. Sci Transl Med 4, 142ra194.
38. Percy AJ, Chambers AG, Yang J and Borchers CH (2013) Multiplexed MRM-based quantitation of candidate cancer biomarker proteins in undepleted and non-enriched human plasma. Proteomics 13, 2202–2215.
39. Lange V, Picotti P, Domon B and Aebersold R (2008) Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol 4, 222.
40. Stahl-Zeng J, Lange V, Ossola R, Eckhardt K, Krek W, Aebersold R and Domon B (2007) High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol Cell Proteomics 6, 1809–1817.
41. Caby MP, Lankar D, Vincendeau-Scherrer C, Raposo G and Bonnerot C (2005) Exosomal-like vesicles are present in human blood plasma. Int Immunol 17, 879–887.
42. Kalra H, Simpson RJ, Ji H, Aikawa E, Altevogt P, Askenase P, Bond VC, Borras FE, Breakefield X, Budnik V et al. (2012) Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol 10, e1001450.
43. Mathivanan S, Fahner CJ, Reid GE and Simpson RJ (2012) ExoCarta 2012: database of exosomal proteins, RNA and lipids. Nucleic Acids Res 40, D1241–D1244.
44. Simpson RJ, Lim JW, Moritz RL and Mathivanan S (2009) Exosomes: proteomic insights and diagnostic potential. Expert Rev Proteomics 6, 267–283.
45. Ji H, Greening DW, Barnes TW, Lim JW, Tauro BJ, Rai A, Xu R, Adda C, Mathivanan S, Zhao W et al. (2013) Proteome profiling of exosomes derived from human primary and metastatic colorectal cancer cells reveal differential expression of key metastatic factors and signal transduction components. Proteomics 13, 1672–1686.
46. Shao H, Chung J, Balaj L, Charest A, Bigner DD, Carter BS, Hochberg FH, Breakefield XO, Weissleder R and Lee H (2012) Protein typing of circulating microvesicles allows real-time monitoring of glioblastoma therapy. Nat Med 18, 1835–1840.