SMPDB 2.0: Big Improvements to the Small Molecule Pathway Database

Post on 11-Apr-2023

1 views 0 download

Transcript of SMPDB 2.0: Big Improvements to the Small Molecule Pathway Database

SMPDB 20 Big Improvements to the SmallMolecule Pathway DatabaseTimothy Jewison1 Yilu Su1 Fatemeh Miri Disfany1 Yongjie Liang1 Craig Knox1

Adam Maciejewski1 Jenna Poelzer1 Jessica Huynh1 You Zhou1 David Arndt1

Yannick Djoumbou1 Yifeng Liu1 Lu Deng1 An Chi Guo1 Beomsoo Han1 Allison Pon1

Michael Wilson1 Shahrzad Rafatnia1 Philip Liu1 and David S Wishart123

1Department of Computing Science University of Alberta Edmonton AB Canada T6G 2E8 2Department ofBiological Sciences University of Alberta Edmonton AB Canada T6G 2E8 and 3National Institute forNanotechnology 11421 Saskatchewan Drive Edmonton AB Canada T6G 2M9

Received September 15 2013 Revised October 12 2013 Accepted October 14 2013

ABSTRACT

The Small Molecule Pathway Database (SMPDBhttpwwwsmpdbca) is a comprehensivecolorful fully searchable and highly interactivedatabase for visualizing human metabolic drugaction drug metabolism physiological activity andmetabolic disease pathways SMPDB contains gt600pathways with nearly 75 of its pathways not foundin any other database All SMPDB pathway diagramsare extensively hyperlinked and include detailed in-formation on the relevant tissues organs organ-elles subcellular compartments protein cofactorsprotein locations metabolite locations chemicalstructures and protein quaternary structures Sinceits last release in 2010 SMPDB has undergone sub-stantial upgrades and significant expansion In par-ticular the total number of pathways in SMPDBhas grown by gt70 Additionally every previouslyentered pathway has been completely redrawnstandardized corrected updated and enhancedwith additional molecular or cellular informationMany SMPDB pathways now include transporterproteins as well as much more physiologicaltissue target organ and reaction compartmentdata Thanks to the development of a standardizedpathway drawing tool (called PathWhiz) all SMPDBpathways are now much more easily drawn and farmore rapidly updated PathWhiz has also allowed allSMPDB pathways to be saved in a BioPAX formatSignificant improvements to SMPDBrsquos visualizationinterface now make the browsing selection re-coloring and zooming of pathways far easier andfar more intuitive Because of its utility and

breadth of coverage SMPDB is now integratedinto several other databases including HMDB andDrugBank

INTRODUCTION

Biological pathways are the wiring diagrams of life Theyprovide a rich and surprisingly compact view of howgenes proteins and metabolites work together in cellstissues or organs In fact most of todayrsquos life scientistslearned their biochemistry and molecular biology bystudying richly annotated carefully hand-drawnpathway diagrams found in books or on wall chartsWith the advent of the internet many biologicalpathway diagrams started migrating from the printedpage to the web As a result there is now a plethora ofvery popular very high-quality web-accessible pathwaydatabases such as KEGG (1) the lsquoCycrsquo databases (23)the Reactome database (4) WikiPathways (5) andPharmGKB (6) The advantages of using the web to illus-trate and disseminate biological pathway information aremanifold These include greater accessibility improvedinteractivity and enhanced functionality However thelimitations associated with creating hyperlinked web illus-trations often means that visual interactivity has to comeat the expense of artistic quality or biochemical detailThis can be particularly problematic in the fields ofclinical metabolomics and clinical chemistry wheredetails about chemical structures as well as targetorgans tissue locations organelle function and physio-logical activity are particularly importantIn an effort to make web-based biological pathway data

more visually appealing and physiologically relevant par-ticularly for biomedical applications we developed theSmall Molecule Pathway Database or SMPDB (7) Thefirst version of SMPDB which was released in 2010

To whom correspondence should be addressed Tel +1 780 492 0383 Fax +1 780 492 1071 Email davidwishartualbertaca

Nucleic Acids Research 2013 1ndash7doi101093nargkt1067

The Author(s) 2013 Published by Oxford University PressThis is an Open Access article distributed under the terms of the Creative Commons Attribution License (httpcreativecommonsorglicensesby30) whichpermits unrestricted reuse distribution and reproduction in any medium provided the original work is properly cited

Nucleic Acids Research Advance Access published November 6 2013 at U

niversity of Alberta on N

ovember 25 2013

httpnaroxfordjournalsorgD

ownloaded from

contained gt350 colorful artistically rendered and bio-chemically complete pathway illustrations describingmany key aspects of human metabolism These includedhyperlinked diagrams of human metabolic pathwaysmetabolic disease pathways metabolite signalingpathways and drug-action pathways Many of thesepathway diagrams included information on the tissuelocations relevant organs organelles subcellular com-partments protein cofactors protein locations metabolitelocations chemical structures and protein quaternarystructures As with many high-quality web-basedpathway databases SMPDB also provided extensivehyperlinking browsing searching and annotation func-tions along with detailed pathway descriptions andreferencesHowever the first version of SMPDB was not without

some shortcomings In particular it was difficult to updateand maintain it was limited in size and scope itspathways lacked a certain degree of standardization itsphysiological information on organelles and organs wasincomplete and its visual interactivity (zooming and imagenavigation) was awkward and somewhat restrictedFurthermore SMPDB pathway diagrams were onlydownloadable as static images and not available instandard formats such as BioPAX or SBML (8) WhileSMPDB did link to well-known databases such asDrugBank (9) HMDB (10) or UniProt (11) these data-bases did not link to it In other words SMPDB was notparticularly visible to the metabolomics drug research orsystems biology communitiesWith the release of SMPDB 20 we believe we have

addressed all of these shortcomings In particular wehave developed tools to simplify SMPDBrsquos maintenanceand enhance its illustration standards We have also cor-rected and added much more physiological (tissue organorganelle and transporter) information substantiallyimproved its visual interactivity and quality createddownloadable BioPAX pathway files for all of SMPDBrsquospathways and increased SMPDBrsquos visibility by integratingSMPDB into DrugBank HMDB and MetaboAnalystFinally we have substantially expanded the size andscope of SMPDB to include gt600 pathways (a 70increase) and have added a large number of drug metab-olism and physiological action pathways With theseenhancements we believe that SMPDB 20 will appealto a much broader community of researchers and willoffer users a much more enriched interactive and inform-ative experience A detailed description of SMPDB 20follows

INCREASED SIZE AND SCOPE

When it was initially released SMPDB (version 10) con-tained 350 hand-drawn pathways These small moleculepathways were limited to three general categories basicmetabolism metabolic diseases and selected drug-actionpathways Today SMPDB 20 contains gt600 pathways(a 70 increase) and it now includes six kinds ofsmall molecule pathway categories including basic metab-olism metabolic diseases drug-action pathways drug

metabolism pathways metabolite signaling pathwaysand physiological action pathways The inclusion ofdrug metabolism pathways in SMPDB 20 was driven byuser requests and the rapidly growing interest in drug me-tabolism in many fields of drug research and clinicalmetabolomics The addition of metabolite signaling andphysiological action pathways was motivated by the needto handle a larger number of drug-action pathways It wasalso brought on by the growing awareness by members ofthe metabolomics and nutritional science communitiesthat many metabolic pathways act at the level of organsand tissues and not only within cells In addition to thesignificant numerical increase in SMPDBrsquos size and scopemany of the pathway diagrams in SMPDB 20 have beenenhanced with additional cellular or physiological infor-mation Now nearly every SMPDB pathway includes in-formation about where the reactions occur (cellularcompartment(s) organs or tissues) the siteorgan ofaction for drugs (for drug-action pathways) or toxic me-tabolites (for metabolic disorders) as well as informationon key membrane-bound transporters (which move drugsand metabolites in and out of almost every cell) This kindof physiological or extracellular information is rarelycaptured or displayed in other online pathway databasesHowever because SMPDB is a human-only pathwayresource this sort of physiological cellular biochemicalor biomedical information is particularly importantConsequently considerable effort over the past 3 yearshas been put into capturing and displaying these data

As with its predecessor SMPDB 20 is still strictlyfocused on curating human-only small molecule-onlypathways Therefore SMPDB only displays thosepathways where small molecules (metabolites or drugslt1500 daltons) play key roles andor where small mol-ecules represent a significant proportion of functionalentities in a given pathway All of the pathways inSMPDB 20 still retain the standard (and in some casesunique) database functionalities of the original databaseincluding hyperlinked structures (to HMDB orDrugBank) hyperlinked protein images (to UniProt)text searching utilities (TextQuery) chemical searchingutilities (ChemQuery) sequence searching utilities(Sequence Search) pathway browsing capabilities (SMP-Browse with filtering options) detailed pathway descrip-tions extensive references metabolite highlightingcapabilities (SMP-Analyzer and SMP-Highlight)pathway legends and metabolite mapping formetabolomic applications (SMP-MAP) Additionaldetails about these functions and how they can be usedare provided in the original SMPDB paper An updatedand detailed comparison of SMPDB 20 to its predecessor(SMPDB 10) and to other pathway databases is providedin Table 1

ENHANCEMENTS IN VISUALIZATION ANDINTERACTIVITY

One of the distinguishing features of SMPDB has been thestrong focus on providing users with high-quality artistic-ally pleasing pathway diagrams that were not only correct

2 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

and informative but also colorful interactive and richlydetailed This continues to be a major focus for SMPDB20 To this end a significant effort over the past year hasbeen put into enhancing the visual displays and interactiv-ity in SMPDB The previous version of SMPDB onlyoffered a three-step zooming capability that was furtherhindered by the requirement that users navigate throughthe zoomed-in images using multiple scroll bars This re-stricted zooming capability proved to be both awkwardand inconvenient for many users Likewise the placementof the pathway descriptions and references (above andbelow the pathway images) limited how much of apathway could be viewed on a web page or computerscreen In addition the size of the chemical structures(too large) the text labels (too small) and the pathwayarrows (too thin) was often problematic for a number ofpathways depending on what level of zooming was usedThe uniformly dark blue background (to indicate anaqueous environment) was also challenging for thosewishing to print SMPDB pathways on paper forpreparing slides and for visualizing certain features or dis-tinguishing among subcellular locations

For SMPDB 20 a completely new visualization enginewas built using scalable vector graphics (SVG) and aweb interface technology inspired by Google Maps(httpenwikipediaorgwikiGoogle_Maps) This en-hancement now allows rapid and continuous zoomingusing a mouse scroll wheel or through simply clickingon-screen zoom icons It also allows facile navigationaround zoomed-in pathway diagrams in SMPDB 20through a simple click-and-drag operation or throughclicking on-screen updown or leftright arrows locatednear the on-screen zoom functions A full-screen view ofeach SMPDB map is also available This view can betoggled off and on by clicking the full-screen iconlocated on SMPDBrsquos new information and control panel(located on the right side of each pathway map) The

information and control panel allows users to easilyclick on labeled buttons to view text (pathway descriptionsand references) highlight (SMP-Highlight) annotate con-centration data (SMP-Analyze) download files and adjustimage settings By moving the old text boxes containingpathway descriptions and pathway references to SMPDB20rsquos information and control panel more on-screen realestate is now available for viewing pathway diagramsFurthermore by resizing recoloring and standardizingthe size of the chemical structures arrows and text in allof SMPDBrsquos pathway images most features under mostzooming conditions should be much more visible andeasily discerned Using the image settings located in thecontrol panel users can also adjust the background colors(from blue to white) and the cellular membrane display(simple versus detailed) to facilitate printing imagecapture or slide preparation The recoloring process hasalso been extended to other parts of SMPDB as well Inparticular all reactions that take place in certain organ-elles or subcellular locations have now been recolored sothat the organellersquos background color serves as the back-ground color for the (zoomed-in) reaction This shouldmake compartment-specific reactions or pathways farmore distinctive and discernable Another enhancementto SMPDB is now available through the home pagewhere a scrollable lsquocarouselrsquo of SMPDB pathway imagesis available This carousel which can be scrolled throughby simply lsquoswipingrsquo the mouse over the images will allowusers to view and select newly loaded or recently updatedSMPDB pathway maps A montage of many of these visu-alization features and enhancements is shown in Figure 1

IMPROVED STANDARDIZATION AND REDUCEDMAINTENANCE

SMPDB was originally built by hand-drawing eachpathway using PowerPoint incorporating a collection of

Table 1 Comparison of SMPDB 20 to SMPDB 10 to KEGG HumanCyc Reactome BioCarta and WikiPathwaysGenMAPP

Feature SMPDB 20 SMPDB 10 KEGG Reactome HumanCyc BioCarta WikiPathways

Number of metabolic pathways 92 70 78 (forhumans)

64 (forhumans)

333 (shortpathways)

69 50 (forhumans)

Number of disease pathways 221 113 52 11 0 24 16Number of drug action pathways 232 168 0 3 3 10 5Number of drug metabolism pathways 53 0 8 1 0 0 2Number of physio action pathways 5 0 31 36 0 5 22Number of small mol signaling pathways 15 0 30 27 0 30 15Provides multiple organism pathways No No Yes Yes No Yes YesChemical structures shown in diagrams Yes Yes No No Yes (when zoomed) Some NoProtein 4o structures shown in diagrams Yes Yes No No No Some NoCell structures shown in pathway diagrams Yes Yes Some Yes No Yes NoOrgans shown in pathway diagrams Most Some No No No No NoDescriptions of pathways provided Detailed Detailed Limited Detailed Detailed Detailed LimitedPathway images are hyperlinked Yes Yes Yes Yes Yes Yes NoPathway images are easily zoomable Yes No No Yes Yes No LimitedInformation provided on pathway entities Detailed Detailed Modest Modest Moderate Limited LimitedSupports advanced text search Yes Yes No Yes Yes No NoSupports sequence searching Yes Yes No No Yes No NoSupports graphical chem structure search Yes Yes No No No No NoSupports chemical expression mapping Yes Yes Yes Yes Yes No NoSupports geneprotein expression mapping Yes Yes Yes Yes Yes No NoDownloadable Yes Yes Limited Yes Yes No YesBioPax CellML or SBML compatible Yes No Partial Yes Yes No No

Nucleic Acids Research 2013 3

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

Figure 1 A screenshot montage of SMPDB 20rsquos various viewing and searching features

4 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

visual icons and a well-defined standard operatingprotocol (SOP) Each PowerPoint image was then con-verted to three differently sized PNG images (smallmedium and large) and manually image-mapped to appro-priate database links Prior to the image mapping processpathway constituents (proteins and chemicals) weremanually identified their database IDs manuallydetermined and each data point entered in a separate filefor each SMPDB pathway These text files were used topermit text and structure searching as well as compoundhighlighting (via SMP-MAP and SMP-Highlight) Thismanually intensive process proved to be arduous time-consuming and error prone The many different skillsand multiple steps required to generate and post aSMPDB pathway meant that a team of more than 10 dif-ferently trained individuals were needed to constructupdate and maintain the database Even with SOPs anda standardized icon library some individual variabilityoccurred with regard to pathway layout pathwaydetails icon placement directional arrow size and otherimage features Additionally the manual placement ofimage-mapped hyperlinks meant that some links wereplaced imprecisely leading to off-centre effects thatbecame further exacerbated through zooming IfSMPDB pathways needed to be updated or correctedmultiple images still had to be generated and multipleimage maps still had to be prepared If a large numberof highly similar pathways were involved (such as drugmechanism pathways) a simple single molecule updatecould often take many hours of repetitive error-proneediting Because all of SMPDBrsquos pathways were essen-tially image-mapped pictures it was not possible to easilyrender them into BioPAX or SBML formats As a resultefforts to convert SMPDB pathways into BioPAX orSBML equivalents were stymied

To simplify the updating process and improve the levelof standardization and figure rendering consistency forSMPDB 20 we developed an online pathway illustrationand editing tool called PathWhiz (manuscript in prepar-ation) PathWhiz which is a combination of an onlinedigital painting program and a webpage buildingprogram allowed SMPDB curators to digitally renderupdate and annotate pathways in a fraction of the timethat the old lsquoanalogrsquo approach required PathWhiz alsoallowed our curation team to produce far more consist-ently rendered pathways and consistently sized structuresthat looked as if they all came from a single artistPathWhiz uses a pull-down library of standard imageicons for membranes membrane compartments organstissues organelles membranes and proteins to help accel-erate the pathway illustration process It also has astandard set of arrows and arrow-drawing utilities thatallow all SMPDB reaction components to be linked orconnected in a uniform visually pleasing fashionBecause PathWhiz also captures icon placement and an-notation information digitally it allowed all hyperlinks tobe precisely mapped all pathway images to be rendered asSVG and all pathway component information (namesstructures database IDs etc) to be readily capturedThe updating process is also made much easier as allchanges in one pathway (images hyperlinks names

database IDs) can be digitally transferred to all othersimilar pathways at the click of a button The digitalnature of the drawing and icon placement processthrough PathWhiz meant that it also became very easyto convert all SMPDB 20 pathways into BioPAXformat Conversion of the SMPDB pathways to SBMLis underway now and should be completed before 2014Thanks to PathWhiz all SMPDB pathways have beensignificantly updated corrected and improved Likewisemany new pathways were able to be added in a fractionof the time (compared to the old approach) allowing asignificant increase in both the quantity and quality ofpathways in SMPDB 20

ENHANCED DATA DOWNLOADS

Another important and increasingly unique feature ofSMPDB is the open availability of its data In particularSMPDB supports both data downloads and image down-loads for all of its pathways The data files include proteinand gene sequences for all the annotated proteins andgenes in SMPDB SMPDB downloads also include all me-tabolite and drug structures (in SDF format) as well asCSV files containing different combinations of compoundnames or protein names linked to pathways and databaseidentifiers The image files (in SMPDB 10) included boththe original PowerPoint files and the differently sized(small medium large) PNG files corresponding to eachpathway With the release of SMPDB 20 the same kind ofdata download information is still available although thenumber of sequences structures and database links is nowsubstantially greater In addition SMPDBrsquos data down-loads now include all of its pathway information inBioPAX format BioPAX is an RDFOWL-basedstandard exchange language designed to compactly repre-sent biological pathways at the molecular and cellularlevel (12) The availability of SMPDBrsquos data in BioPAXformat and the impending availability of the samepathway data in SBML (expected in late 2013) shouldgreatly expand SMPDBrsquos appeal to systems biologists aswell as its potential utility in a variety of pathway bio-chemical and metabolomic analysis packages It is alsoexpected that the availability of SMPDB BioPAX (andSBML) files will encourage greater interest in communityannotation allowing SMPDB to evolve from a centrallycurated database (which is inefficient) to one that may becurated and enhanced by a community of experts (which ismore efficient) SMPDB 20rsquos image file downloads havealso been modified and enhanced to include not only high-resolution PNG files but also a SVG images of everypathway with all of the BioPAX data imbedded (asmetadata) into every file With these enhancements toSMPDB 20rsquos downloadable resources we believe it nowoffers the most comprehensive collection of downloadabledata of any online pathway database

IMPROVED QUALITY AND QUALITY ASSURANCE

As noted earlier every pathway in SMPDB 20 has beenredrawn using the new PathWhiz rendering tool This

Nucleic Acids Research 2013 5

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

contained gt350 colorful artistically rendered and bio-chemically complete pathway illustrations describingmany key aspects of human metabolism These includedhyperlinked diagrams of human metabolic pathwaysmetabolic disease pathways metabolite signalingpathways and drug-action pathways Many of thesepathway diagrams included information on the tissuelocations relevant organs organelles subcellular com-partments protein cofactors protein locations metabolitelocations chemical structures and protein quaternarystructures As with many high-quality web-basedpathway databases SMPDB also provided extensivehyperlinking browsing searching and annotation func-tions along with detailed pathway descriptions andreferencesHowever the first version of SMPDB was not without

some shortcomings In particular it was difficult to updateand maintain it was limited in size and scope itspathways lacked a certain degree of standardization itsphysiological information on organelles and organs wasincomplete and its visual interactivity (zooming and imagenavigation) was awkward and somewhat restrictedFurthermore SMPDB pathway diagrams were onlydownloadable as static images and not available instandard formats such as BioPAX or SBML (8) WhileSMPDB did link to well-known databases such asDrugBank (9) HMDB (10) or UniProt (11) these data-bases did not link to it In other words SMPDB was notparticularly visible to the metabolomics drug research orsystems biology communitiesWith the release of SMPDB 20 we believe we have

addressed all of these shortcomings In particular wehave developed tools to simplify SMPDBrsquos maintenanceand enhance its illustration standards We have also cor-rected and added much more physiological (tissue organorganelle and transporter) information substantiallyimproved its visual interactivity and quality createddownloadable BioPAX pathway files for all of SMPDBrsquospathways and increased SMPDBrsquos visibility by integratingSMPDB into DrugBank HMDB and MetaboAnalystFinally we have substantially expanded the size andscope of SMPDB to include gt600 pathways (a 70increase) and have added a large number of drug metab-olism and physiological action pathways With theseenhancements we believe that SMPDB 20 will appealto a much broader community of researchers and willoffer users a much more enriched interactive and inform-ative experience A detailed description of SMPDB 20follows

INCREASED SIZE AND SCOPE

When it was initially released SMPDB (version 10) con-tained 350 hand-drawn pathways These small moleculepathways were limited to three general categories basicmetabolism metabolic diseases and selected drug-actionpathways Today SMPDB 20 contains gt600 pathways(a 70 increase) and it now includes six kinds ofsmall molecule pathway categories including basic metab-olism metabolic diseases drug-action pathways drug

metabolism pathways metabolite signaling pathwaysand physiological action pathways The inclusion ofdrug metabolism pathways in SMPDB 20 was driven byuser requests and the rapidly growing interest in drug me-tabolism in many fields of drug research and clinicalmetabolomics The addition of metabolite signaling andphysiological action pathways was motivated by the needto handle a larger number of drug-action pathways It wasalso brought on by the growing awareness by members ofthe metabolomics and nutritional science communitiesthat many metabolic pathways act at the level of organsand tissues and not only within cells In addition to thesignificant numerical increase in SMPDBrsquos size and scopemany of the pathway diagrams in SMPDB 20 have beenenhanced with additional cellular or physiological infor-mation Now nearly every SMPDB pathway includes in-formation about where the reactions occur (cellularcompartment(s) organs or tissues) the siteorgan ofaction for drugs (for drug-action pathways) or toxic me-tabolites (for metabolic disorders) as well as informationon key membrane-bound transporters (which move drugsand metabolites in and out of almost every cell) This kindof physiological or extracellular information is rarelycaptured or displayed in other online pathway databasesHowever because SMPDB is a human-only pathwayresource this sort of physiological cellular biochemicalor biomedical information is particularly importantConsequently considerable effort over the past 3 yearshas been put into capturing and displaying these data

As with its predecessor SMPDB 20 is still strictlyfocused on curating human-only small molecule-onlypathways Therefore SMPDB only displays thosepathways where small molecules (metabolites or drugslt1500 daltons) play key roles andor where small mol-ecules represent a significant proportion of functionalentities in a given pathway All of the pathways inSMPDB 20 still retain the standard (and in some casesunique) database functionalities of the original databaseincluding hyperlinked structures (to HMDB orDrugBank) hyperlinked protein images (to UniProt)text searching utilities (TextQuery) chemical searchingutilities (ChemQuery) sequence searching utilities(Sequence Search) pathway browsing capabilities (SMP-Browse with filtering options) detailed pathway descrip-tions extensive references metabolite highlightingcapabilities (SMP-Analyzer and SMP-Highlight)pathway legends and metabolite mapping formetabolomic applications (SMP-MAP) Additionaldetails about these functions and how they can be usedare provided in the original SMPDB paper An updatedand detailed comparison of SMPDB 20 to its predecessor(SMPDB 10) and to other pathway databases is providedin Table 1

ENHANCEMENTS IN VISUALIZATION ANDINTERACTIVITY

One of the distinguishing features of SMPDB has been thestrong focus on providing users with high-quality artistic-ally pleasing pathway diagrams that were not only correct

2 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

and informative but also colorful interactive and richlydetailed This continues to be a major focus for SMPDB20 To this end a significant effort over the past year hasbeen put into enhancing the visual displays and interactiv-ity in SMPDB The previous version of SMPDB onlyoffered a three-step zooming capability that was furtherhindered by the requirement that users navigate throughthe zoomed-in images using multiple scroll bars This re-stricted zooming capability proved to be both awkwardand inconvenient for many users Likewise the placementof the pathway descriptions and references (above andbelow the pathway images) limited how much of apathway could be viewed on a web page or computerscreen In addition the size of the chemical structures(too large) the text labels (too small) and the pathwayarrows (too thin) was often problematic for a number ofpathways depending on what level of zooming was usedThe uniformly dark blue background (to indicate anaqueous environment) was also challenging for thosewishing to print SMPDB pathways on paper forpreparing slides and for visualizing certain features or dis-tinguishing among subcellular locations

For SMPDB 20 a completely new visualization enginewas built using scalable vector graphics (SVG) and aweb interface technology inspired by Google Maps(httpenwikipediaorgwikiGoogle_Maps) This en-hancement now allows rapid and continuous zoomingusing a mouse scroll wheel or through simply clickingon-screen zoom icons It also allows facile navigationaround zoomed-in pathway diagrams in SMPDB 20through a simple click-and-drag operation or throughclicking on-screen updown or leftright arrows locatednear the on-screen zoom functions A full-screen view ofeach SMPDB map is also available This view can betoggled off and on by clicking the full-screen iconlocated on SMPDBrsquos new information and control panel(located on the right side of each pathway map) The

information and control panel allows users to easilyclick on labeled buttons to view text (pathway descriptionsand references) highlight (SMP-Highlight) annotate con-centration data (SMP-Analyze) download files and adjustimage settings By moving the old text boxes containingpathway descriptions and pathway references to SMPDB20rsquos information and control panel more on-screen realestate is now available for viewing pathway diagramsFurthermore by resizing recoloring and standardizingthe size of the chemical structures arrows and text in allof SMPDBrsquos pathway images most features under mostzooming conditions should be much more visible andeasily discerned Using the image settings located in thecontrol panel users can also adjust the background colors(from blue to white) and the cellular membrane display(simple versus detailed) to facilitate printing imagecapture or slide preparation The recoloring process hasalso been extended to other parts of SMPDB as well Inparticular all reactions that take place in certain organ-elles or subcellular locations have now been recolored sothat the organellersquos background color serves as the back-ground color for the (zoomed-in) reaction This shouldmake compartment-specific reactions or pathways farmore distinctive and discernable Another enhancementto SMPDB is now available through the home pagewhere a scrollable lsquocarouselrsquo of SMPDB pathway imagesis available This carousel which can be scrolled throughby simply lsquoswipingrsquo the mouse over the images will allowusers to view and select newly loaded or recently updatedSMPDB pathway maps A montage of many of these visu-alization features and enhancements is shown in Figure 1

IMPROVED STANDARDIZATION AND REDUCEDMAINTENANCE

SMPDB was originally built by hand-drawing eachpathway using PowerPoint incorporating a collection of

Table 1 Comparison of SMPDB 20 to SMPDB 10 to KEGG HumanCyc Reactome BioCarta and WikiPathwaysGenMAPP

Feature SMPDB 20 SMPDB 10 KEGG Reactome HumanCyc BioCarta WikiPathways

Number of metabolic pathways 92 70 78 (forhumans)

64 (forhumans)

333 (shortpathways)

69 50 (forhumans)

Number of disease pathways 221 113 52 11 0 24 16Number of drug action pathways 232 168 0 3 3 10 5Number of drug metabolism pathways 53 0 8 1 0 0 2Number of physio action pathways 5 0 31 36 0 5 22Number of small mol signaling pathways 15 0 30 27 0 30 15Provides multiple organism pathways No No Yes Yes No Yes YesChemical structures shown in diagrams Yes Yes No No Yes (when zoomed) Some NoProtein 4o structures shown in diagrams Yes Yes No No No Some NoCell structures shown in pathway diagrams Yes Yes Some Yes No Yes NoOrgans shown in pathway diagrams Most Some No No No No NoDescriptions of pathways provided Detailed Detailed Limited Detailed Detailed Detailed LimitedPathway images are hyperlinked Yes Yes Yes Yes Yes Yes NoPathway images are easily zoomable Yes No No Yes Yes No LimitedInformation provided on pathway entities Detailed Detailed Modest Modest Moderate Limited LimitedSupports advanced text search Yes Yes No Yes Yes No NoSupports sequence searching Yes Yes No No Yes No NoSupports graphical chem structure search Yes Yes No No No No NoSupports chemical expression mapping Yes Yes Yes Yes Yes No NoSupports geneprotein expression mapping Yes Yes Yes Yes Yes No NoDownloadable Yes Yes Limited Yes Yes No YesBioPax CellML or SBML compatible Yes No Partial Yes Yes No No

Nucleic Acids Research 2013 3

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

Figure 1 A screenshot montage of SMPDB 20rsquos various viewing and searching features

4 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

visual icons and a well-defined standard operatingprotocol (SOP) Each PowerPoint image was then con-verted to three differently sized PNG images (smallmedium and large) and manually image-mapped to appro-priate database links Prior to the image mapping processpathway constituents (proteins and chemicals) weremanually identified their database IDs manuallydetermined and each data point entered in a separate filefor each SMPDB pathway These text files were used topermit text and structure searching as well as compoundhighlighting (via SMP-MAP and SMP-Highlight) Thismanually intensive process proved to be arduous time-consuming and error prone The many different skillsand multiple steps required to generate and post aSMPDB pathway meant that a team of more than 10 dif-ferently trained individuals were needed to constructupdate and maintain the database Even with SOPs anda standardized icon library some individual variabilityoccurred with regard to pathway layout pathwaydetails icon placement directional arrow size and otherimage features Additionally the manual placement ofimage-mapped hyperlinks meant that some links wereplaced imprecisely leading to off-centre effects thatbecame further exacerbated through zooming IfSMPDB pathways needed to be updated or correctedmultiple images still had to be generated and multipleimage maps still had to be prepared If a large numberof highly similar pathways were involved (such as drugmechanism pathways) a simple single molecule updatecould often take many hours of repetitive error-proneediting Because all of SMPDBrsquos pathways were essen-tially image-mapped pictures it was not possible to easilyrender them into BioPAX or SBML formats As a resultefforts to convert SMPDB pathways into BioPAX orSBML equivalents were stymied

To simplify the updating process and improve the levelof standardization and figure rendering consistency forSMPDB 20 we developed an online pathway illustrationand editing tool called PathWhiz (manuscript in prepar-ation) PathWhiz which is a combination of an onlinedigital painting program and a webpage buildingprogram allowed SMPDB curators to digitally renderupdate and annotate pathways in a fraction of the timethat the old lsquoanalogrsquo approach required PathWhiz alsoallowed our curation team to produce far more consist-ently rendered pathways and consistently sized structuresthat looked as if they all came from a single artistPathWhiz uses a pull-down library of standard imageicons for membranes membrane compartments organstissues organelles membranes and proteins to help accel-erate the pathway illustration process It also has astandard set of arrows and arrow-drawing utilities thatallow all SMPDB reaction components to be linked orconnected in a uniform visually pleasing fashionBecause PathWhiz also captures icon placement and an-notation information digitally it allowed all hyperlinks tobe precisely mapped all pathway images to be rendered asSVG and all pathway component information (namesstructures database IDs etc) to be readily capturedThe updating process is also made much easier as allchanges in one pathway (images hyperlinks names

database IDs) can be digitally transferred to all othersimilar pathways at the click of a button The digitalnature of the drawing and icon placement processthrough PathWhiz meant that it also became very easyto convert all SMPDB 20 pathways into BioPAXformat Conversion of the SMPDB pathways to SBMLis underway now and should be completed before 2014Thanks to PathWhiz all SMPDB pathways have beensignificantly updated corrected and improved Likewisemany new pathways were able to be added in a fractionof the time (compared to the old approach) allowing asignificant increase in both the quantity and quality ofpathways in SMPDB 20

ENHANCED DATA DOWNLOADS

Another important and increasingly unique feature ofSMPDB is the open availability of its data In particularSMPDB supports both data downloads and image down-loads for all of its pathways The data files include proteinand gene sequences for all the annotated proteins andgenes in SMPDB SMPDB downloads also include all me-tabolite and drug structures (in SDF format) as well asCSV files containing different combinations of compoundnames or protein names linked to pathways and databaseidentifiers The image files (in SMPDB 10) included boththe original PowerPoint files and the differently sized(small medium large) PNG files corresponding to eachpathway With the release of SMPDB 20 the same kind ofdata download information is still available although thenumber of sequences structures and database links is nowsubstantially greater In addition SMPDBrsquos data down-loads now include all of its pathway information inBioPAX format BioPAX is an RDFOWL-basedstandard exchange language designed to compactly repre-sent biological pathways at the molecular and cellularlevel (12) The availability of SMPDBrsquos data in BioPAXformat and the impending availability of the samepathway data in SBML (expected in late 2013) shouldgreatly expand SMPDBrsquos appeal to systems biologists aswell as its potential utility in a variety of pathway bio-chemical and metabolomic analysis packages It is alsoexpected that the availability of SMPDB BioPAX (andSBML) files will encourage greater interest in communityannotation allowing SMPDB to evolve from a centrallycurated database (which is inefficient) to one that may becurated and enhanced by a community of experts (which ismore efficient) SMPDB 20rsquos image file downloads havealso been modified and enhanced to include not only high-resolution PNG files but also a SVG images of everypathway with all of the BioPAX data imbedded (asmetadata) into every file With these enhancements toSMPDB 20rsquos downloadable resources we believe it nowoffers the most comprehensive collection of downloadabledata of any online pathway database

IMPROVED QUALITY AND QUALITY ASSURANCE

As noted earlier every pathway in SMPDB 20 has beenredrawn using the new PathWhiz rendering tool This

Nucleic Acids Research 2013 5

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

and informative but also colorful interactive and richlydetailed This continues to be a major focus for SMPDB20 To this end a significant effort over the past year hasbeen put into enhancing the visual displays and interactiv-ity in SMPDB The previous version of SMPDB onlyoffered a three-step zooming capability that was furtherhindered by the requirement that users navigate throughthe zoomed-in images using multiple scroll bars This re-stricted zooming capability proved to be both awkwardand inconvenient for many users Likewise the placementof the pathway descriptions and references (above andbelow the pathway images) limited how much of apathway could be viewed on a web page or computerscreen In addition the size of the chemical structures(too large) the text labels (too small) and the pathwayarrows (too thin) was often problematic for a number ofpathways depending on what level of zooming was usedThe uniformly dark blue background (to indicate anaqueous environment) was also challenging for thosewishing to print SMPDB pathways on paper forpreparing slides and for visualizing certain features or dis-tinguishing among subcellular locations

For SMPDB 20 a completely new visualization enginewas built using scalable vector graphics (SVG) and aweb interface technology inspired by Google Maps(httpenwikipediaorgwikiGoogle_Maps) This en-hancement now allows rapid and continuous zoomingusing a mouse scroll wheel or through simply clickingon-screen zoom icons It also allows facile navigationaround zoomed-in pathway diagrams in SMPDB 20through a simple click-and-drag operation or throughclicking on-screen updown or leftright arrows locatednear the on-screen zoom functions A full-screen view ofeach SMPDB map is also available This view can betoggled off and on by clicking the full-screen iconlocated on SMPDBrsquos new information and control panel(located on the right side of each pathway map) The

information and control panel allows users to easilyclick on labeled buttons to view text (pathway descriptionsand references) highlight (SMP-Highlight) annotate con-centration data (SMP-Analyze) download files and adjustimage settings By moving the old text boxes containingpathway descriptions and pathway references to SMPDB20rsquos information and control panel more on-screen realestate is now available for viewing pathway diagramsFurthermore by resizing recoloring and standardizingthe size of the chemical structures arrows and text in allof SMPDBrsquos pathway images most features under mostzooming conditions should be much more visible andeasily discerned Using the image settings located in thecontrol panel users can also adjust the background colors(from blue to white) and the cellular membrane display(simple versus detailed) to facilitate printing imagecapture or slide preparation The recoloring process hasalso been extended to other parts of SMPDB as well Inparticular all reactions that take place in certain organ-elles or subcellular locations have now been recolored sothat the organellersquos background color serves as the back-ground color for the (zoomed-in) reaction This shouldmake compartment-specific reactions or pathways farmore distinctive and discernable Another enhancementto SMPDB is now available through the home pagewhere a scrollable lsquocarouselrsquo of SMPDB pathway imagesis available This carousel which can be scrolled throughby simply lsquoswipingrsquo the mouse over the images will allowusers to view and select newly loaded or recently updatedSMPDB pathway maps A montage of many of these visu-alization features and enhancements is shown in Figure 1

IMPROVED STANDARDIZATION AND REDUCEDMAINTENANCE

SMPDB was originally built by hand-drawing eachpathway using PowerPoint incorporating a collection of

Table 1 Comparison of SMPDB 20 to SMPDB 10 to KEGG HumanCyc Reactome BioCarta and WikiPathwaysGenMAPP

Feature SMPDB 20 SMPDB 10 KEGG Reactome HumanCyc BioCarta WikiPathways

Number of metabolic pathways 92 70 78 (forhumans)

64 (forhumans)

333 (shortpathways)

69 50 (forhumans)

Number of disease pathways 221 113 52 11 0 24 16Number of drug action pathways 232 168 0 3 3 10 5Number of drug metabolism pathways 53 0 8 1 0 0 2Number of physio action pathways 5 0 31 36 0 5 22Number of small mol signaling pathways 15 0 30 27 0 30 15Provides multiple organism pathways No No Yes Yes No Yes YesChemical structures shown in diagrams Yes Yes No No Yes (when zoomed) Some NoProtein 4o structures shown in diagrams Yes Yes No No No Some NoCell structures shown in pathway diagrams Yes Yes Some Yes No Yes NoOrgans shown in pathway diagrams Most Some No No No No NoDescriptions of pathways provided Detailed Detailed Limited Detailed Detailed Detailed LimitedPathway images are hyperlinked Yes Yes Yes Yes Yes Yes NoPathway images are easily zoomable Yes No No Yes Yes No LimitedInformation provided on pathway entities Detailed Detailed Modest Modest Moderate Limited LimitedSupports advanced text search Yes Yes No Yes Yes No NoSupports sequence searching Yes Yes No No Yes No NoSupports graphical chem structure search Yes Yes No No No No NoSupports chemical expression mapping Yes Yes Yes Yes Yes No NoSupports geneprotein expression mapping Yes Yes Yes Yes Yes No NoDownloadable Yes Yes Limited Yes Yes No YesBioPax CellML or SBML compatible Yes No Partial Yes Yes No No

Nucleic Acids Research 2013 3

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

Figure 1 A screenshot montage of SMPDB 20rsquos various viewing and searching features

4 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

visual icons and a well-defined standard operatingprotocol (SOP) Each PowerPoint image was then con-verted to three differently sized PNG images (smallmedium and large) and manually image-mapped to appro-priate database links Prior to the image mapping processpathway constituents (proteins and chemicals) weremanually identified their database IDs manuallydetermined and each data point entered in a separate filefor each SMPDB pathway These text files were used topermit text and structure searching as well as compoundhighlighting (via SMP-MAP and SMP-Highlight) Thismanually intensive process proved to be arduous time-consuming and error prone The many different skillsand multiple steps required to generate and post aSMPDB pathway meant that a team of more than 10 dif-ferently trained individuals were needed to constructupdate and maintain the database Even with SOPs anda standardized icon library some individual variabilityoccurred with regard to pathway layout pathwaydetails icon placement directional arrow size and otherimage features Additionally the manual placement ofimage-mapped hyperlinks meant that some links wereplaced imprecisely leading to off-centre effects thatbecame further exacerbated through zooming IfSMPDB pathways needed to be updated or correctedmultiple images still had to be generated and multipleimage maps still had to be prepared If a large numberof highly similar pathways were involved (such as drugmechanism pathways) a simple single molecule updatecould often take many hours of repetitive error-proneediting Because all of SMPDBrsquos pathways were essen-tially image-mapped pictures it was not possible to easilyrender them into BioPAX or SBML formats As a resultefforts to convert SMPDB pathways into BioPAX orSBML equivalents were stymied

To simplify the updating process and improve the levelof standardization and figure rendering consistency forSMPDB 20 we developed an online pathway illustrationand editing tool called PathWhiz (manuscript in prepar-ation) PathWhiz which is a combination of an onlinedigital painting program and a webpage buildingprogram allowed SMPDB curators to digitally renderupdate and annotate pathways in a fraction of the timethat the old lsquoanalogrsquo approach required PathWhiz alsoallowed our curation team to produce far more consist-ently rendered pathways and consistently sized structuresthat looked as if they all came from a single artistPathWhiz uses a pull-down library of standard imageicons for membranes membrane compartments organstissues organelles membranes and proteins to help accel-erate the pathway illustration process It also has astandard set of arrows and arrow-drawing utilities thatallow all SMPDB reaction components to be linked orconnected in a uniform visually pleasing fashionBecause PathWhiz also captures icon placement and an-notation information digitally it allowed all hyperlinks tobe precisely mapped all pathway images to be rendered asSVG and all pathway component information (namesstructures database IDs etc) to be readily capturedThe updating process is also made much easier as allchanges in one pathway (images hyperlinks names

database IDs) can be digitally transferred to all othersimilar pathways at the click of a button The digitalnature of the drawing and icon placement processthrough PathWhiz meant that it also became very easyto convert all SMPDB 20 pathways into BioPAXformat Conversion of the SMPDB pathways to SBMLis underway now and should be completed before 2014Thanks to PathWhiz all SMPDB pathways have beensignificantly updated corrected and improved Likewisemany new pathways were able to be added in a fractionof the time (compared to the old approach) allowing asignificant increase in both the quantity and quality ofpathways in SMPDB 20

ENHANCED DATA DOWNLOADS

Another important and increasingly unique feature ofSMPDB is the open availability of its data In particularSMPDB supports both data downloads and image down-loads for all of its pathways The data files include proteinand gene sequences for all the annotated proteins andgenes in SMPDB SMPDB downloads also include all me-tabolite and drug structures (in SDF format) as well asCSV files containing different combinations of compoundnames or protein names linked to pathways and databaseidentifiers The image files (in SMPDB 10) included boththe original PowerPoint files and the differently sized(small medium large) PNG files corresponding to eachpathway With the release of SMPDB 20 the same kind ofdata download information is still available although thenumber of sequences structures and database links is nowsubstantially greater In addition SMPDBrsquos data down-loads now include all of its pathway information inBioPAX format BioPAX is an RDFOWL-basedstandard exchange language designed to compactly repre-sent biological pathways at the molecular and cellularlevel (12) The availability of SMPDBrsquos data in BioPAXformat and the impending availability of the samepathway data in SBML (expected in late 2013) shouldgreatly expand SMPDBrsquos appeal to systems biologists aswell as its potential utility in a variety of pathway bio-chemical and metabolomic analysis packages It is alsoexpected that the availability of SMPDB BioPAX (andSBML) files will encourage greater interest in communityannotation allowing SMPDB to evolve from a centrallycurated database (which is inefficient) to one that may becurated and enhanced by a community of experts (which ismore efficient) SMPDB 20rsquos image file downloads havealso been modified and enhanced to include not only high-resolution PNG files but also a SVG images of everypathway with all of the BioPAX data imbedded (asmetadata) into every file With these enhancements toSMPDB 20rsquos downloadable resources we believe it nowoffers the most comprehensive collection of downloadabledata of any online pathway database

IMPROVED QUALITY AND QUALITY ASSURANCE

As noted earlier every pathway in SMPDB 20 has beenredrawn using the new PathWhiz rendering tool This

Nucleic Acids Research 2013 5

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

Figure 1 A screenshot montage of SMPDB 20rsquos various viewing and searching features

4 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

visual icons and a well-defined standard operatingprotocol (SOP) Each PowerPoint image was then con-verted to three differently sized PNG images (smallmedium and large) and manually image-mapped to appro-priate database links Prior to the image mapping processpathway constituents (proteins and chemicals) weremanually identified their database IDs manuallydetermined and each data point entered in a separate filefor each SMPDB pathway These text files were used topermit text and structure searching as well as compoundhighlighting (via SMP-MAP and SMP-Highlight) Thismanually intensive process proved to be arduous time-consuming and error prone The many different skillsand multiple steps required to generate and post aSMPDB pathway meant that a team of more than 10 dif-ferently trained individuals were needed to constructupdate and maintain the database Even with SOPs anda standardized icon library some individual variabilityoccurred with regard to pathway layout pathwaydetails icon placement directional arrow size and otherimage features Additionally the manual placement ofimage-mapped hyperlinks meant that some links wereplaced imprecisely leading to off-centre effects thatbecame further exacerbated through zooming IfSMPDB pathways needed to be updated or correctedmultiple images still had to be generated and multipleimage maps still had to be prepared If a large numberof highly similar pathways were involved (such as drugmechanism pathways) a simple single molecule updatecould often take many hours of repetitive error-proneediting Because all of SMPDBrsquos pathways were essen-tially image-mapped pictures it was not possible to easilyrender them into BioPAX or SBML formats As a resultefforts to convert SMPDB pathways into BioPAX orSBML equivalents were stymied

To simplify the updating process and improve the levelof standardization and figure rendering consistency forSMPDB 20 we developed an online pathway illustrationand editing tool called PathWhiz (manuscript in prepar-ation) PathWhiz which is a combination of an onlinedigital painting program and a webpage buildingprogram allowed SMPDB curators to digitally renderupdate and annotate pathways in a fraction of the timethat the old lsquoanalogrsquo approach required PathWhiz alsoallowed our curation team to produce far more consist-ently rendered pathways and consistently sized structuresthat looked as if they all came from a single artistPathWhiz uses a pull-down library of standard imageicons for membranes membrane compartments organstissues organelles membranes and proteins to help accel-erate the pathway illustration process It also has astandard set of arrows and arrow-drawing utilities thatallow all SMPDB reaction components to be linked orconnected in a uniform visually pleasing fashionBecause PathWhiz also captures icon placement and an-notation information digitally it allowed all hyperlinks tobe precisely mapped all pathway images to be rendered asSVG and all pathway component information (namesstructures database IDs etc) to be readily capturedThe updating process is also made much easier as allchanges in one pathway (images hyperlinks names

database IDs) can be digitally transferred to all othersimilar pathways at the click of a button The digitalnature of the drawing and icon placement processthrough PathWhiz meant that it also became very easyto convert all SMPDB 20 pathways into BioPAXformat Conversion of the SMPDB pathways to SBMLis underway now and should be completed before 2014Thanks to PathWhiz all SMPDB pathways have beensignificantly updated corrected and improved Likewisemany new pathways were able to be added in a fractionof the time (compared to the old approach) allowing asignificant increase in both the quantity and quality ofpathways in SMPDB 20

ENHANCED DATA DOWNLOADS

Another important and increasingly unique feature ofSMPDB is the open availability of its data In particularSMPDB supports both data downloads and image down-loads for all of its pathways The data files include proteinand gene sequences for all the annotated proteins andgenes in SMPDB SMPDB downloads also include all me-tabolite and drug structures (in SDF format) as well asCSV files containing different combinations of compoundnames or protein names linked to pathways and databaseidentifiers The image files (in SMPDB 10) included boththe original PowerPoint files and the differently sized(small medium large) PNG files corresponding to eachpathway With the release of SMPDB 20 the same kind ofdata download information is still available although thenumber of sequences structures and database links is nowsubstantially greater In addition SMPDBrsquos data down-loads now include all of its pathway information inBioPAX format BioPAX is an RDFOWL-basedstandard exchange language designed to compactly repre-sent biological pathways at the molecular and cellularlevel (12) The availability of SMPDBrsquos data in BioPAXformat and the impending availability of the samepathway data in SBML (expected in late 2013) shouldgreatly expand SMPDBrsquos appeal to systems biologists aswell as its potential utility in a variety of pathway bio-chemical and metabolomic analysis packages It is alsoexpected that the availability of SMPDB BioPAX (andSBML) files will encourage greater interest in communityannotation allowing SMPDB to evolve from a centrallycurated database (which is inefficient) to one that may becurated and enhanced by a community of experts (which ismore efficient) SMPDB 20rsquos image file downloads havealso been modified and enhanced to include not only high-resolution PNG files but also a SVG images of everypathway with all of the BioPAX data imbedded (asmetadata) into every file With these enhancements toSMPDB 20rsquos downloadable resources we believe it nowoffers the most comprehensive collection of downloadabledata of any online pathway database

IMPROVED QUALITY AND QUALITY ASSURANCE

As noted earlier every pathway in SMPDB 20 has beenredrawn using the new PathWhiz rendering tool This

Nucleic Acids Research 2013 5

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

visual icons and a well-defined standard operatingprotocol (SOP) Each PowerPoint image was then con-verted to three differently sized PNG images (smallmedium and large) and manually image-mapped to appro-priate database links Prior to the image mapping processpathway constituents (proteins and chemicals) weremanually identified their database IDs manuallydetermined and each data point entered in a separate filefor each SMPDB pathway These text files were used topermit text and structure searching as well as compoundhighlighting (via SMP-MAP and SMP-Highlight) Thismanually intensive process proved to be arduous time-consuming and error prone The many different skillsand multiple steps required to generate and post aSMPDB pathway meant that a team of more than 10 dif-ferently trained individuals were needed to constructupdate and maintain the database Even with SOPs anda standardized icon library some individual variabilityoccurred with regard to pathway layout pathwaydetails icon placement directional arrow size and otherimage features Additionally the manual placement ofimage-mapped hyperlinks meant that some links wereplaced imprecisely leading to off-centre effects thatbecame further exacerbated through zooming IfSMPDB pathways needed to be updated or correctedmultiple images still had to be generated and multipleimage maps still had to be prepared If a large numberof highly similar pathways were involved (such as drugmechanism pathways) a simple single molecule updatecould often take many hours of repetitive error-proneediting Because all of SMPDBrsquos pathways were essen-tially image-mapped pictures it was not possible to easilyrender them into BioPAX or SBML formats As a resultefforts to convert SMPDB pathways into BioPAX orSBML equivalents were stymied

To simplify the updating process and improve the levelof standardization and figure rendering consistency forSMPDB 20 we developed an online pathway illustrationand editing tool called PathWhiz (manuscript in prepar-ation) PathWhiz which is a combination of an onlinedigital painting program and a webpage buildingprogram allowed SMPDB curators to digitally renderupdate and annotate pathways in a fraction of the timethat the old lsquoanalogrsquo approach required PathWhiz alsoallowed our curation team to produce far more consist-ently rendered pathways and consistently sized structuresthat looked as if they all came from a single artistPathWhiz uses a pull-down library of standard imageicons for membranes membrane compartments organstissues organelles membranes and proteins to help accel-erate the pathway illustration process It also has astandard set of arrows and arrow-drawing utilities thatallow all SMPDB reaction components to be linked orconnected in a uniform visually pleasing fashionBecause PathWhiz also captures icon placement and an-notation information digitally it allowed all hyperlinks tobe precisely mapped all pathway images to be rendered asSVG and all pathway component information (namesstructures database IDs etc) to be readily capturedThe updating process is also made much easier as allchanges in one pathway (images hyperlinks names

database IDs) can be digitally transferred to all othersimilar pathways at the click of a button The digitalnature of the drawing and icon placement processthrough PathWhiz meant that it also became very easyto convert all SMPDB 20 pathways into BioPAXformat Conversion of the SMPDB pathways to SBMLis underway now and should be completed before 2014Thanks to PathWhiz all SMPDB pathways have beensignificantly updated corrected and improved Likewisemany new pathways were able to be added in a fractionof the time (compared to the old approach) allowing asignificant increase in both the quantity and quality ofpathways in SMPDB 20

ENHANCED DATA DOWNLOADS

Another important and increasingly unique feature ofSMPDB is the open availability of its data In particularSMPDB supports both data downloads and image down-loads for all of its pathways The data files include proteinand gene sequences for all the annotated proteins andgenes in SMPDB SMPDB downloads also include all me-tabolite and drug structures (in SDF format) as well asCSV files containing different combinations of compoundnames or protein names linked to pathways and databaseidentifiers The image files (in SMPDB 10) included boththe original PowerPoint files and the differently sized(small medium large) PNG files corresponding to eachpathway With the release of SMPDB 20 the same kind ofdata download information is still available although thenumber of sequences structures and database links is nowsubstantially greater In addition SMPDBrsquos data down-loads now include all of its pathway information inBioPAX format BioPAX is an RDFOWL-basedstandard exchange language designed to compactly repre-sent biological pathways at the molecular and cellularlevel (12) The availability of SMPDBrsquos data in BioPAXformat and the impending availability of the samepathway data in SBML (expected in late 2013) shouldgreatly expand SMPDBrsquos appeal to systems biologists aswell as its potential utility in a variety of pathway bio-chemical and metabolomic analysis packages It is alsoexpected that the availability of SMPDB BioPAX (andSBML) files will encourage greater interest in communityannotation allowing SMPDB to evolve from a centrallycurated database (which is inefficient) to one that may becurated and enhanced by a community of experts (which ismore efficient) SMPDB 20rsquos image file downloads havealso been modified and enhanced to include not only high-resolution PNG files but also a SVG images of everypathway with all of the BioPAX data imbedded (asmetadata) into every file With these enhancements toSMPDB 20rsquos downloadable resources we believe it nowoffers the most comprehensive collection of downloadabledata of any online pathway database

IMPROVED QUALITY AND QUALITY ASSURANCE

As noted earlier every pathway in SMPDB 20 has beenredrawn using the new PathWhiz rendering tool This

Nucleic Acids Research 2013 5

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

effort allowed us to revisit all of SMPDBrsquos pathways andto update them with new information and to correctimprove or reformat preexisting pathways so that theywere more consistent and informative Many pathwayswere improved through the addition of more organelleand reaction compartment information Others wereenhanced through the addition of target organ informa-tion and protein transporter information Likewise imagesthat were too small arrows that were too thin andpathways that were too convoluted were corrected andor simplified In redrawing old pathways and renderingnew pathways in SMPDB 20 the same level of qualityassurance the same referential resources and the samekinds of data checkingvalidation routines were used asdescribed in the original SMPDB paper (7) In particularall pathways in SMPDB 20 were drawn using a standardoperating procedure (SOP) with a checklist of featuresthat each of the pathway artistsannotators was requiredto follow This SOP and checklist is also provided in thelsquoAboutrsquo menu To ensure that the pathways have beenknowledgably illustrated all of the SMPDB pathwaycurators were required to have degrees or advanced uni-versity coursework in biology chemistry biochemistry orbioinformatics During the redrawing of existing SMPDBpathways and the generation of new pathways forSMPDB 20 all SMPDB curators were also required toconsult a variety of sources including biochemistry text-books OMIM (13) Wikipedia KEGG (1) MetaCyc (3)Reactome (4) UniProt (11) the HMDB (10) DrugBank(9) and PharmGKB (6) This allowed the curation team toidentify and consolidate pathway nomenclature keypathway components critical reactions cellular compart-ments as well as target organs compartments or organ-elles Pathway layouts were also frequently assessedcompared and discussed by SMPDB team membersprior to manually generating any pathway diagramsBecause of the unique rendering requirements and thestrict SOPs for SMPDB no pathway in SMPDB waslsquocopiedrsquo from any other pathway diagram in any otherdatabase Furthermore SMPDBrsquos metabolic diseasepathways drug-action and drug metabolism pathwayshad to be generated independently of any pathway data-base Instead relevant information was gathered fromvarious medical and pharmacology textbooks specializedencyclopedias and fragmentary data contained on severalonline databases such as OMIM (13) PharmGKB (6) andDrugBank (9) As an additional layer of quality assuranceall of the pathway diagrams in the SMPDB 20 have beeninspected and corrected by two or more curators havingadvanced degrees in biochemistry or physiology

INCREASED CONNECTIVITY ANDINTEROPERABILITY

When SMPDB was first released it was primarily a stand-alone database that simply accessed other databasesthrough its hyperlinks to HMDB DrugBank andUniProt Over the past 2 years efforts have beendirected at integrating SMPDB into a number of otherpopular online databases so that effectively other

databases now access SMPDB In particular SMPDB isnow linked into the pathway data fields of the latestversion of HMDB (10) and the latest release ofDrugBank (9) It has also been linked into several onlinemetabolomic data analysis resources includingMetaboAnalyst (14) and MSEA (15) Discussions areongoing to have SMPDB 20 linked to a number ofother chemical entity databases as well as other well-known pathway databases in the near future Inaddition to trying to increase SMPDBrsquos database connect-ivity we are also trying to make it far more interoperableThe availability of SMPDBrsquos data in BioPAX format andthe expected availability of the same pathway data inSBML (in late 2013) should greatly improve SMPDBrsquosinteroperability It should also encourage users to freelyexchange SMPDB data and to incorporate these data intoa variety of pathway or metabolomic analysis softwarepackages It is also expected that the availability ofSMPDB BioPAX (and soon SBML) files will encouragegreater interest in community annotation allowingSMPDB to evolve from a centrally curated database toa community curated resource

FUTURE PLANS AND CONCLUSIONS

SMPDB continues to expand with new pathways beingadded at an almost daily rate Over the coming 2ndash3years we expect that the number of pathways inSMPDB will likely double It is likely that the greatestgrowth in SMPDBrsquos pathways will be in the drug anddrug metabolism pathways as there are literally thou-sands of reactions reactants and pathways that arereadily available in the literature With the impendingrelease of PathWhiz (SMPDBrsquos pathway drawing tooland editor) we are hopeful that many of SMPDBrsquos newpathways will actually be added by interested communitymembers It is also expected that PathWhiz will enableSMPDB-like pathways and pathway databases to beeasily generated for other model organisms such asSaccharomyces cerevisae Escherichia coli Arabidopsisand Drosophila With the latest additions and enhance-ments in SMPDB 20 we believe that SMPDB has reacheda critical threshold giving it sufficient breadth depth andinterconnectivity so that it will appeal to a much largercommunity of users Overall SMPDB 20rsquos new colorfulinformative artfully designed pathway diagramscombined with its wide range of visualization annotationand querying tools should provide users a much moreenriched interactive and informative pathway viewingexperience

FUNDING

Canadian Institutes for Health Research (CIHR) AlbertaInnovates Health Solutions (AIHS) and Genome Albertaa division of Genome Canada Funding for open accesscharge Genome Canada

Conflict of interest statement None declared

6 Nucleic Acids Research 2013

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from

REFERENCES

1 KanehisaM GotoS SatoY FurumichiM and TanabeM(2012) KEGG for integration and interpretation of large-scalemolecular data sets Nucleic Acids Res 40 D109ndashD114

2 KeselerIM MackieA Peralta-GilM Santos-ZavaletaAGama-CastroS Bonavides-MartınezC FulcherC HuertaAMKothariA KrummenackerM et al (2013) EcoCyc fusing modelorganism databases with systems biology Nucleic Acids Res 41D605ndashD612

3 CaspiR AltmanT DreherK FulcherCA SubhravetiPKeselerIM KothariA KrummenackerM LatendresseMMuellerLA et al (2012) The MetaCyc database of metabolicpathways and enzymes and the BioCyc collection of pathwaygenome databases Nucleic Acids Res 40 D742ndashD753

4 CroftD OrsquoKellyG WuG HawR GillespieM MatthewsLCaudyM GarapatiP GopinathG JassalB et al (2011)Reactome a database of reactions pathways and biologicalprocesses Nucleic Acids Res 39 D691ndashD697

5 PicoAR KelderT van IerselMP HanspersK ConklinBRand EveloC (2008) WikiPathways pathway editing for thepeople PLoS Biol 6 e184

6 ThornCF KleinTE and AltmanRB (2013) PharmGKB thePharmacogenomics Knowledge Base Methods Mol Biol 1015311ndash320

7 FrolkisA KnoxC LimE JewisonT LawV HauDDLiuP GautamB LyS GuoAC et al (2010) SMPDB TheSmall Molecule Pathway Database Nucleic Acids Res 38D480ndashD487

8 StrombackL and LambrixP (2005) Representations ofmolecular pathways an evaluation of SBML PSI MI andBioPAX Bioinformatics 21 4401ndash4407

9 KnoxC LawV JewisonT LiuP LyS FrolkisA PonABancoK MakC NeveuV et al (2011) DrugBank 30 acomprehensive resources for lsquoOmicsrsquo research on drugs NucleicAcids Res 39 D1035ndashD1041

10 WishartDS JewisonT GuoAC WilsonM KnoxC LiuYDjoumbouY MandalR AziatF DongE et al (2013) HMDB30ndashThe Human Metabolome Database in 2013 Nucleic AcidsRes 41 D801ndashD807

11 UniProt Consortium (2013) Update on activities at the UniversalProtein Resource (UniProt) in 2013 Nucleic Acids Res 41D43ndashD47

12 WebbRL and MarsquoayanA (2011) Sig2BioPAX Java tool forconverting flat files to BioPAX Level 3 format Source Code BiolMed 6 5

13 HamoshA ScottAF AmbergerJS BocchiniCA andMcKusickVA (2005) Online Mendelian Inheritance in Man(OMIM) a knowledgebase of human genes and genetic disordersNucleic Acids Res 33 D514ndashD517

14 XiaJ MandalR SinelnikovIV BroadhurstD andWishartDS (2012) MetaboAnalyst 20ndasha comprehensive serverfor metabolomic data analysis Nucleic Acids Res 40W127ndashW133

15 XiaJ and WishartDS (2010) MSEA a web-based tool toidentify biologically meaningful patterns in quantitativemetabolomic data Nucleic Acids Res 38 W71ndashW77

Nucleic Acids Research 2013 7

at University of A

lberta on Novem

ber 25 2013httpnaroxfordjournalsorg

Dow

nloaded from